Marla G. answered 06/16/19
Masters Degree in Applied Statistics with 20+ Years of Work Experience
When outliers exist in the dataset, the median is the appropriate statistic to estimate the 'middle' or average of the distribution. This is because an outlier can have a dramatic effect on the mean, yet has little to no effect on the median. Consider the following simple example:
Suppose we have the following dataset A: {1, 2, 3, 4, 5},
then an error is discovered that changes the dataset to B: {1, 2, 3, 4, 25}.
The median of both datasets is 3 (the middle observation); however, the mean shifts by 4:
mean of A = (1+2+3+4+5)/5 = 15/5 = 3
but the mean of B = (1+2+3+4+25)/5 = 35/5 = 7
It's interesting to note: for dataset A (no outlier), the mean and the median are the same, but in dataset B (which has an outlier) the median is still 3, while the outlier shifts the mean from 3 to 7!
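If you'd like to check the arithmetic yourself, here is a minimal sketch in Python using the standard-library statistics module. The variable names a and b are just my labels for datasets A and B above.

```python
from statistics import mean, median

# Dataset A (no outlier) and dataset B (last value replaced by an outlier)
a = [1, 2, 3, 4, 5]
b = [1, 2, 3, 4, 25]

for name, data in (("A", a), ("B", b)):
    # mean = sum of the values / number of values
    # median = middle value once the data are sorted
    print(name, "mean =", mean(data), "median =", median(data))

# How far each statistic moves when the outlier is introduced
mean_shift = mean(b) - mean(a)
median_shift = median(b) - median(a)
print("mean shift =", mean_shift, "median shift =", median_shift)
```

Try replacing 25 with a much larger value: the mean keeps climbing, while the median stays at the middle observation.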