Stan S. answered 11/29/17
Hi Sarah,
To create a modified boxplot, we first find the median (50th percentile) of the data set. We then find Q1 (25th percentile) as the median of the points below the overall median, and Q3 (75th percentile) as the median of the points above it. Any point more than 1.5 times the interquartile range (IQR = Q3 - Q1) below Q1 or above Q3 is plotted as an outlier.
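As a small illustration of the quartile step, here is a sketch in Python. The function name `quartiles` and the sample data are made up for this example, and it assumes the "median of each half" convention described above, leaving the middle point out of both halves when the number of points is odd. Other textbooks and software use slightly different quartile rules.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Hypothetical helper for illustration only. When the number of
    points is odd, the middle point is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]        # points below the overall median
    upper_half = data[(n + 1) // 2 :]  # points above the overall median
    return median(lower_half), median(data), median(upper_half)


# Made-up sample data, not taken from the original question.
sample = [2, 4, 5, 7, 8, 10, 13]
q1, med, q3 = quartiles(sample)
print(q1, med, q3)
```

For this sample the halves are [2, 4, 5] and [8, 10, 13], so Q1 is 4, the median is 7, and Q3 is 10.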
A normal distribution is continuous, so it has no individual data points to sort and count. However, if we interpret the boxplot percentiles as cumulative areas under the standard normal curve, we can find the z scores that cut off 25% and 75% of the area and use those as Q1 and Q3: zQ1 ≈ -0.674 and zQ3 ≈ 0.674. The interquartile range is then IQR = zQ3 - zQ1 ≈ 1.349. From that, the z scores for the upper and lower outlier cutoffs are zUO = zQ3 + 1.5*IQR ≈ 2.698 and zLO = zQ1 - 1.5*IQR ≈ -2.698. The probability that a randomly selected value is an outlier is the sum of the areas under the standard normal curve to the left of zLO and to the right of zUO, which comes to about 0.007, or roughly 0.7%.
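If you would like to check the arithmetic yourself, the steps can be sketched in Python using `NormalDist` from the standard library's `statistics` module. The variable names are my own choices for this example. Because z scores are used, the result does not depend on the mean or standard deviation of the original normal distribution.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# z scores that cut off 25% and 75% of the area under the curve.
z_q1 = std_normal.inv_cdf(0.25)
z_q3 = std_normal.inv_cdf(0.75)

# Interquartile range and the 1.5 * IQR outlier cutoffs.
iqr = z_q3 - z_q1
z_lo = z_q1 - 1.5 * iqr
z_uo = z_q3 + 1.5 * iqr

# Area to the left of the lower cutoff plus area to the right of the upper cutoff.
p_outlier = std_normal.cdf(z_lo) + (1.0 - std_normal.cdf(z_uo))

print(f"zQ1 = {z_q1:.3f}, zQ3 = {z_q3:.3f}, IQR = {iqr:.3f}")
print(f"zLO = {z_lo:.3f}, zUO = {z_uo:.3f}")
print(f"P(outlier) = {p_outlier:.4f}")
```

Since the curve is symmetric, the two tail areas are equal, so the same answer can be found as twice the area to the right of zUO.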