Yosef T. answered 01/07/20
RPI Ph.D. Math/Physics Tutor with a passion for teaching
The first step here is making sure we know what the variables and terminology mean:
- n is the total number of people that were sampled. In this case, they give us n = 150, which means that 150 people were sampled.
- x is the number of "successes" (such as people answering "yes" to the question). In this case, x = 45, which means that there were 45 successes among the 150 people sampled.
- We need to know what a confidence interval is. True, we sampled 150 people and got 45 "yes" answers, but that doesn't mean that exactly 45 out of any 150 people would answer "yes". We want to know the percentage of the whole population that would answer "yes" if we were to ask them, but unfortunately, we can't know the exact number for sure. Instead, we come up with a range of possible percentages using a method that will capture the true percentage 95% of the time.
The first calculation is finding the sample proportion, which is our best estimate of the percentage. We get it by dividing the number of successes x by the sample size n. (Hopefully, this is somewhat intuitive.) In math symbols, we write:
μ = x/n.
Here, the Greek letter μ denotes the average proportion. With the numbers given, μ = 45/150 = 0.30, or 30%.
Next, we should find the standard deviation, σ, of the proportion (often called the standard error). We find the standard deviation with the equation:
σ = sqrt(μ * (1 - μ)/n).
This formula for the standard deviation may not make sense or be intuitive at the moment, but the explanation of where that formula comes from is more complicated than the solution to the rest of the problem.
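To make the first two steps concrete, here is a minimal Python sketch that plugs the given values n = 150 and x = 45 into the two formulas above. The variable names are just illustrative choices that mirror the symbols in this answer.

```python
import math

n = 150  # sample size given in the problem
x = 45   # number of "yes" answers given in the problem

mu = x / n                            # sample proportion: mu = x/n
sigma = math.sqrt(mu * (1 - mu) / n)  # sigma = sqrt(mu * (1 - mu)/n)

print(f"mu = {mu:.4f}")        # prints mu = 0.3000
print(f"sigma = {sigma:.4f}")  # prints sigma = 0.0374
```

So the sample proportion is 30%, with a standard deviation of roughly 3.7 percentage points.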
Finally, we have the formula for the lower bound and upper bound of the confidence interval.
Lower Bound = μ - 1.96 σ
Upper Bound = μ + 1.96 σ
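Putting all of the formulas together, the whole calculation can be sketched as one short Python function. The function name `confidence_interval` and its parameter `z` are hypothetical names chosen for this illustration; the default z = 1.96 is the 95% multiplier used in the bounds above.

```python
import math

def confidence_interval(x, n, z=1.96):
    """Return (lower, upper) bounds for a proportion, using mu +/- z * sigma."""
    mu = x / n                            # sample proportion
    sigma = math.sqrt(mu * (1 - mu) / n)  # standard deviation of the proportion
    return mu - z * sigma, mu + z * sigma

lower, upper = confidence_interval(45, 150)
print(f"Lower Bound = {lower:.4f}")  # prints Lower Bound = 0.2267
print(f"Upper Bound = {upper:.4f}")  # prints Upper Bound = 0.3733
```

For this problem, that gives a 95% confidence interval of roughly 22.7% to 37.3%.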
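Those last two equations come from the fact that if a random variable follows a bell curve (a normal distribution), it will be more than 1.96 standard deviations below the mean 2.5% of the time and more than 1.96 standard deviations above the mean 2.5% of the time. For a sample this large, the sample proportion approximately follows a bell curve, so 95% of the time it will land within 1.96 standard deviations of the true proportion, which is the distance that the lower bound and upper bound cover.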
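If you want to check where the 1.96 comes from, one way is to ask a standard normal distribution for the point that leaves 2.5% in the upper tail. The sketch below uses `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later); the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # bell curve with mean 0 and standard deviation 1

# Point with 97.5% of the area below it, leaving 2.5% in the upper tail.
z = standard_normal.inv_cdf(0.975)

# Probability of landing more than 1.96 standard deviations below the mean.
lower_tail = standard_normal.cdf(-1.96)

print(f"z = {z:.2f}")                    # prints z = 1.96
print(f"lower tail = {lower_tail:.3f}")  # prints lower tail = 0.025
```

A different confidence level would use a different multiplier; for example, passing 0.95 instead of 0.975 to `inv_cdf` gives the multiplier for a 90% interval.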