Interpreting the results of statistical data

Question

A sample of 100 newborn babies showed that 15% were underweight. Write the 95% confidence interval for the population proportion π. Interpret your result.

David W. · Accepted Answer

To write a confidence interval for a population proportion, we need two things: an estimate of the proportion and a measure of how imprecise that estimate is. Fortunately, both can be derived from two pieces of information: 1) the sample proportion of 0.15 (or 15%) and 2) the sample size of 100.

Step 1: Estimate the population proportion using the sample data

Easy: 0.15 or 15%. For this question, our only information comes from the sample itself. That's all we've got. If the sample is randomly drawn from the entire population, that estimate is probably not bad. If the sample is not randomly drawn, we don't know how good it is, but ... still ... it's all we've got.

Step 2: Compute the standard error of our estimate.

Less easy, but not horribly hard. We need to use the formula √(p(1-p)/n), using our estimate from step 1 as p and the sample size of 100 as n. That leaves: √(0.15(1-0.15)/100) or √(0.1275/100). Computing inside the parentheses, we get √(0.001275). Finally, taking the square root, we get approximately 0.0357. That's the standard error (SE) for our estimated population proportion.
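The arithmetic in this step can be checked with a few lines of Python. This is only a minimal sketch; the function name `standard_error` is made up for illustration and is not part of any library.

```python
import math

def standard_error(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Values from the question: sample proportion 0.15, sample size 100.
se = standard_error(0.15, 100)
print(round(se, 4))  # 0.0357
```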

Step 3: Build the confidence interval

For the confidence interval, we have to compute something to add to and subtract from the estimated proportion to form the interval (p-something, p+something). In this case, something = z * SE, usually called the margin of error. The z comes from the standard normal distribution and tells us how many standard errors from the middle we have to go to get an interval that, for our case, covers 95% of the answers we'd get if our proportion estimate is the correct estimate. For 95%, z is 1.96. So, something = 1.96*0.0357 = 0.0700. That means our confidence interval is (0.15 - 0.07, 0.15 + 0.07) or (0.08, 0.22).
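Steps 2 and 3 together can be sketched in Python as follows. The function name `normal_approx_interval` is hypothetical, and instead of hard-coding 1.96 the sketch looks z up from the standard normal distribution using the standard library's `statistics.NormalDist` (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def normal_approx_interval(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    # z for a two-sided interval; about 1.96 when confidence is 0.95.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error from step 2
    margin = z * se                          # the "something" in step 3
    return p_hat - margin, p_hat + margin

low, high = normal_approx_interval(0.15, 100)
print(f"({low:.2f}, {high:.2f})")  # (0.08, 0.22)
```

Changing `confidence` to, say, 0.90 or 0.99 gives a narrower or wider interval from the same sample.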

Step 4: Interpret

This confidence interval we computed means that, based on the best (and only) information we have (i.e. that the population proportion is actually 0.15), we'd expect repeated samples to give answers between 0.08 and 0.22, 95% of the time. More loosely, that means that, if someone else repeated the study and came up with an estimate of 0.1, we would shrug our shoulders and say, "Yeah, that's close enough to what we got, already." On the other hand, if someone got an answer of 0.3, we'd have some concern that our study or their study has something wrong or, at least, something meaningfully different. Strictly speaking, the 95% describes the procedure: if many random samples were drawn and an interval built this way from each one, about 95% of those intervals would contain the true population proportion.
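The "repeated samples" reading can be tried out with a small simulation. This is a rough sketch under a loud assumption: it treats 0.15 as the true population proportion, which is exactly the working assumption in the interpretation above, not something the data can confirm. The function name, the seed, and the number of repetitions are arbitrary choices made for illustration.

```python
import random

def fraction_inside(p_true=0.15, n=100, low=0.08, high=0.22,
                    reps=10_000, seed=0):
    """Share of simulated samples whose proportion lands in [low, high]."""
    rng = random.Random(seed)
    # Compare counts rather than proportions to avoid float edge cases.
    lo_count = round(low * n)
    hi_count = round(high * n)
    inside = 0
    for _ in range(reps):
        # One simulated study: n babies, each underweight with prob p_true.
        k = sum(rng.random() < p_true for _ in range(n))
        if lo_count <= k <= hi_count:
            inside += 1
    return inside / reps

print(fraction_inside())
```

Because a count out of 100 can only take whole-number values, the simulated share will not be exactly 0.95, but it should land in that neighborhood.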

Warnings:

This answer uses the normal approximation, which is just an easy-to-calculate approach to the needed confidence interval. It is not necessarily the best. However, as sample sizes get larger, the normal approximation and other "better" confidence intervals get closer and closer to the same answer. In this case, with 100 subjects, the normal approximation is probably somewhere between OK and good. It's not amazingly good, but it isn't bad either.
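To see how close the normal approximation is to one of those alternatives, here is a sketch that computes the Wilson score interval for the same data. The answer above does not name a specific "better" interval; Wilson is picked here only as one common example, and the function name is made up for illustration.

```python
import math
from statistics import NormalDist

def wilson_interval(p_hat, n, confidence=0.95):
    """Wilson score confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

w_low, w_high = wilson_interval(0.15, 100)
print(f"Wilson: ({w_low:.3f}, {w_high:.3f})")  # roughly (0.093, 0.233)
```

For this sample the Wilson interval sits a little to the right of (0.08, 0.22), which fits the "somewhere between OK and good" verdict: the two methods agree on the general range but not on the second decimal place.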
