Correlation vs. Causation: Why do we care about the difference?

Many statistics students have likely heard their instructor say the phrase, "correlation does not mean causation."  As a statistics student, you may have even recited this phrase during a discussion or included the phrase in a written assignment.  But what does this phrase truly mean and why do we care?
Consider this scenario:

An instructor wants to examine the impact that watching the show "The Jersey Shore" has on IQ test performance for a class of Introductory Statistics students. To investigate the relationship, the instructor surveys the classroom and asks students to 1) answer some questions that will be used to calculate their IQ and 2) indicate the amount of time (in hours per week) they spend watching The Jersey Shore. The instructor compares the responses for hours spent watching The Jersey Shore to student performance on the IQ test and finds a moderately strong, negative relationship, r(30) = -.45. Remember, a negative relationship here means that as the number of hours spent watching The Jersey Shore increases, IQ scores tend to decrease.
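For readers who want to unpack the notation r(30) = -.45, here is a minimal Python sketch. It assumes the instructor computed an ordinary Pearson correlation, which the scenario implies but does not state outright; the variable names are only illustrative.

```python
import math

r = -0.45   # correlation reported in the scenario
df = 30     # degrees of freedom: the number in parentheses in r(30)

# For a Pearson correlation, df = n - 2, so we can recover the class size.
n = df + 2

# r squared is the proportion of variance the two variables share.
r_squared = r ** 2

# Test statistic for the null hypothesis that the true correlation is zero.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

print(f"students surveyed: {n}")
print(f"shared variance:   {r_squared:.0%}")
print(f"t({df}) = {t_stat:.2f}")
```

Under that assumption, the class has 32 students and the two variables share roughly 20% of their variance. Note that nothing in this arithmetic says anything about which variable, if either, is doing the causing.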

"A-ha!" the instructor exclaims. "This is why you should not watch that horrible television show. Watching The Jersey Shore is causing your intelligence to decrease!"

Is this instructor correct? Should the government put a stop to airing this horrible television show? Do these results demonstrate that watching The Jersey Shore causes lower intelligence scores, in essence making people less intelligent?

The answer is no.

Here is a CLASSIC case of correlation being mistaken for causation. Why? Although the instructor claims that watching The Jersey Shore causes low IQ scores, isn't it just as plausible that having a lower IQ score causes people to watch The Jersey Shore, or that some third factor (say, time spent studying) influences both? With the study design chosen by the instructor (i.e., measuring IQ and hours spent watching The Jersey Shore at the same point in time through a survey), all of these are possible, and we have no way of knowing which explanation is correct.

What we do know is that the two variables are related, that is, correlated. A different study design would be needed to have a better shot at supporting a claim for causation.
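To make the direction problem concrete, here is a small simulation sketch in Python. Every number in it (the slopes, the noise levels, the sample size) is made up for illustration and is not data from the survey, and the function names are hypothetical. It builds two imaginary worlds, one where watching the show lowers IQ scores and one where IQ scores determine how much people watch, and computes the correlation in each.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def world_tv_lowers_iq(n, rng):
    """Hypothetical world A: hours watched come first and push IQ scores down."""
    hours = [rng.uniform(0, 10) for _ in range(n)]
    iq = [110 - 2 * h + rng.gauss(0, 8) for h in hours]
    return hours, iq


def world_iq_drives_tv(n, rng):
    """Hypothetical world B: IQ comes first; lower scorers choose to watch more."""
    iq = [rng.gauss(100, 10) for _ in range(n)]
    # Hours cannot be negative, so clip at zero.
    hours = [max(0.0, 5 - 0.2 * (q - 100) + rng.gauss(0, 3)) for q in iq]
    return hours, iq


rng = random.Random(42)  # fixed seed so the run is repeatable
r_tv_causes = pearson_r(*world_tv_lowers_iq(5000, rng))
r_iq_causes = pearson_r(*world_iq_drives_tv(5000, rng))

print(f"World A (TV lowers IQ):  r = {r_tv_causes:.2f}")
print(f"World B (IQ drives TV):  r = {r_iq_causes:.2f}")
```

Both worlds produce a clearly negative correlation, even though the causal arrow points in opposite directions. A survey that only records hours and IQ at one point in time cannot tell the two worlds apart.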

Although this example may seem trivial, it is important to recognize this flawed logic because it shows up in claims made by politicians, exercise programs, new diet plans and pills, pharmaceutical salespeople, and so on. Being able to infer causation from a study takes some creativity, planning, and knowledge!

How might you design an experiment to have a better chance at supporting the idea that The Jersey Shore causes decreases in intelligence?


Robert B.

PhD Organizational Research Consultant with Statistics Expertise
