Asked • 08/13/22

Anscombe's Quartet (1973)

A problem involving this dataset appeared recently in one of my student's assignments. You can try it too.


1.) Google "Anscombe", "quartet", and "kaggle". This will lead you to a site where you can download a CSV of the dataset. The variable labeled 'x123' serves as the "x" variable for the first 3 examples.


2.) Run descriptive stats, correlations, and bivariate regressions for the 4 x-y pairs (y1 on x1, y2 on x2, y3 on x3, y4 on x4). Here x1, x2, and x3 are all the 'x123' column.
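If you would rather not hunt down the CSV, step 2 can be sketched with nothing but the Python standard library. The numbers below are the values as published in Anscombe (1973); the helper name `summarize` and the dictionary keys are my own choices for illustration, and the Kaggle file may label its columns differently.

```python
# Values as published in Anscombe (1973). 'x123' follows the post's naming;
# the other variable names are illustrative and may not match the Kaggle CSV.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def summarize(x, y):
    """Descriptive stats, Pearson r, and the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),  # sample variance
        "var_y": syy / (n - 1),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


pairs = {"1": (x123, y1), "2": (x123, y2), "3": (x123, y3), "4": (x4, y4)}
results = {name: summarize(x, y) for name, (x, y) in pairs.items()}

for name, stats in results.items():
    print("problem", name, {key: round(val, 3) for key, val in stats.items()})
```

Printing to three decimals is deliberate: the four rows only start to differ around the third decimal place, which is the point of step 3.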


3.) You will find that the descriptive stats, correlations, and regression equations for the 4 problems are nearly identical. Let's dig deeper.


4.) Make scatterplots for each of the 4 problems. What do you see?
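Any plotting tool will do for step 4. If none is handy, here is a rough sketch of a character-grid scatterplot using only the standard library. The function `text_scatter` and its `width` and `height` parameters are hypothetical names of mine, and the picture is crude compared with a real plotting library. It is shown on problem 4, using the values published in Anscombe (1973).

```python
# Problem 4 values as published in Anscombe (1973).
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def text_scatter(x, y, width=40, height=12):
    """Return a crude scatterplot as a list of strings (top row = largest y).

    Assumes x and y each contain at least two distinct values; points that
    land in the same cell are drawn as a single '*'.
    """
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    grid = [[" "] * width for _ in range(height)]
    for a, b in zip(x, y):
        col = round((a - x_min) / (x_max - x_min) * (width - 1))
        row = round((b - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return ["".join(cells) for cells in grid]


lines = text_scatter(x4, y4)
for line in lines:
    print("|" + line + "|")
```

Even at this resolution, problem 4 shows up as one vertical stack of points plus a single point far to the right.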


5.) Hints:

  1. problem 1. The only OK one in the set.
  2. problem 2. Add a particular transformation of x2 to your original equation and you'll perfectly explain the data. New equation: y2 = x2 + transformed_x2
  3. problem 3. Remove one x3, y3 pair and obtain a perfect prediction.
  4. problem 4. Remove one x4, y4 pair and discover the case is hopeless.
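Spoiler warning: the sketch below works through hints 2 to 4, so try them yourself first. It rests on my own reading of the hints, which the post does not spell out: that the transformation in problem 2 is the square of x, that the pair to drop in problem 3 is the one with the largest y3, and that the pair to drop in problem 4 is the one with x4 = 19. The helper names (`fit_line`, `solve`, `fit_quadratic`) are illustrative, and the data are the values published in Anscombe (1973).

```python
# Values as published in Anscombe (1973).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def fit_line(x, y):
    """Return (slope, intercept, r), or None when x has no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0:
        return None  # every x is the same, so no slope can be estimated
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][c] * z[c] for c in range(i + 1, n))) / M[i][i]
    return z


def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    cols = [[1.0] * len(x), list(x), [v * v for v in x]]
    A = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(A, rhs)


# Hint 2: add x squared (assumed transformation) and check the residuals.
quad = fit_quadratic(x123, y2)
max_resid = max(abs(b - (quad[0] + quad[1] * a + quad[2] * a * a))
                for a, b in zip(x123, y2))

# Hint 3: drop the pair with the largest y3 (assumed outlier) and refit.
i = y3.index(max(y3))
line3 = fit_line(x123[:i] + x123[i + 1:], y3[:i] + y3[i + 1:])

# Hint 4: drop the pair with x4 = 19 (assumed) and try to refit.
j = x4.index(19)
line4 = fit_line(x4[:j] + x4[j + 1:], y4[:j] + y4[j + 1:])

print("problem 2 coefficients (a, b, c):", [round(v, 4) for v in quad])
print("problem 2 largest residual:", round(max_resid, 4))
print("problem 3 without the outlier (slope, intercept, r):", line3)
print("problem 4 without x4 = 19:", line4)
```

Under these assumptions, the last line prints `None` because every remaining x4 value equals 8, so there is no variation in x left to regress on.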



2 Answers By Expert Tutors

Konopelski B. answered • 08/14/22

Statistics, Finance, SPSS, SAT Math and physics Tutor
