Statistics: Linear Regression

Question

You wish to determine if there is a positive linear correlation between the age of a driver and the number of driver deaths. The following table represents the age of a driver and the number of driver deaths per 100,000. Use a significance level of 0.05 and round all values to 4 decimal places.

Driver Age    # of Driver Deaths per 100,000
30            33
73            25
67            23
39            19
21            23
32            32
65            20

H₀: ρ = 0

Hₐ: ρ > 0

Find the linear Correlation Coefficient and the p-value.

Nicholas T. · Accepted Answer

We first need to find the linear correlation coefficient, r, between driver age and driver deaths per 100,000. This can be done using technology, such as a graphing calculator or Excel. I will use Excel, plugging the variables into separate columns, then using the CORREL formula to obtain the correlation coefficient. This results in:

r = -0.3900 (rounded to 4 decimal places).
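If you would rather check the CORREL result outside of Excel, here is a minimal Python sketch that applies the usual definition of the Pearson correlation coefficient to the table from the question. The function name pearson_r and the list names are my own choices for illustration, not anything from Excel.

```python
import math

# Data from the question's table
ages = [30, 73, 67, 39, 21, 32, 65]
deaths = [33, 25, 23, 19, 23, 32, 20]


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(ages, deaths)
print(round(r, 4))
```

The printed value should agree with the CORREL result to the rounding shown above.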

We then want to find the test statistic for a hypothesis test for significance of correlation. This is calculated as:

t=r*sqrt[(n-2)/(1-r^2)]

Here r is the correlation coefficient and n is the number of data pairs. Plugging in r = -0.390 and n = 7, we get a test statistic of t = -0.947.
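The formula above can be sketched in a few lines of Python. The function name correlation_t_statistic is just a label I picked. The inputs are the r and n used in this answer.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing H0: rho = 0; it has n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# r and n as used in the answer above
t = correlation_t_statistic(-0.390, 7)
print(round(t, 3))
```

This should print a value of roughly -0.947, matching the test statistic above.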

The last thing to do is use the test statistic to find the p-value. Because this is a right-tailed test (we have > in the alternative hypothesis), the p-value is the area to the right of the test statistic. Under the null hypothesis, the test statistic follows a t distribution with n-2 degrees of freedom. We can find this p-value in Excel by using the following formula:

=T.DIST.RT(-0.947,5)

Here -0.947 is the test statistic and 5 is the degrees of freedom (n - 2 = 7 - 2). This results in a p-value of 0.8064. Since 0.8064 is not less than the significance level of 0.05, we fail to reject the null hypothesis. We cannot conclude that the correlation is positive (and since the sample correlation r was negative, we would never conclude that it was positive).
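For anyone without Excel, the right-tail area can be approximated with only the Python standard library. This is a rough sketch, not how T.DIST.RT works internally: it numerically integrates the Student t density with Simpson's rule. The names t_pdf and right_tail_p, and the step count, are my own assumptions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def right_tail_p(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on the interval from 0 to t."""
    # steps must be even; h is negative when t < 0, so the signed area is too
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # Half of the distribution lies to the right of 0
    return 0.5 - area_0_to_t


alpha = 0.05
p_value = right_tail_p(-0.947, 5)
print(round(p_value, 3))
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The first printed number should be close to the Excel result above, and the second line should report that we fail to reject the null.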

Let me know if you have any questions or if anything is unclear!

Michael D. · Answer

If you plot the data (which is a habit you should strive to develop, even if most exercises like this one will let you get away without doing so), there appears to be no relationship between the two variables.

Since we now have very powerful computational technology available, you should never have to compute a correlation by hand. If you don't have a preference, this is a good choice (and will also plot your data automatically):

https://www.desmos.com/calculator/kg2qmphj2s

The explanatory variable is Driver Age; enter those values in the x₁ column. The number of deaths per 100,000 is the response variable; enter those values in the y₁ column. The values of r² (Determination) and r (Correlation) will be shown in the cell below the table. Note that since r is negative, your P-Value for a right-tailed alternative is guaranteed to be greater than 50%, so you wouldn't really need to compute it...

...but since the question requires it, first find the student T-score for this sample, using the formula:

t = r*sqrt((n-2)/(1-r^2))

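where n is the sample size (number of pairs, which is 7 for this sample) and r is the correlation found above. You should get t = -0.9470 if you carry the unrounded r through the formula (plugging in the rounded r = -0.3900 gives -0.9471).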

Finally, use the T-distribution calculator of your choice (with df = n - 2 = 5) to find the area to the right (same direction as the alternative hypothesis) of the above T-score. You should get about 0.8064 (or 0.8065 if you carry the rounded t = -0.9471), which is roughly 80.6%. This is your P-Value.
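If you want to see how much intermediate rounding matters in the fourth decimal place, here is a hedged Python sketch. It uses a closed-form expression for the t distribution that is valid only for exactly 5 degrees of freedom (the standard odd-df series, truncated for df = 5), so it is a check for this particular problem and not a general calculator. The function name right_tail_t5 is made up for this example.

```python
import math


def right_tail_t5(t):
    """P(T > t) for Student's t with exactly 5 degrees of freedom."""
    theta = math.atan(t / math.sqrt(5))
    c = math.cos(theta)
    # Signed central area between -|t| and |t| (negative when t < 0)
    central = (2 / math.pi) * (theta + math.sin(theta) * (c + (2 / 3) * c ** 3))
    return 0.5 - central / 2


ages = [30, 73, 67, 39, 21, 32, 65]
deaths = [33, 25, 23, 19, 23, 32, 20]
n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(deaths) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, deaths))
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in deaths)

# No intermediate rounding anywhere
r = sxy / math.sqrt(sxx * syy)
t_exact = r * math.sqrt((n - 2) / (1 - r ** 2))
p_exact = right_tail_t5(t_exact)

# Same tail area, but starting from a t-score already rounded to -0.9471
p_from_rounded_t = right_tail_t5(-0.9471)

print(round(t_exact, 4), round(p_exact, 4), round(p_from_rounded_t, 4))
```

The two tail areas differ only around the fourth or fifth decimal place, so either way the P-Value is far above 0.05 and the conclusion does not change.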
