Hi Chris,
Great question! At the core of your statistical question is the idea of "prediction," which is a strong hint about which test to run. In this problem, your predictor variable is grade and your outcome variable is Trying1 in 3015.
Since both the predictor and the outcome are continuous variables, you would want to use a simple linear regression. I am not sure whether your course has covered testing assumptions before choosing a statistical test yet, but as a quick outline, here are some things to consider before deciding on a test:
- Measurement of variables (I am not completely sure from your question how Trying1 is measured, so be sure to check this)
- Normality (how are these variables distributed? If they are not normally distributed, this would justify non-parametric testing, and you would switch to Spearman's rho instead of Pearson's r). To determine normality you could use the Shapiro-Wilk or Kolmogorov-Smirnov test (depending on sample size), visual inspection of a Q-Q plot, skew, and kurtosis. Please let me know if you have questions on these specifically!
- Linearity and homoscedasticity (a roughly straight-line relationship, and equal variance of the residuals, both of which you can check with a residuals plot)
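If it helps to see the normality checks above in code, here is a rough Python sketch. It assumes you have NumPy and SciPy available, and the scores are invented purely so the example runs; `normality_summary` and `regression_residuals` are helper names I made up, not something from your course. It runs the checks on the regression residuals, but the same function works on either raw variable.

```python
import numpy as np
from scipy import stats


def normality_summary(values):
    """Return Shapiro-Wilk results plus skew and excess kurtosis for one sample."""
    values = np.asarray(values, dtype=float)
    w_stat, p_value = stats.shapiro(values)
    return {
        "shapiro_W": float(w_stat),
        "shapiro_p": float(p_value),  # a small p-value suggests non-normality
        "skew": float(stats.skew(values)),  # near 0 for a symmetric distribution
        "excess_kurtosis": float(stats.kurtosis(values)),  # near 0 for a normal distribution
    }


def regression_residuals(x, y):
    """Fit y = intercept + slope * x by least squares and return the residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    fit = stats.linregress(x, y)
    return y - (fit.intercept + fit.slope * x)


# Hypothetical scores, invented only to make the sketch runnable.
grade = [55, 62, 68, 70, 74, 78, 81, 85, 90, 93]
trying1 = [48, 60, 59, 71, 69, 80, 77, 88, 86, 95]

residuals = regression_residuals(grade, trying1)
print(normality_summary(residuals))
```

For linearity and homoscedasticity, you would plot these residuals against the fitted values and look for a shapeless, evenly spread cloud of points.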
You mentioned a couple of other tests and here is why these would not make sense:
- Independent t-test: this compares the means of two groups; it does not predict one variable from another
- Wilcoxon signed-rank: this is a non-parametric test for non-normally distributed data that looks at pre/post differences (similar to a paired-samples t-test)
- Correlation: this measures the association between two variables, not prediction
- Stepwise regression: this requires multiple predictors, whereas in your example you only have one.
To give insight on some of your other questions:
How accurate is this information? You will want to look at the equation for the standard error of the estimate, with smaller values indicating a more accurate prediction (because there is less prediction error).
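As a rough illustration of that equation, the standard error of the estimate for a simple regression is the square root of the residual sum of squares divided by n - 2. The sketch below assumes NumPy, and the function name and scores are made up for the example.

```python
import numpy as np


def standard_error_of_estimate(x, y):
    """Return sqrt(SS_residual / (n - 2)) for a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # least-squares line
    residuals = y - (intercept + slope * x)
    # n - 2 degrees of freedom: one lost for the slope, one for the intercept
    return float(np.sqrt(np.sum(residuals ** 2) / (len(x) - 2)))


# Hypothetical scores, invented only to make the sketch runnable.
grade = [55, 62, 68, 70, 74, 78, 81, 85, 90, 93]
trying1 = [48, 60, 59, 71, 69, 80, 77, 88, 86, 95]
print(standard_error_of_estimate(grade, trying1))
```

The result is in the same units as Trying1, so you can read it as the typical size of a prediction miss.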
What is the .95 CI for the Trying1 and grade correlation? Look at the equation for Fisher's r-to-z transformation and see if you can plug your numbers into it.
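Here is a minimal sketch of that calculation using only the Python standard library. The steps are: transform r to z with the inverse hyperbolic tangent, build the interval on the z scale using a standard error of 1 / sqrt(n - 3), then transform the limits back to r. The function name and the r and n values are placeholders, not your actual numbers.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via Fisher's r-to-z transformation."""
    z = math.atanh(r)  # r-to-z
    se = 1.0 / math.sqrt(n - 3)  # standard error on the z scale
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    lower, upper = z - z_crit * se, z + z_crit * se
    return math.tanh(lower), math.tanh(upper)  # z-to-r


# Placeholder values: swap in your own r and sample size.
lower, upper = fisher_ci(r=0.5, n=28)
print(lower, upper)
```

Notice that the interval is not symmetric around r once you transform back, which is expected.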
State the coefficient of determination and interpret the meaning of this value: The coefficient of determination is your r^2. For example, if r^2 is .20, you would interpret that as 20% of the variance in Trying1 scores being accounted for by 2nd year statistics grades (the remaining 80% is due to other factors that were not included in your model, plus random error).
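To make that concrete, here is a small sketch (assuming NumPy, with invented scores and a made-up function name) that computes r^2 by squaring the Pearson correlation. With a single predictor, this equals 1 minus the residual sum of squares over the total sum of squares.

```python
import numpy as np


def coefficient_of_determination(x, y):
    """Return r^2 for a simple linear regression, computed as the squared Pearson r."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)


# Hypothetical scores, invented only to make the sketch runnable.
grade = [55, 62, 68, 70, 74, 78, 81, 85, 90, 93]
trying1 = [48, 60, 59, 71, 69, 80, 77, 88, 86, 95]

r_squared = coefficient_of_determination(grade, trying1)
print(f"{r_squared:.0%} of the variance in Trying1 is accounted for by grade")
```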
Feel free to reach out if you have any additional questions or need help walking through any of these steps more specifically!