
Patrick B. answered 12/30/18
Math and computer tutor/teacher
To compute a correlation, you must have the same number of men and women in the sample.
Remember, the data are supposed to be ordered pairs on the scatter plot.
Since there are 20 values, you should have 10 data points (X,Y), where X is a
man's income and Y is a woman's income.
You labeled some of the values MALE instead of FEMALE, so you must
figure out which ones are mislabeled. Once you do,
you can go to the following website, input the data, and the calculator will
find the Pearson correlation coefficient r for you, per the formula.
https://www.socscistatistics.com/tests/pearson/Default2.aspx
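If you would rather check the calculator by hand, here is a minimal sketch of the standard Pearson formula in Python. The function name `pearson_r` and the sample incomes are made up for illustration; they are NOT the data from the question, which still has to be relabeled first.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired values xs[i], ys[i]."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")
    if len(xs) < 2:
        raise ValueError("need at least two (X, Y) pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical placeholder incomes, not the poster's data.
men = [500, 600, 700, 800]
women = [450, 580, 640, 790]
r = pearson_r(men, women)
print(f"r = {r:.4f}")
```

Note that the function refuses lists of different lengths, which is the same reason 12 men and 8 women cannot be plotted as ordered pairs.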
Another option to explore here is a hypothesis test comparing
the MEAN incomes of MEN vs. WOMEN.
The null hypothesis is that there is no difference between the means.
The alternative hypothesis is that there is a difference between the means.
For the men, the mean is 658.33 with variance var1 = 24924.24 and standard deviation 157.8741.
For the women, the mean is 593.75 with variance var2 = 33883.93 and standard deviation 184.0759.
As labeled, N1 = 12 and N2 = 8.
The standard error is SE = sqrt(var1/N1 + var2/N2) = 79.45131.
The test statistic is t = ((mean1 - mean2) - D) / SE,
where D is the hypothesized difference, which in this case is D = 0.
So the test statistic is t = (658.33 - 593.75) / 79.45131,
which is approximately 0.813.
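The arithmetic above can be reproduced with a short Python sketch. It only uses the summary statistics quoted in this answer; the variable names are my own, and the result can differ from 0.812867 in the fourth decimal because the mean here is rounded to 658.33.

```python
import math

# Summary statistics quoted in the answer (group 1 = men, group 2 = women).
mean1, var1, n1 = 658.33, 24924.24, 12
mean2, var2, n2 = 593.75, 33883.93, 8
d0 = 0.0  # hypothesized difference between the means

# Standard error of the difference, with unequal variances.
se = math.sqrt(var1 / n1 + var2 / n2)

# Test statistic: note the parentheses around the whole numerator.
t_stat = ((mean1 - mean2) - d0) / se

print(f"SE = {se:.5f}")
print(f"t  = {t_stat:.3f}")
```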
That test statistic does not lie within the rejection region at
any conventional significance level, so we fail to reject the null
hypothesis: the data do not show a significant difference between the means.
A normal distribution is assumed, even though the sample has only
N = 20 < 30 values. The t-table gives the same result, even with a
conservative min(N1, N2) - 1 = 7 degrees of freedom, although the
official (Welch-Satterthwaite) formula requires a much more tedious
calculation for the degrees of freedom.
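For reference, the tedious degrees-of-freedom calculation is usually the Welch-Satterthwaite formula. Here is a rough Python sketch using the variances and sample sizes quoted above; the variable names are my own, and the critical value 2.365 is the standard t-table entry for 7 degrees of freedom at a two-tailed 5% level.

```python
# Variances and sample sizes quoted in the answer.
var1, n1 = 24924.24, 12   # men
var2, n2 = 33883.93, 8    # women

a = var1 / n1
b = var2 / n2

# Welch-Satterthwaite approximation (generally not a whole number).
df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Simpler alternatives often used with a printed t-table.
df_conservative = min(n1, n2) - 1   # smallest reasonable choice
df_pooled = n1 + n2 - 2             # only if equal variances are assumed

print(f"Welch df        = {df_welch:.2f}")
print(f"Conservative df = {df_conservative}")
print(f"Pooled df       = {df_pooled}")

# t-table critical value for df = 7, two-tailed alpha = 0.05.
t_crit_df7 = 2.365
t_stat = 0.813  # test statistic from the answer, rounded
print("Reject null?", abs(t_stat) > t_crit_df7)
```

The Welch value falls between the conservative and pooled choices, at roughly 13.5. Since the critical value only shrinks as the degrees of freedom grow and t is well below it even at 7, the conclusion does not depend on which one you pick.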
Please repost