In Statistics, what is a rank sum? How is it calculated? And what is it useful for?
Let me start with an example. We have the following observations about 2 populations:
Population A: 8.50, 9.48, 8.65, 8.16, 8.83, 7.76, 8.63 (nA=7)
Population B: 8.27, 8.20, 8.25, 8.14, 9.00, 8.10, 7.20, 8.32, 7.70 (nB=9)
We would like to test whether the two populations have different means, or whether any observed difference is purely due to chance in the particular samples we happened to draw. If it can be assumed that the data points come from a normal distribution, we could use the well-known two-sample t-test to check whether the means of the two populations are the same. In particular, we want to test the following:
H0: μA = μB vs. H1: μA ≠ μB (a two-tailed test!)
Most students (and practitioners alike!) make the assumption of normality because it simplifies their lives. But normality often does not hold, and with small samples in particular it is hard to verify; assuming it anyway amounts to using the wrong procedure to test hypotheses. This is where non-parametric techniques come in. And, in this particular case, the (Wilcoxon) rank sum test is your technique.
How does it work?
1) order the observations for the two populations from smallest to largest
2) assign to each observation its rank, i.e. 1 to the smallest observation, 2 to the second smallest, and so on.
In our case we have nA+nB = 7+9 = 16 observations, so we will assign ranks from 1 to 16 to our observations (each observation is labeled with the population it comes from, A or B):
7.20 (B), 7.70 (B), 7.76 (A), 8.10 (B), 8.14 (B), 8.16 (A), 8.20 (B), 8.25 (B), 8.27 (B), 8.32 (B), 8.50 (A), 8.63 (A), 8.65 (A), 8.83 (A), 9.00 (B), 9.48 (A)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
3) sum the ranks of the observations from population A, i.e.,
Rank-sum A= 3+6+11+12+13+14+16 = 75
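Steps 1 to 3 can be checked with a few lines of Python. This is only a minimal sketch using the data from the example; the variable names are my own, and it assumes there are no tied values (true for this data set).

```python
# Observations copied from the example above.
pop_a = [8.50, 9.48, 8.65, 8.16, 8.83, 7.76, 8.63]
pop_b = [8.27, 8.20, 8.25, 8.14, 9.00, 8.10, 7.20, 8.32, 7.70]

# Steps 1 and 2: pool the observations, sort them, and rank from 1.
# (Assumes no ties, which holds for this data set.)
pooled = sorted([(x, "A") for x in pop_a] + [(x, "B") for x in pop_b])
ranked = [(rank, value, group) for rank, (value, group) in enumerate(pooled, start=1)]

# Step 3: sum the ranks that belong to population A.
ranks_a = [rank for rank, _, group in ranked if group == "A"]
rank_sum_a = sum(ranks_a)

print(ranks_a)     # [3, 6, 11, 12, 13, 14, 16]
print(rank_sum_a)  # 75
```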
4) If H0: A=B is true, we would expect our 16 observations to be well interspersed, with neither population's ranks clustering at the low end or the high end. Hence the problem is to decide whether seeing a value like 75 for the sum of the ranks is enough evidence against H0.
(What is the minimum value for the rank sum? What is the maximum? Very simple: assume that the 7 observations from population A occupy the lowest 7 ranks, or the highest 7 ranks. This will give you the possible range of values for this particular rank sum.)
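If you want to check your answer, the two extremes can be computed directly. A small sketch, with variable names of my own choosing:

```python
n_a, n_b = 7, 9
n_total = n_a + n_b

# Population A holds the 7 lowest ranks (1..7) or the 7 highest ranks (10..16).
min_rank_sum = sum(range(1, n_a + 1))
max_rank_sum = sum(range(n_total - n_a + 1, n_total + 1))

print(min_rank_sum, max_rank_sum)  # 28 91
```

The observed value of 75 sits toward the upper end of this range, and the remaining steps decide whether it is far enough out to count as evidence against H0.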
5) We need to use special tables. These tables are built under the assumption that H0 is true, for many different sample sizes nA and nB. For each pair of sizes, all possible assignments of ranks to one population are considered and the resulting distribution of its rank-sum is worked out (for you! Hooray!), so you can simply look up the percentiles.
You would look in your table (which you must get from a book!): find nA=7 and nB=9, and decide on a confidence level that suits your problem, say 95%, so that the significance level is α=5% (since you have a two-tailed test you would look for 2.5% on each side!). Then read off the lower and upper critical values from the table; in this case you would find: lower = 40, upper = 79.
6) Your rank-sum value of 75 falls inside this interval (40 < 75 < 79)! So, at least for your significance level, you fail to reject H0 in favor of H1.
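For samples this small you do not strictly need the book: the table entries can be reproduced by brute force. The sketch below enumerates every way of handing 7 of the 16 ranks to population A, which is how such tables are built under H0. The names are my own, and I am assuming the common convention that H0 is rejected when the rank sum is at or beyond a tabulated value; some tables use a slightly different convention, so check the one in your book.

```python
from collections import Counter
from itertools import combinations

n_a, n_b = 7, 9
alpha = 0.05
observed = 75

# Under H0, every choice of 7 ranks out of 1..16 for population A is equally likely.
counts = Counter(sum(c) for c in combinations(range(1, n_a + n_b + 1), n_a))
total = sum(counts.values())  # C(16, 7) = 11440 equally likely assignments


def lower_tail(t):
    """P(rank sum <= t) under H0."""
    return sum(n for s, n in counts.items() if s <= t) / total


# Largest rank sum whose lower-tail probability does not exceed alpha / 2.
lower = max(t for t in counts if lower_tail(t) <= alpha / 2)
# The null distribution is symmetric about n_a * (n_a + n_b + 1) / 2 = 59.5,
# so the upper critical value mirrors the lower one.
upper = n_a * (n_a + n_b + 1) - lower

# Assumed convention: reject when the rank sum is at or beyond a critical value.
reject = observed <= lower or observed >= upper
print(total, lower, upper, reject)  # 11440 40 79 False
```

This matches the table values of 40 and 79 and the decision in step 6. Enumeration becomes impractical quickly as nA and nB grow, which is one reason printed tables and approximations exist.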
Why do we bother with all this work? Well, this test is valid even if the data do not come from a normal distribution, and it is also much less sensitive to outliers than the two-sample t-test. For homework this may very well be just a nuisance, but in real-life applications where precision is paramount, this is great news.
7) I only gave you the flavor of the method. We should also deal with what to do if two or more observations have the same value and therefore tie for a rank, whether for large nA and nB we can use a simpler normal approximation, and so on, but then things become very technical and I will stop here. I suggest you get yourself a good book on non-parametric statistics from a library and peruse it. While there is generally little knowledge about these methods outside of a community of professional statisticians, they do serve a purpose!
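For the curious, here is roughly what the normal approximation mentioned above looks like when applied to our example. Treat it as a sketch only: it uses the standard mean nA(nA+nB+1)/2 and variance nA·nB(nA+nB+1)/12 of the rank sum under H0, assumes no ties, applies no continuity correction, and our samples are really too small for it to be more than a rough check. The variable names are my own.

```python
import math

n_a, n_b = 7, 9
rank_sum_a = 75

# Mean and variance of population A's rank sum under H0 (no ties assumed).
mean = n_a * (n_a + n_b + 1) / 2             # 59.5
variance = n_a * n_b * (n_a + n_b + 1) / 12  # 89.25

# Standardized statistic, without a continuity correction.
z = (rank_sum_a - mean) / math.sqrt(variance)

# Two-sided p-value from the standard normal CDF, written with math.erf.
normal_cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
p_two_sided = 2 * (1 - normal_cdf)

print(round(z, 2))  # 1.64
```

The resulting two-sided p-value is roughly 0.10, which is above 5% and so agrees with the table-based conclusion that we fail to reject H0.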
Hope this gives you some flavor for the topic.