Mann-Whitney U-Test critical values for very large samples

Question

I am trying to perform a Mann-Whitney U test on rather large samples (N=53). Is anybody aware of a resource giving critical values for n>30? Thank you.

score 5 · Answer 1 · edited Sep 14 '13 at 19:17

See this Wikipedia article. For large samples (N > 20), U is approximately normally distributed, so the standardized statistic $z = \frac{U - m_U}{\sigma_U}$ is approximately standard normal, where

$m_U = \frac{n_1n_2}{2}$

and

$\sigma_U = \sqrt{\frac{n_1n_2(n_1 + n_2 + 1)}{12}}$
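As a rough sketch, the formulas above can be turned into a few lines of Python. The function names `mann_whitney_normal_approx` and `approx_critical_u` are made up for illustration, and the code assumes no ties and applies no continuity correction, so treat the results as approximate.

```python
import math
from statistics import NormalDist


def mann_whitney_normal_approx(u, n1, n2):
    """Return (z, two-sided p) for an observed U.

    Uses the plain large-sample normal approximation:
    no tie correction and no continuity correction (assumption).
    """
    m_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - m_u) / sigma_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def approx_critical_u(n1, n2, alpha=0.05):
    """Approximate lower critical value of U for a two-sided test.

    An observed U (the smaller of the two U values) below this
    number would be significant at level alpha under the approximation.
    """
    m_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return m_u - z_crit * sigma_u
```

For example, `approx_critical_u(53, 53)` gives an approximate two-sided 5% critical value if both groups have 53 observations; if 53 is the total sample size, pass the two actual group sizes instead.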

answered Sep 14 '13 at 18:13

Peter Flom

score 3 · Answer 2 · answered Sep 14 '13 at 19:19

Others have mentioned that the normal approximation is pretty good for large sample sizes.

Another option is to use the fact that the MW test is a special case of a permutation test; the original tables were constructed by looking at all the possible permutations for the given sample sizes. So for your sample sizes you can enumerate all the ways of splitting the pooled data into the two groups and count how many give a statistic at least as extreme as the observed one. Note, however, that the number of combinations increases very quickly. For example, with 2 samples of size 15 (30 total) there are 155,117,520 possible combinations, which is doable with modern computers, but not quick. Enumeration is actually easier with unbalanced samples: if one sample has 35 and the other 5, then there are only 658,008 possible combinations.
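To make the enumeration idea concrete, here is a minimal sketch in Python. The helper name `exact_two_sided_p` is hypothetical, the code assumes all observations are distinct (no ties), and it works with the rank sum of the first group, which differs from U only by a constant. It is only practical for small samples.

```python
from itertools import combinations
from math import comb


def exact_two_sided_p(x, y):
    """Exact two-sided permutation p-value (assumes no tied values)."""
    pooled = list(x) + list(y)
    n1, n = len(x), len(pooled)
    # Rank 1 = smallest value; only valid when all values are distinct.
    order = sorted(range(n), key=pooled.__getitem__)
    ranks = [0] * n
    for r, idx in enumerate(order, start=1):
        ranks[idx] = r
    center = n1 * (n + 1) / 2  # null mean of the rank sum of group 1
    observed = abs(sum(ranks[:n1]) - center)
    # Count the splits whose rank sum is at least as far from the null mean.
    extreme = sum(
        1
        for combo in combinations(ranks, n1)
        if abs(sum(combo) - center) >= observed
    )
    return extreme / comb(n, n1)


# Number of splits that would have to be enumerated:
print(comb(30, 15))  # two samples of 15
print(comb(40, 5))   # samples of 35 and 5
```

The two `comb` calls reproduce the counts quoted above, which is a quick way to check whether full enumeration is feasible before attempting it.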

You can also get an approximate value by sampling. Randomly permute the values/ranks between the 2 groups and compute the stat, repeat a bunch of times (9,999 or so) and use that information to compute your p-value or critical value. More permutations will give you more precision.
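A minimal sketch of this sampling approach, using only the Python standard library, might look like the following. The names `u_statistic` and `monte_carlo_p` are illustrative, and counting ties as one half is an assumption on my part rather than something specified above.

```python
import random


def u_statistic(x, y):
    """U for x: pairs (x_i, y_j) with x_i > y_j, ties counted as 1/2."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


def monte_carlo_p(x, y, n_perm=9999, seed=0):
    """Approximate two-sided permutation p-value by random relabelling."""
    rng = random.Random(seed)
    n1, n2 = len(x), len(y)
    center = n1 * n2 / 2  # null mean of U
    observed = abs(u_statistic(x, y) - center)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of values to groups
        u = u_statistic(pooled[:n1], pooled[n1:])
        if abs(u - center) >= observed:
            hits += 1
    # Include the observed arrangement itself, so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)
```

Recomputing U from scratch on every shuffle costs n1 × n2 comparisons per permutation, so for groups of around 50 this pure-Python version may take a while; ranking the pooled data once and summing ranks would be faster.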

In most cases, though, the result will differ little from the normal approximation.

score 1 · Answer 3 · answered Sep 14 '13 at 18:11

For sample sizes above 20, the normal distribution is a fairly good approximation.

Check out the Wikipedia page for more about it:

http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Calculations

Thank you. Just one more question, and sorry for my ignorance - "using the normal distribution is fairly good approximation" means that I ought to use the parametric analogue, i.e. the Student's t-test? Thank you. – Vojtech K Sep 14 '13 at 18:54
No, it means that once you have calculated the test statistic and standardized it, the result should approximately follow a standard normal distribution. As @Peter said, your test statistic should be: $$U^*=\frac{U-m_U}{\sigma_U}$$ – Sep 14 '13 at 18:57
