Pair-wise Mann-Whitney U vs. ANOVA

Question

I'm having trouble interpreting the results of comparing 4 sequences. All are fairly similar (in terms of mean) and I want to figure out whether one can say that their difference in mean is not statistically significant. Even though they look normally distributed, I originally did not assume so. I compared the sequences pair-wise (adjacent ones with respect to their mean values) using the Mann-Whitney U rank-sum test. Assuming a significance level of alpha=0.05, two of the three comparisons failed to reject H0, so for those pairs the difference in mean was not statistically significant. For the third comparison (c vs. d), I obtained a p-value of ~0.007, which led me to believe that the difference is statistically significant.

Now, I thought I could come to the same conclusion using ANOVA, but for some reason I cannot. The p-value I get from a one-way ANOVA is ~0.06 which, given my alpha=0.05, would mean H0 cannot be rejected (i.e., according to my interpretation, those 4 sequences are drawn from the same population and don't differ).

Why would the pair-wise Mann-Whitney U tests and ANOVA come to different conclusions? Or is a p-value of ~0.06 within the deviation one can expect (i.e., is it close enough to 0.05)?

Here are my tests using scipy:

import numpy as np
import scipy.stats

a = np.asarray(
[-0.00913474, -0.00451146, -0.00572025, -0.00499671, -0.00517387,
 -0.00211881, -0.0061153 , -0.00872848, -0.00420962, -0.0049735 ,
 -0.01087468, -0.0038722 , -0.00515165, -0.00710934, -0.00563324,
 -0.00412264, -0.00149319, -0.00347968, -0.00394813, -0.        ,
 -0.00191736, -0.0068053 , -0.00121973, -0.00429468, -0.00562144,
 -0.00546447, -0.00815879, -0.00619868, -0.0028855 , -0.0051365 ]
)

b = np.asarray(
[-0.0089    , -0.00768003, -0.00684355, -0.00618331, -0.00167988,
 -0.00377895, -0.00582698, -0.00569348, -0.00397963, -0.00516734,
 -0.00561272, -0.00456083, -0.00400906, -0.00625585, -0.00028588,
 -0.00534019, -0.00456945, -0.00635621, -0.00595439, -0.00858971,
 -0.00798386, -0.00616486, -0.00118706, -0.00879095, -0.00409645,
 -0.00788892, -0.00389249, -0.00603958, -0.00600114, -0.00038731]
)

c = np.asarray(
[-0.01134304, -0.00268032, -0.00672718, -0.00558661, -0.00483106,
 -0.00716351, -0.00547393, -0.00577063, -0.00946544, -0.00303906,
 -0.00506249, -0.00255743, -0.00165606, -0.0046987 , -0.00441018,
 -0.00793861, -0.01051742, -0.00409939, -0.00668221, -0.00498903,
 -0.00480866, -0.00365159, -0.00432343, -0.00667153, -0.00142702,
 -0.00367953, -0.00732742, -0.0050567 , -0.00375138, -0.00444236]
)

d = np.asarray(
[-0.00431811, -0.00715759, -0.00564569, -0.00999856, -0.00599809,
 -0.0056927 , -0.0063155 , -0.00631829, -0.00456755, -0.00799021,
 -0.0067991 , -0.00614617, -0.01040599, -0.00770959, -0.00422278,
 -0.00958456, -0.00255951, -0.00399078, -0.00646359, -0.00688078,
 -0.00539324, -0.00721919, -0.00713238, -0.00720869, -0.00741503,
 -0.00532616, -0.00656618, -0.00626103, -0.00697652, -0.00572527]
)


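# Pair-wise rank-based tests on groups adjacent in mean, then a one-way ANOVA.
# Note: the default `alternative` of mannwhitneyu differs between SciPy
# versions, so a recent install may not reproduce the output below exactly.
print('a vs b', scipy.stats.mannwhitneyu(a, b))
print('b vs c', scipy.stats.mannwhitneyu(b, c))
print('c vs d', scipy.stats.mannwhitneyu(c, d))
print('anova', scipy.stats.f_oneway(a, b, c, d))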

Output:

  a vs b MannwhitneyuResult(statistic=383.0, pvalue=0.16276329338957607)
  b vs c MannwhitneyuResult(statistic=406.0, pvalue=0.2600723060808019)
  c vs d MannwhitneyuResult(statistic=283.0, pvalue=0.0069158099014877067)
  anova F_onewayResult(statistic=2.6042369485095125, pvalue=0.055212539698250386)
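
One thing that may help when comparing three pair-wise tests with a single omnibus test is to adjust the pair-wise p-values for multiplicity. The sketch below applies a Holm step-down adjustment to the three p-values copied from the output above. The helper name `holm_adjust` is made up for this illustration, and the sketch assumes the three values are the kind of p-value you intend to report (one-sided vs. two-sided depends on the SciPy version and arguments used).

```python
# Holm step-down adjustment for the three pair-wise p-values printed above.
# Keys are just labels; values are copied verbatim from the posted output.
pvalues = {
    "a vs b": 0.16276329338957607,
    "b vs c": 0.2600723060808019,
    "c vs d": 0.0069158099014877067,
}


def holm_adjust(pvals):
    """Return Holm-adjusted p-values in the same order as the input list."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply the (rank+1)-th smallest p by (m - rank), cap at 1,
        # and enforce that adjusted values never decrease along the ordering.
        candidate = min(1.0, (m - rank) * pvals[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


labels = list(pvalues)
adjusted = holm_adjust([pvalues[k] for k in labels])
for label, p_adj in zip(labels, adjusted):
    print(label, round(p_adj, 4))
```

With these inputs the c vs d value is multiplied by 3 (about 0.021), so it stays below alpha=0.05, while the other two comparisons remain clearly non-significant.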

I understand this is an old post, but since it popped up as "active", I think it's worth commenting on. There is some heteroscedasticity in these data. Because of this, a traditional ANOVA is probably not the best approach. If one uses an ANOVA with heteroscedasticity-adjusted results or Welch's ANOVA, the resulting p-value is about 0.02. If an appropriate post-hoc test is used (Games-Howell, or one based on estimated marginal means), it is revealed that group A is different from group B. – Sal Mangiafico, Dec 30 '19 at 17:23
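
To make the comment above concrete, here is a minimal sketch of how one might check for unequal variances and run Welch's ANOVA in Python. `scipy.stats.levene` and `scipy.stats.f_oneway` are existing SciPy functions; `welch_anova` is a hand-written helper for this illustration (the standard Welch formula, not a SciPy API), and the three groups are synthetic stand-ins, not the data from the question. Substituting `a, b, c, d` from the question should work the same way.

```python
import numpy as np
from scipy import stats


def welch_anova(*groups):
    """Welch's one-way ANOVA for unequal variances; returns (F, df1, df2, p)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])

    w = n / variances                        # precision weights n_i / s_i^2
    w_sum = w.sum()
    grand_mean = np.sum(w * means) / w_sum   # variance-weighted grand mean
    lam = np.sum((1.0 - w / w_sum) ** 2 / (n - 1.0))

    numerator = np.sum(w * (means - grand_mean) ** 2) / (k - 1)
    denominator = 1.0 + 2.0 * (k - 2) / (k ** 2 - 1) * lam
    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3.0 * lam)
    p_value = stats.f.sf(f_stat, df1, df2)   # upper tail of the F distribution
    return f_stat, df1, df2, p_value


# Stand-in data (NOT the data from the question): equal means, unequal spreads.
rng = np.random.default_rng(0)
g1 = rng.normal(0.0, 1.0, size=30)
g2 = rng.normal(0.0, 1.0, size=30)
g3 = rng.normal(0.0, 3.0, size=30)

# Levene's test (median-centred by default) checks for unequal variances.
print("Levene:", stats.levene(g1, g2, g3))
print("classic ANOVA:", stats.f_oneway(g1, g2, g3))
print("Welch ANOVA (F, df1, df2, p):", welch_anova(g1, g2, g3))
```

For two groups this helper reduces to the square of Welch's t statistic, which gives a quick sanity check against `scipy.stats.ttest_ind(..., equal_var=False)`.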

score 0 · Answer 1 · answered Mar 16 '17 at 23:44

I resorted to a Friedman test (instead of ANOVA), which gives a result similar to that of the pair-wise Mann-Whitney U tests (i.e. d does not come from the same distribution as the other 3). Perhaps non-parametric tests would be preferred in this case...

orange

The Friedman test is used only in cases of complete block design. This is not the same design as would be used in the case of pairwise Mann-Whitney tests (which don't take into account any kind of blocking). – Sal Mangiafico Dec 30 '19 at 17:06
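
The distinction in the comment above can be illustrated with a small sketch. It uses synthetic stand-in data (not the data from the question) and compares `scipy.stats.friedmanchisquare`, which ranks within rows and so treats the i-th value of every group as one block, with `scipy.stats.kruskal`, a rank-based test for independent groups. Kruskal-Wallis is brought in here only as a contrast; the thread itself does not propose it.

```python
import numpy as np
from scipy import stats

# Stand-in data (NOT the data from the question): four groups of 30 values.
rng = np.random.default_rng(1)
groups = [rng.normal(loc, 1.0, size=30) for loc in (0.0, 0.0, 0.0, 0.5)]

# Same values, but the order inside the last group is shuffled.
shuffled = groups[:3] + [rng.permutation(groups[3])]

# Kruskal-Wallis pools all values and ranks them, so the order of values
# within a group does not matter.
kw_original = stats.kruskal(*groups)
kw_shuffled = stats.kruskal(*shuffled)

# Friedman ranks within each row (block), so re-ordering one group changes
# which values are compared with each other and generally changes the result.
fr_original = stats.friedmanchisquare(*groups)
fr_shuffled = stats.friedmanchisquare(*shuffled)

print("Kruskal-Wallis original:", kw_original)
print("Kruskal-Wallis shuffled:", kw_shuffled)
print("Friedman original:      ", fr_original)
print("Friedman shuffled:      ", fr_shuffled)
```

If the row order of unblocked data is arbitrary, a test whose result depends on that order is answering a question about blocks that do not exist.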
