The Mann-Whitney test requires homogeneity of variance if a significant result is supposed to be interpreted as a difference in medians.

If homogeneity of variance is not met but the test is significant, which aspects of the test can I report?

Ferdi
  • 5,179
  • 1
    See also http://stats.stackexchange.com/q/113334/3277 - a very similar question with interesting answers and comments. – ttnphns Aug 27 '14 at 13:40

1 Answer

You can interpret the $U$ (rank sum) test as a test for stochastic dominance, assuming (per Scortchi's comment) that the population CDFs do not cross. In that case, the null hypothesis is not H$_{0}\text{: }\tilde{\mu}_{A} = \tilde{\mu}_{B}$ (i.e. equal medians), but H$_{0}\text{: P}\left(X_{i} > X_{j}\right)=0.5$ for all $i \ne j$ with $i,j \in \{1,\dots,k\}$ for $k$ groups (i.e. there is stochastic equality among all groups; $k=2$ for the Mann-Whitney test), and H$_{\text{A}}\text{: P}\left(X_{i}>X_{j}\right) > 0.5$ for at least one $i \ne j$.

Failing to reject the null in such a case means you found no evidence of stochastic dominance. Rejecting the null in such a case means you did.
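
To make the quantity in these hypotheses concrete, here is a minimal Python sketch of how the sample estimate of $\text{P}(X_i > X_j)$ relates to $U$: count the pairs in which one group's value exceeds the other's and divide by the number of pairs, $mn$. The function name `prob_superiority`, the half-credit convention for ties, and the data are my own assumptions for illustration, not something from the question. The sketch does not compute a $P$-value.

```python
from fractions import Fraction


def prob_superiority(x, y):
    """Return (U, U / (m * n)) for samples x and y.

    U counts the pairs (xi, yj) with xi > yj; a tied pair counts as 1/2
    (an assumed convention). U / (m * n) is then the sample estimate of
    P(X > Y) + 0.5 * P(X = Y).
    """
    m, n = len(x), len(y)
    u = Fraction(0)
    for xi in x:          # double loop over all m * n pairs
        for yj in y:
            if xi > yj:
                u += 1
            elif xi == yj:
                u += Fraction(1, 2)
    return u, u / (m * n)


# Made-up data, purely for illustration.
group_x = [1.1, 2.3, 2.9, 3.4, 4.0]
group_y = [2.5, 3.8, 4.4, 5.1]

u, p_hat = prob_superiority(group_y, group_x)
print(f"U = {float(u)}, estimated P(Y > X) = {float(p_hat):.2f}")
# prints: U = 16.0, estimated P(Y > X) = 0.80
```

A value near 0.5 is what the null hypothesis above describes; values far from 0.5 point toward the alternative.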

Alexis
  • 29,850
  • 3
    It is important to not compute the $P$-value in a way that assumes the two distributions are identical under the null hypothesis, in order to obtain a powerful test for stochastic dominance. One way to do this is to use the general $U$-statistic standard error as computed in the R Hmisc package's rcorr.cens function. – Frank Harrell Aug 07 '14 at 15:54
  • Thank you both very much for your answers, I really appreciate your knowledge! @Alexis: What would I report if I found evidence for stochastic dominance? (E.g., is it valid to report the U-statistic?) @Frank: Unfortunately, I am using SPSS and am unfamiliar with R. Could you provide the formula that is used to calculate this "standard error"? What exactly is it? –  Aug 07 '14 at 16:19
  • 2
    NB Stochastic dominance of group $i$ over group $j$ means $F_{X_i}(x)<F_{X_j}(x)$ for all $x$ (where the $F(\cdot)$s are the cumulative distribution functions), which is not implied by $\Pr(X_i>X_j)>0.5$ without the additional assumption that the cdfs don't cross. See non-transitive dice. – Scortchi - Reinstate Monica Aug 07 '14 at 16:49
  • The formula is simple but requires a double loop not readily implemented in SPSS. – Frank Harrell Aug 07 '14 at 18:00
  • Thank you, all. This helps a lot! Unfortunately, I am still puzzled about what I can report.

    The most important open part of the question: Can I report a U-statistic and a p-value, saying that e.g. values from group Y are statistically significantly higher than those from group X?

    P.S. @Frank: Can the values be calculated manually? If so, I would highly appreciate it if you could post the general formula.

    –  Aug 07 '14 at 18:38
  • @user3669454 Following my answer above, you can report: If not rejecting H$_{0}$, "found no evidence that any group stochastically dominates any other group (under the assumption that the population CDFs of each group do not cross)"; or, if rejecting H$_{0}$, "found evidence that at least one group stochastically dominates at least one other group (under the assumption that the population CDFs of each group do not cross)." You cannot report a difference in group means, or a difference in group medians, without additional (and much more stringent) assumptions. – Alexis Aug 07 '14 at 19:21
  • @Alexis, thank you very much for this explanation. Only two questions remain: 1) Does the assumption regarding the CDFs equate to the statement that no ties can be present? 2) Is there any quantitative statement I can make if H0 is rejected, e.g. as suggested by this book source: deviation from H0 = U / (m*n) [on p. 268 of http://books.google.de/books?id=dPhtioXwI9cC&pg=PA265&dq=Median+Test&hl=de&sa=X&ei=L-fjU-m9F4aG4gStvIDAAg&ved=0CFkQ6AEwBw#v=onepage&q=whitney&f=false] –  Aug 08 '14 at 08:00
  • @user3669454 (1) I don't think so, since that assumption is about population CDFs, but we are getting out of my comfort range there; (2) If you look at the null and alternative hypotheses I provided, you will see that they are stated in quantitative terms (i.e. probabilities of events). If you are looking for some way of saying "mean" or "median", then not without stringent additional assumptions, no. – Alexis Aug 08 '14 at 15:33
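
Scortchi's pointer to non-transitive dice can be checked numerically. The Python sketch below uses one commonly cited set of three dice (the face values are my assumption; they do not come from the thread) and shows that $\Pr(X_i > X_j) > 0.5$ can hold around a cycle, which would be impossible if each die stochastically dominated the next. It also shows the CDFs of the first two dice crossing. The helper names `p_greater` and `cdf` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def p_greater(a, b):
    """Exact P(roll of a > roll of b) for two fair dice given as face lists."""
    wins = sum(1 for x, y in product(a, b) if x > y)
    return Fraction(wins, len(a) * len(b))


def cdf(faces, t):
    """Exact P(roll <= t) for a fair die given as a face list."""
    return Fraction(sum(1 for f in faces if f <= t), len(faces))


# An assumed, commonly cited set of non-transitive dice.
A = [2, 2, 4, 4, 9, 9]
B = [1, 1, 6, 6, 8, 8]
C = [3, 3, 5, 5, 7, 7]

# Each die beats the next one in the cycle with probability 5/9.
print("P(A > B) =", p_greater(A, B))
print("P(B > C) =", p_greater(B, C))
print("P(C > A) =", p_greater(C, A))

# The CDFs of A and B cross: F_A is below F_B at t = 1 but above it at t = 4,
# so neither die stochastically dominates the other.
for t in (1, 4):
    print(f"t = {t}: F_A = {cdf(A, t)}, F_B = {cdf(B, t)}")
```

This is why the "CDFs do not cross" assumption has to be stated alongside any claim of stochastic dominance based on the test.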