How do I determine if differences between medians are statistically significant when notches are very close (see "C" and "D")?

Question

Regarding "C" and "D":

Is there a way other than visually inspecting the notches from the boxplots? I know the range of values for "D" is lower than "C" and the median value appears to be lower for "D". I also think that box "D" has less variability. I'm not sure what else is worth mentioning when I compare the two boxplots.

BruceET · Accepted Answer · 2022-03-18T07:39:03.607

There are several issues here.

First, I have always regarded notches in boxplots as only a rough indication of whether medians differ. Notches are calibrated for comparing two groups at a time at roughly the 5% level: if the notches of two boxes do not overlap, that suggests the two medians differ.
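
If reading the notches by eye is hard, their endpoints can also be printed as numbers. The sketch below assumes the simulated data described in the Note at the end of this answer; `notch1`, `notch2` and `notches_overlap` are names made up for this illustration. It relies on `boxplot.stats()`, whose `conf` component holds the lower and upper ends of the notch.

```r
# Simulated data, same recipe as in the Note at the end of the answer
set.seed(2022)
x1 <- rchisq(500, 10)
x2 <- rchisq(500, 11)

# boxplot.stats() returns the notch limits in its `conf` component:
# median -/+ 1.58 * (distance between hinges) / sqrt(n)
notch1 <- boxplot.stats(x1)$conf
notch2 <- boxplot.stats(x2)$conf

# Two intervals overlap when each one starts before the other ends
notches_overlap <- notch1[1] <= notch2[2] && notch2[1] <= notch1[2]

print(rbind(x1 = notch1, x2 = notch2))
print(notches_overlap)
```

This only puts numbers on what the plot shows; it is still the same rough comparison, not a formal test.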

Using boxplot notches as a graphical device, it is especially important to show the boxplots in a way that makes the notches easy to see. In this case, I think it would be helpful to plot the boxplots horizontally, rather than vertically. Also, the notches may be easier to see if the interiors of boxes have colors that contrast with the outlines.

Thus, I believe the right-hand panel below shows the relative locations of the notches better than the left-hand panel does. [There are 500 observations in each plot.]

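par(mfrow=c(1,2))
 # left panel: default vertical boxplots with grey fill
 boxplot(x1, x2, col="grey", notch=TRUE)
 # right panel: horizontal boxplots, contrasting fill, solid outlier points
 boxplot(x1, x2, notch=TRUE, horizontal=TRUE, col="skyblue2", pch=20)
par(mfrow=c(1,1))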

Second, I think it is best to do a formal test of whether the locations of the two samples differ. If the population distributions are unknown, one can use a nonparametric test. Results for the Wilcoxon rank sum test are shown below. The P-value, about 0.0002, leaves no doubt about significance at the 5% level.

wilcox.test(x1,x2)

    Wilcoxon rank sum test with continuity correction

data:  x1 and x2
W = 108000, p-value = 0.0001981
alternative hypothesis: true location shift is not equal to 0
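
The Wilcoxon test addresses a shift in location rather than the medians as such. If an interval estimate aimed directly at the difference in medians is wanted, one possibility is a percentile bootstrap. This is only a sketch and not part of the analysis above: the number of resamples `B` is an arbitrary choice, and `boot_diff`, `obs_diff` and `ci` are names invented for the illustration.

```r
# Simulated data, same recipe as in the Note at the end of the answer
set.seed(2022)
x1 <- rchisq(500, 10)
x2 <- rchisq(500, 11)

B <- 5000  # number of bootstrap resamples (arbitrary choice)

# Resample each group with replacement and record the difference in medians
boot_diff <- replicate(B, {
  median(sample(x1, replace = TRUE)) - median(sample(x2, replace = TRUE))
})

obs_diff <- median(x1) - median(x2)          # observed difference in medians
ci <- quantile(boot_diff, c(0.025, 0.975))   # 95% percentile interval

print(obs_diff)
print(ci)
```

An interval that excludes 0 points the same way as a significant test result. The percentile method is the simplest bootstrap interval and can be somewhat rough for medians.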

Note: The following R code was used to simulate my fictitious data. Because populations are chi-squared, a parametric test might be used instead.

set.seed(2022)
x1 = rchisq(500, 10)
x2 = rchisq(500, 11)
