Much like the two-sample t-test (equal-variance, not Welch) is a special case of ANOVA, and ANOVA is a special case of linear regression, the Wilcoxon-Mann-Whitney U test is a special case of the Kruskal-Wallis test, and Kruskal-Wallis is a special case of proportional odds logistic regression. Let's check the claim about Wilcoxon, Kruskal-Wallis, and the proportional odds model by simulation.
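As a warm-up, the first chain of equivalences is easy to verify numerically: for two groups, the equal-variance t-test, the one-way ANOVA F-test, and the t-test on the slope of a simple linear regression all return the same p-value (the F statistic is the squared t statistic). A quick sketch, with variable names of my own choosing:

```r
set.seed(1)
g <- rep(0:1, each = 20)  # two groups of 20
z <- rnorm(40, mean = g)  # second group shifted up by 1

p_t   <- t.test(z ~ g, var.equal = TRUE)$p.value  # two-sample equal-variance t-test
p_aov <- anova(lm(z ~ g))[1, "Pr(>F)"]            # one-way ANOVA F-test
p_lm  <- summary(lm(z ~ g))$coefficients[2, 4]    # regression slope t-test

c(p_t, p_aov, p_lm)  # all three agree to machine precision
```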
library(rms)
library(MASS)

set.seed(2021)
N <- 25   # sample size per group
B <- 100  # number of simulation replicates
p1 <- p2 <- p3 <- p4 <- rep(NA, B)
for (i in 1:B) {
  a <- rnorm(N)
  b <- rnorm(N, 1)
  y <- c(a, b)
  x <- c(rep(0, length(a)), rep(1, length(b)))
  L <- rms::orm(y ~ x)  # proportional odds model from Frank Harrell's rms package
  p1[i] <- wilcox.test(a, b)$p.value
  p2[i] <- L$stats[7]  # first of two p-values in the output: likelihood-ratio test
  p3[i] <- L$stats[9]  # second of two p-values in the output: score test
  p4[i] <- kruskal.test(y, x)$p.value
}
d <- data.frame(Wilcoxon = p1,
                ORM_1 = p2,
                ORM_2 = p3,
                KW = p4)
plot(d)
cor(d)
          Wilcoxon     ORM_1     ORM_2        KW
Wilcoxon 1.0000000 0.9996366 0.9997652 0.9999543
ORM_1    0.9996366 1.0000000 0.9999797 0.9994056
ORM_2    0.9997652 0.9999797 1.0000000 0.9995917
KW       0.9999543 0.9994056 0.9995917 1.0000000
Au contraire!
While the differences in p-values are small, they are not small enough for me to attribute them to floating-point arithmetic.
What is going on? For instance, is the equivalence only asymptotic?
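One way to probe that question, as a sketch rather than a full answer: with 25 tie-free observations per group, wilcox.test defaults to an exact p-value, whereas kruskal.test always uses a chi-square approximation. For two groups without ties, the Kruskal-Wallis statistic is the square of the Wilcoxon z statistic under the normal approximation without continuity correction, so switching the Wilcoxon test to that approximation should remove at least one source of disagreement:

```r
set.seed(2021)
a <- rnorm(25)
b <- rnorm(25, 1)

p_exact  <- wilcox.test(a, b)$p.value  # exact by default here (small samples, no ties)
p_approx <- wilcox.test(a, b, exact = FALSE, correct = FALSE)$p.value  # normal approx.
p_kw     <- kruskal.test(list(a, b))$p.value  # chi-square approximation

c(p_exact, p_approx, p_kw)  # p_approx matches p_kw; p_exact differs slightly
```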