
I found some statements throughout the web that suggest that most common statistical tests can be performed as general(ized) linear models (cf. here). For a Wilcoxon test the author of the referenced source suggests, as an example, first transforming the original data into signed ranks and then running a linear model. In my opinion this procedure won't give reliable p values because the residuals are then of course not normal but uniform. However, I wonder whether a generalized linear model could help here (e.g. after transforming the ranks to a range of [0,1] and then applying a GLM with a beta link)? If so, which GLM could be applied here? And any idea how this analysis could be performed using R? (The Wilcoxon test was just an example - what would this look like for a Spearman correlation analysis?)

Anti
  • You can learn more about how regression models generalize statistical tests, including the Wilcoxon test, in the Biostatistics for Biomedical Research course notes, available online. See Ch. 7 for the Wilcoxon test in particular. – dipetkov Apr 28 '22 at 05:55
  • @dipetkov Thanks, I think I already understand the idea behind it. But I don't get why a linear model (usually assuming normality of residuals) can be used to derive a p value on non-normal ranks. – Anti Apr 28 '22 at 06:01
  • Even though you can use the "replace with ranks then run ordinary models" trick when there are two or more groups and get a good answer, the idea doesn't generalize to allow for adjustment for other variables. The proportional odds and other ordinal models come to the rescue. – Frank Harrell Apr 28 '22 at 11:29
  • Just came across two previous CV posts about this same approximation to the non-parametric rank tests and how good the approximation is: here and here. – dipetkov Apr 28 '22 at 12:18

2 Answers


You might be interested in the proportional odds ordinal logistic regression (POOLR) model.

Wilcoxon-Mann-Whitney U is a score test of a POOLR model with one variable indicating group membership, similar to how an equal-variance t-test is a Wald test of a linear regression with one variable indicating group membership.

This is not the same as applying a regression model to the rank-transformed data, but that seems like a hack to use until one learns about the POOLR model.
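As a minimal sketch of this correspondence (my own illustration, not from the answer: it uses MASS::polr, which ships with R, and a likelihood ratio test rather than the score test, so the p-values agree only asymptotically):

```r
# Hypothetical illustration: two-sample Wilcoxon rank-sum test vs. a
# proportional odds model with a single group-membership variable.
library(MASS)

set.seed(1)
g <- factor(rep(c("a", "b"), each = 40))
y <- sample(1:5, 80, replace = TRUE) + (g == "b")  # ordinal outcome, shifted up in group b

wilcox.test(y ~ g)$p.value                   # rank-sum test

fit  <- polr(factor(y, ordered = TRUE) ~ g)  # proportional odds model
null <- polr(factor(y, ordered = TRUE) ~ 1)  # intercepts-only model
anova(null, fit)                             # LR test of the group effect
```

The two p-values should be in close agreement for moderate samples; `rms::orm` reports a score test directly, which matches the Wilcoxon statistic even more closely.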

Dave

The author writes the following:

# Built-in
a = wilcox.test(y)

# Equivalent linear model
b = lm(signed_rank(y) ~ 1) # See? Same model as above, just on signed ranks

But that is only the same model because the function wilcox.test is approximating the distribution of the mean of the signed ranks by a Normal distribution.

What happens is that the linear model estimates the standard error from the residuals, computes a t-value, and assumes that this t-value approximately follows a t-distribution.

It uses the following formula:

$$t = \frac{\bar{X}}{\hat{\sigma} / \sqrt{n}}$$

In code you could use:

z = signed_rank(y)              # compute signed ranks
n = length(z)
t = mean(z) / (sd(z) / sqrt(n)) # t-value
2 * pt(-abs(t), df = n - 1)     # two-sided p-value

The exact null distribution of the Wilcoxon signed-rank statistic is not a simple parametric distribution and is harder to compute. If we compute the exact p value, we get a (slightly) different result. The code below demonstrates this.

# data
set.seed(1)
signed_rank = function(x) sign(x) * rank(abs(x))
y = c(rnorm(15), exp(rnorm(15)), runif(20, min = -3, max = 0))  # Almost zero mean, not normal

# Built-in
wilcox.test(y)               # p-value = 0.7282
wilcox.test(y, exact = TRUE) # p-value = 0.7304
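To close the loop (my own sketch, regenerating the same data): the intercept-only linear model on the signed ranks reproduces the normal-approximation p-value, not the exact one.

```r
# Sketch: t-test on the signed ranks vs. the Wilcoxon normal approximation.
set.seed(1)
signed_rank = function(x) sign(x) * rank(abs(x))
y = c(rnorm(15), exp(rnorm(15)), runif(20, min = -3, max = 0))

z = signed_rank(y)
coef(summary(lm(z ~ 1)))[1, 4]                          # p-value of the intercept t-test
wilcox.test(y, exact = FALSE, correct = FALSE)$p.value  # normal approximation, no continuity correction
```

The two values differ only through the variance estimate (data-based vs. fixed) and the t vs. normal reference distribution, so they are nearly identical here.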

In the article that you refer to, several tests such as the ANOVA and the t-test can indeed be interpreted as linear models.

However, the non-parametric tests are only indirectly related to linear models. The relation arises because the distribution of the test statistic is approximately normally distributed under the null hypothesis (when the sample size is large).


Note: I tried looking at smaller sample sizes and found that wilcox.test actually estimates the standard deviation differently. Internally it uses

        NTIES <- table(r)
        z <- STATISTIC - n * (n + 1)/4
        SIGMA <- sqrt(n * (n + 1) * (2 * n + 1) / 24
                      - sum(NTIES^3 - NTIES) / 48)

and it bases the estimate of the standard deviation on the sample size (apart from the correction for ties), not on the observed values themselves. For large samples, however, the two estimates approach each other. That we get (nearly) the same p-values is simply because all these approximations approach the same normal distribution.
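This convergence can be checked numerically (my own sketch; the data has no ties, so the NTIES correction vanishes). Since sum(z)/2 equals STATISTIC - n(n+1)/4 and 2*SIGMA equals sqrt(n(n+1)(2n+1)/6), the second statistic below is exactly wilcox.test's standardized statistic without continuity correction:

```r
# Sketch: the data-based and the sample-size-based standard errors give
# nearly identical standardized statistics for a large sample without ties.
set.seed(1)
signed_rank = function(x) sign(x) * rank(abs(x))
y = rnorm(200)
z = signed_rank(y)
n = length(z)

t_stat = mean(z) / (sd(z) / sqrt(n))                   # sd estimated from the signed ranks
z_stat = sum(z) / sqrt(n * (n + 1) * (2 * n + 1) / 6)  # fixed sd implied by wilcox.test
c(t_stat, z_stat)
```

Without ties, sum(z^2) is exactly n(n+1)(2n+1)/6, so the two statistics differ only by roughly a factor sqrt((n-1)/n).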

  • Not directly related to the question but I can't figure it out... The proportional odds model generalizes the two-sample rank sum test. What about the one-sample signed rank test? Does it have a corresponding model "generalization"? – dipetkov Apr 28 '22 at 13:31
  • @dipetkov all these names of tests always confuse me; are you speaking about the sum of ranks (or sum of signed ranks) or the sum of signs? The sum of signs can be seen as a transformed binomially distributed variable (if there are no ties, or only ties with almost zero probability). – Sextus Empiricus Apr 28 '22 at 13:34
  • I'll try to get the terminology right. There are two Wilcoxon tests: a) the one-sample signed rank test; b) the two-sample rank sum test. (I use the names that wilcox.test in R uses.) I can get (about) the same p-value with the two-sample wilcox.test and rms::orm. But I can't figure out something similar for the one-sample wilcox.test. – dipetkov Apr 28 '22 at 13:41
  • @dipetkov I believe now that you confused me with the 'Proportional Odds' model. I actually did not know what you meant with that and thought you were speaking about the Mann-Whitney-Wilcoxon U-test which has a statistic that relates to a binomial distributed variable (that was my association with 'proportional odds'). – Sextus Empiricus Apr 28 '22 at 13:54
  • Sorry for confusing you. I asked because I don't understand it fully myself. The same test has about 5 names according to Wikipedia: U Test. This means there are at least 5 more names for it, probably. So I'll stop here. – dipetkov Apr 28 '22 at 14:01
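For completeness, the sum-of-signs statistic mentioned in the comments above corresponds to the sign test, which is exactly binomial (my own sketch, assuming no observations equal to zero):

```r
# Sketch: the sign test as a binomial test on the number of positive signs.
# Under H0 (median zero), the count of positive values is Binomial(n, 1/2).
set.seed(1)
y = rnorm(25)
binom.test(sum(y > 0), length(y), p = 0.5)$p.value  # exact sign test
```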