3

After using the Shapiro-Wilk test for Bivariate Normality, I found different values of p. For the ones where p < 0.05 I used Spearman, and for the ones where p > 0.05 I used Pearson. Is it the other way around? Just tell me right or wrong.

Stephan Kolassa
Rach
  • 4
    Welcome to Cross Validated! What are you trying to do with your data? For instance, why does a Shapiro-Wilk test come up? Why are you calculating correlations at all? Why calculate two different correlation measures in Spearman and Pearson? – Dave Sep 06 '23 at 19:12
  • 1
    "Pearson or Spearman?" Any of both, one, the other, or neither. It depends on what you're trying to do. See Dave's comment. – Galen Sep 06 '23 at 19:22
  • 8
    +1, because this is a confusion I see regularly in psychology theses. To be honest, I am baffled by why people believe normality has anything to do with the choice between Pearson's or Spearman's correlation coefficient, and this particular confusion would profit enormously from a canonical answer we could point people to (both from CV questions, and for confused students who had never heard about CV before). – Stephan Kolassa Sep 06 '23 at 19:26
  • 2
    @Stephan There are plenty of sites offering misinformation. The first up on Google--and therefore among the primary culprits--include https://www.statology.org/pearson-correlation-assumptions/, https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/correlation-pearson-kendall-spearman/, https://www.statstest.com/pearson-correlation/, https://towardsdatascience.com/pearson-coefficient-of-correlation-explained-369991d93404, and (worst of all because they should know better) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3576830/. – whuber Sep 06 '23 at 19:37
  • 3
    @whuber That is disturbing. I wish we could make the issues known beyond this comments section. Ahem (for anyone who has time). – Galen Sep 06 '23 at 19:40
  • 7
    @Galen: my psych prof wife has been pestering me to write The One Article Refuting All The Statistical Errors In Psychology Theses, so she can direct her students there before they start testing everything for normality. I should spend less time here and more writing that paper. – Stephan Kolassa Sep 06 '23 at 19:44
  • 2
    @StephanKolassa It's not just psychologists! Although they may be the worst, the malady is widespread. Bad statistics, bad graphics, elementary errors..... Peter Rohlof and I recently wrote a paper for BMJ about categorizing variables: https://bmjpaedsopen.bmj.com/content/7/1/e001908

    I sometimes browse through Science or Nature and almost every issue has bad graphs.

    – Peter Flom Sep 06 '23 at 21:07
  • 1
    @Galen The AHEM paper you listed is ..... well, an article about how to use correlation should be correct, and this one isn't. In particular, it makes the normality error (which is, I guess, why you labeled it "ahem"). At least it's in a journal that almost no one reads. – Peter Flom Sep 06 '23 at 21:17
  • @PeterFlom You got it. – Galen Sep 06 '23 at 21:19

3 Answers

15

Neither correlation coefficient presupposes normality. Marginal or bivariate normality is completely irrelevant to the choice between them.

They do differ in the questions they ask of the data. Pearson's correlation coefficient assesses a linear relationship, and is closely related to simple linear regression. Spearman's correlation coefficient works on ranks and therefore does not assess a linear relationship.

For an illustration, generate some bivariate data and calculate your correlations. Then take the top datapoint and move it up. The Pearson correlation will change; Spearman's won't, since the ranks that Spearman's correlation depends on do not change. Similarly, move the rightmost datapoint out to the right, the bottom one down, or the leftmost one to the left.
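
Here is a minimal sketch of that experiment (in Python with NumPy/SciPy; the choice of tools is mine, not part of the answer). Moving the highest point further up leaves every rank unchanged, so Spearman's coefficient stays put while Pearson's moves:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = x + rng.normal(size=50)               # some correlated bivariate data

print(stats.pearsonr(x, y)[0], stats.spearmanr(x, y)[0])

# Move the point with the largest y much further up. Its rank does not change,
# so Spearman's coefficient is unaffected, while Pearson's changes.
y[np.argmax(y)] += 100

print(stats.pearsonr(x, y)[0], stats.spearmanr(x, y)[0])
```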

Stephan Kolassa
  • 4
    "Neither correlation coefficient presupposes normality." Exactly, they are just functions of random variables. (+1) – Galen Sep 06 '23 at 19:28
  • "Spearman's correlation coefficient works on ranks and therefore does not assess a linear relationship." Except in extremely special cases where the ranks are invariant on the data. This can occur with ratings such as Likert-like scales. – Galen Sep 06 '23 at 19:30
  • 3
    Spearman's correlation quantifies the linearity of the relationship between the ranks, as it is just Pearson's correlation computed on the ranks. Since ranking is order-preserving, what we're effectively learning about is the tendency toward monotonicity in the original variables. Linearity on the ranks implies monotonicity on the original random variables. – Galen Sep 06 '23 at 19:33
  • Thank you very much for your answer. The thing is, this is my first time carrying out research. My mentor told me that's what we should do: after a normality check, Pearson or Spearman based on the p value...

    My mentor said: p<0.05 - Pearson, p>0.05 Spearman

    – Rach Sep 06 '23 at 19:34
  • 2
    @Rach Your mentor is suggesting a poor statistical practice. I think you should dive into learning how these two statistics really work so you can decide for yourself. – Galen Sep 06 '23 at 19:37
  • 4
    "Spearman's correlation coefficient works on ranks and therefore does not assess a linear relationship." I think you could be a smidge more illuminating: Spearman's $_{r}$ measures monotonic association between two paired raw variables, and measures the linear association between the ranks of two paired raw variables. – Alexis Sep 06 '23 at 19:42
  • Thank you very much! – Rach Sep 06 '23 at 19:42
  • 2
    +1. Also @Rach if you feel his answer sufficiently provides a solution to your question, feel free to hit the checkmark next to his answer to indicate that the question is resolved. – Shawn Hemelstrand Sep 07 '23 at 00:01
  • Standard distribution theory regarding the Pearson correlation, which is required when running tests or constructing confidence intervals, does assume normality. I'm not saying this means it should be tested using S-W (we have a recently accepted paper on model assumption testing in general, https://arxiv.org/abs/1908.02218 ); however, if you want to do more than just computing the correlation, normality is relevant. – Christian Hennig Sep 07 '23 at 17:37
  • @ChristianHennig: interesting. Do you have details? Are you referring to normality of the raw data, or (asymptotic) normality of the correlation coefficient under the null hypothesis? For instance, if I repeatedly run a standard test for Pearson correlation between $n=20$ bivariate uncorrelated uniforms (or many other non-normal raw data distributions), the p-values are utterly uniformly distributed, as they should be. – Stephan Kolassa Sep 08 '23 at 07:17
  • @StephanKolassa That's fair enough, and I agree with you that inference derived under normality assumption may work well also in other situations. But then it may well not (try the same with Cauchy distributions). My point is just that there is theory for the normal case, whereas for the non-normal case most theory is at best asymptotic, and inference may or may not work well. This is just not appropriately summarised by saying that "normality is completely irrelevant". – Christian Hennig Sep 08 '23 at 09:49
5

Just tell me right or wrong.

It's not as simple as that. You had it the "right" way around, but the premises on which it is based are not really right.

There are multiple errors here, of various kinds. I haven't managed to address everything, but it's a start.

The thing that should determine what kind of correlation you use is what you are trying to find out or measure. If you want to measure a linear relationship, use a linear correlation. If you want to measure the tendency of two variables to increase together (monotonic association), use a measure of that instead.

Then figure out how to test what you need to measure. (We're able to help with that!)

To use a Pearson correlation - a kind of linear correlation - you do not need anything to be normal.

The only time distributional assumptions are really needed is when they are used in deriving the properties needed for inference (such as testing the correlation or constructing an interval for it); and even then you can replace those assumptions with different ones, or avoid explicit distributional assumptions altogether.
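
As one concrete option (a sketch in Python/NumPy; the implementation and function name are mine, not from the answer), a permutation test of the Pearson correlation needs no explicit distributional assumption: under the null of no association the pairing of $x$ with $y$ is arbitrary, so you can shuffle one of the variables:

```python
import numpy as np

def pearson_permutation_pvalue(x, y, n_perm=10_000, seed=None):
    """Two-sided permutation p-value for the Pearson correlation (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = np.corrcoef(x, y)[0, 1]
    exceed = 0
    for _ in range(n_perm):
        # Under the null of no association, re-pairing y with x at random
        # is as plausible as the observed pairing.
        r = np.corrcoef(x, rng.permutation(y))[0, 1]
        if abs(r) >= abs(observed):
            exceed += 1
    # The +1 correction keeps the estimated p-value away from exactly zero.
    return (exceed + 1) / (n_perm + 1)
```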

After using the Shapiro-Wilk test for Bivariate Normality,

This is something of a mistake in several senses; even if you make a normality assumption (which is not necessary), there's no need to assume bivariate normality at all. The usual ordinary regression assumptions - including linearity, independence of observations, and conditional normality of one of the two variables - are sufficient to establish the properties of the "usual" test where the null is $\rho=0$, whether or not you see one of the variables as a DV and the other as an IV. This has been explicitly stated in the literature at least since 1927 (i.e. over 95 years, though I don't have that specific reference to hand right now). Doubtless it was understood by many people before that, but we needn't take their word for it; if you can't see that it is correct, you can always simulate to see that the test still has the required properties (exactly, though with simulation we can only tell as closely as we have the patience to wait for).
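
If you want to check that claim by simulation (a rough sketch in Python/NumPy/SciPy; the tools are my assumption), make one variable badly non-normal and the other conditionally normal and unrelated to it, so that bivariate normality clearly fails while the regression-style assumptions hold; the usual test keeps its nominal level:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, n_sim, rejections = 20, 5000, 0
for _ in range(n_sim):
    x = rng.exponential(size=n)   # x is far from normal: no bivariate normality
    y = rng.normal(size=n)        # y given x is normal, with rho = 0
    rejections += stats.pearsonr(x, y)[1] < 0.05
print("rejection rate:", rejections / n_sim)   # should sit close to 0.05
```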

But that's just what needed to be assumed for the derivation in any case. The next issue -- testing that assumption -- is itself not particularly useful (and may be counterproductive in several senses [1]). The question is not whether you have bivariate or even conditional normality (you won't; a sufficiently large sample will eventually tell you so). The real question, echoing George Box, is "how wrong does it have to be to not be useful?" In this case, the issue is something like "how much non-normality, of what kind, is needed to have a serious adverse impact on the properties of our procedure?" - in short, what does it take to make the significance level (some people will also want to consider power) materially different from what we're expecting?

The significance level of the usual test is reasonably robust to non-normality of the conditional distribution, and with the significance level in particular, this robustness improves with increasing sample size. A test of normality - even a more suitable one than the one you used - will tell you more and more accurately (with greater and greater power) that the original distribution (whether we're talking about the bivariate distribution or the conditional distribution) is not normal ... while it actually matters less and less.
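
A rough simulation along those lines (again a Python/NumPy/SciPy sketch of my own): with clearly non-normal but independent variables, the rejection rate of the usual correlation test sits near the nominal 5% and gets closer as $n$ grows, while the Shapiro-Wilk test rejects normality more and more often:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sim = 2000
for n in (20, 100, 500):
    cor_rej = sw_rej = 0
    for _ in range(n_sim):
        x = rng.exponential(size=n)   # markedly non-normal marginal
        y = rng.exponential(size=n)   # independent of x, so the null is true
        cor_rej += stats.pearsonr(x, y)[1] < 0.05
        sw_rej += stats.shapiro(x)[1] < 0.05
    print(f"n={n}: correlation test rejects {cor_rej / n_sim:.3f}, "
          f"Shapiro-Wilk rejects {sw_rej / n_sim:.3f}")
```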

And again, even when it does matter, you don't have to assume conditional normality to use Pearson correlation.

You can make a different parametric assumption -- but you said nothing about your variables so no advice is possible there; in many cases a reasonable model is straightforward -- or you can avoid a parametric assumption and still use the Pearson correlation.

Of course if you wanted monotonic correlation in the first place, Pearson was the wrong thing to consider from the start.


[1]: for example, adopting the procedure "test the assumption; if you don't reject, do test A, otherwise do test B" itself impacts the properties of the overall procedure. The very property you seek to guarantee by this mechanism (the significance level of test A when you conduct it) is itself affected by this scheme.

Glen_b
  • The arguments against model assumption testing make sense, but this doesn't necessarily mean that not doing it is better. https://arxiv.org/abs/1908.02218 – Christian Hennig Sep 07 '23 at 17:40
  • I wouldn't ever suggest ignoring assumptions. Testing or not testing are not the only possibilities, and the plausibility of assumptions can very often be considered a priori. It's important to note that for correctness of significance levels, the situation that applies is under H0, which with equality nulls is usually strictly false, so looking at the data (where H1 may well be true, and likely is) to check the assumptions under H0 may be misleading. I may return and make some additional comments later. – Glen_b Sep 07 '23 at 18:21
3

I wouldn't base the choice on the $p$ values associated with each correlation. The $p$ value here doesn't indicate the magnitude of the effect, and more importantly, $p$ values really don't say anything about how the correlations actually map to the data either. You can have something like the plot below, which gives a statistically significant effect and a strong magnitude for both coefficients, but one correlation is clearly better than the other.

[Figure: scatterplot of a monotonically increasing but nonlinear relationship, for which the Spearman coefficient is higher than the Pearson coefficient.]

In this case, the Spearman $\rho$ coefficient is the stronger contender, because the relationship is monotonic rather than strictly linear. Stephan already mentioned this in his answer, but I felt a visual example like the one above makes clear what he was talking about.
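
To reproduce the flavour of that figure numerically (a sketch in Python/NumPy/SciPy, my choice of tools rather than anything from the answer), generate a monotonic but clearly nonlinear relationship and compare the two coefficients:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.uniform(0, 3, size=200)
y = np.exp(2 * x) + rng.normal(scale=1.0, size=200)   # monotonic, far from linear

print("Pearson :", stats.pearsonr(x, y)[0])    # assesses linearity; comes out lower
print("Spearman:", stats.spearmanr(x, y)[0])   # assesses monotonicity; comes out higher
```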

As for normality, I recently discussed my issues with normality tests in general, particularly the Shapiro-Wilk, which flags even slight departures from normality. I think it's much better to simply look at visualizations of the data such as histograms, density plots, or QQ plots. I've been in positions where I had an essentially linear QQ plot, but because of the sample size the data still got flagged as non-normal.
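
One quick way to see this effect (a sketch in Python/SciPy; the contamination setup is my own illustration, not from the answer): take a large sample that is only mildly contaminated relative to a normal distribution. The bulk of the normal QQ plot stays essentially straight, yet the Shapiro-Wilk p-value is typically tiny:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 5000
# 99% standard normal with 1% wider-tailed contamination: visually close to normal
x = np.where(rng.uniform(size=n) < 0.99,
             rng.normal(size=n),
             rng.normal(scale=3.0, size=n))

print("Shapiro-Wilk p-value:", stats.shapiro(x)[1])    # typically very small

# The normal QQ points are still nearly on a straight line:
theo_q, sample_q = stats.probplot(x, dist="norm", fit=False)
print("QQ correlation:", np.corrcoef(theo_q, sample_q)[0, 1])   # close to 1
```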

Image source: Wikipedia.

  • The image you have shared appears identical to this one, which has a CC BY-SA 3.0 license. You are free to use the image as you have, but you should credit this source. (An easy way to do this is just to link to the original image instead of pasting it to imgur.com.) – Galen Sep 07 '23 at 03:40
  • Ah good point. Linked in edit. – Shawn Hemelstrand Sep 07 '23 at 03:49
  • The arguments against model assumption testing make sense, but this doesn't necessarily mean that not doing it is better. arxiv.org/abs/1908.02218 In fact the issues with formal model assumption testing are known because one can do theory and simulations about it. For model checking using visualisation this cannot be done, so nobody can know whether this is in general better or worse. (It will depend on who does it in the first place.) – Christian Hennig Sep 07 '23 at 17:42
  • I guess case-by-case it would depend, but generally speaking I've found it to be more deceptive than helpful, which is why I think it warrants mention. – Shawn Hemelstrand Sep 07 '23 at 22:54
  • As I said in the chat too, assumptions shouldn't be blindly relied on. Assess the rationale behind them. Understand the context. An assumption might make sense in some scenarios, but that doesn't justify imposing it in all situations, even when it isn't necessary at all. – User1865345 Sep 08 '23 at 01:31