I have an lmer model, and I want to check the residuals for normality. My plots show this:

[Two diagnostic plots of the model residuals; images not preserved]

However, my Shapiro-Wilk test shows this:

    Shapiro-Wilk normality test

    data:  resid(lmerabsolute)
    W = 0.97875, p-value = 3.751e-06

What do I do? I would be so grateful for some advice!

Nick Cox
user275189
  • Shapiro-Wilk is sensitive to the number of observations. Too many, and it is easy to reject the null even when the data are "normal enough". Forget Shapiro-Wilk and make a decision based on your plots. From your residual plots, I would think something is up, but I would need to know more about the problem. – Demetri Pananos Mar 10 '20 at 19:05
  • @DemetriPananos what information would be useful to you? – user275189 Mar 10 '20 at 19:06
  • Well I would need to know what you're investigating and what the research question is. – Demetri Pananos Mar 10 '20 at 19:08
  • @DemetriPananos human ability to path integrate: subjects navigate through a virtual corridor to a pre-seen target location. The dependent variable is absolute error (m), the difference between the actual target location and the subject's stop location. The distance from the start of the corridor to the target was one of five different distances, and thus 'target distance' was a within-subjects independent variable. The gain (difference in velocity from trial to trial) differed between sessions (3 sessions with 50 trials each). Thus, gain and distance were independent within-subject variables. – user275189 Mar 10 '20 at 19:17
  • So what explains why the residuals look that way? They don't seem random; they all seem to lie on lines. I'd need to see the data as well. – Demetri Pananos Mar 10 '20 at 19:19
  • It looks to me like your plots and the Shapiro-Wilk test are saying the same thing - the residuals are not Gaussian. Why do you think otherwise? – jbowman Mar 10 '20 at 19:21
  • ok @jbowman, I will put up a new post now with the same results but using a repeated-measures aov for the same data. Those results are much better. – user275189 Mar 10 '20 at 19:30
  • The assertion in your title is false (they both indicate non-normality), but in any case a potential difference between them (on other data) isn't necessarily relevant, since their purposes are different. The test is not answering a question you should care about. – Glen_b Mar 11 '20 at 05:41
  • It looks like your dependent variable is discrete. – Roland Mar 11 '20 at 07:05

1 Answer

The plot and the Shapiro-Wilk test seem totally consistent with each other.

The test gives a tiny p-value, indicating that normality is basically out of the question.

The plot shows deviations from normality, especially at the top right. A normal distribution would give points around the diagonal line, while your points start drifting away from the diagonal line around $x=1$ and higher.
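For reference, here is a minimal sketch of how this kind of normal Q-Q plot is produced and read. It uses simulated right-skewed values as a hypothetical stand-in for `resid(lmerabsolute)`; the simulated data, the seed, and the name `fake_resid` are assumptions for illustration, not your data.

```r
set.seed(1)

# Hypothetical stand-in for resid(lmerabsolute): centred, right-skewed values
fake_resid <- rexp(500) - 1

# Sample quantiles plotted against theoretical normal quantiles
qq <- qqnorm(fake_resid, main = "Normal Q-Q plot (simulated residuals)")

# Reference line through the first and third quartiles
qqline(fake_resid)
```

With a skewed sample like this one, the points in the upper right pull away from the reference line instead of following it, which is the kind of departure to look for.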

Note, however, that formal normality tests tend to flag differences that visual inspection shows to be trivial. The test is doing what it is supposed to do by flagging a slight deviation from normality as a deviation from normality, and I give a demonstration here. However, most of the time when normality is desired, we only need to be "close enough" to normality for the downstream statistics to work as we want. Hypothesis tests for normality, particularly when sample sizes are large, are likely to catch deviations that have minimal impact on our work, even when the test is correct to notice the deviation.
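A small simulation can sketch the sample-size effect. It assumes a t distribution with 10 degrees of freedom as a stand-in for "nearly normal" data; the distribution, the seed, and the two sample sizes are illustrative choices, not taken from the question.

```r
set.seed(42)

# Same mildly heavy-tailed distribution, two sample sizes.
# shapiro.test() accepts between 3 and 5000 observations.
small <- rt(50, df = 10)
large <- rt(5000, df = 10)

p_small <- shapiro.test(small)$p.value
p_large <- shapiro.test(large)$p.value

# The shape of the distribution is identical; only n differs
print(c(n50 = p_small, n5000 = p_large))
```

The large sample will typically be rejected decisively, while the small one often is not, even though the underlying departure from normality is the same in both.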

Nick Cox
Dave