Mixed model residuals are not normal

Question

I am modeling data from an experiment with a mixed model. The outcome variable is a percentage. There are three fixed effects: Condition (diseased, healthy), Time point (1, 2, 3), and Drug (A, B, C, D, E). Subject is the random effect. I need to perform three tests:

Check for a significant difference between diseased and healthy at each combination of time point and drug (for example, time point 1 and drug A).
Check for a significant difference between time points 1 and 2 at each combination of drug and condition.
Check for a significant difference between drugs A and B at each combination of time point and condition.

The data are unbalanced.

This is what I did:
1. Build a linear mixed model

library(lme4)  # provides lmer()
fit_1 <- lmer(y ~ Condition * Drug * Timepoint + (1 | Subject))

2. Use lsmeans to perform the tests

library(lsmeans)  # provides lsmeans()
lsmeans(fit_1, pairwise ~ Condition | Drug * Timepoint, adjust = "none")
lsmeans(fit_1, pairwise ~ Timepoint | Drug * Condition, adjust = "none")
lsmeans(fit_1, pairwise ~ Drug | Condition * Timepoint, adjust = "none")

However, none of the p-values were below the nominal 0.05 alpha level. The same comparisons were significant when I used the Wilcoxon test. So I went back to check the residuals, and they appear to violate the assumptions of normality and homoscedasticity. Should I use a GLMM instead? If so, which family is appropriate when the $y$ variable is a percentage computed from count data?

It's hard to interpret the results of a Wilcoxon test applied to dependent data. The Wilcoxon test doesn't allow for adjustment the way the mixed model does. Even if the data were independent, the Wilcoxon test has a different interpretation than a bivariate least squares model. It's good to look at the distributions of the data; producing some xyplots can help visualize the experimental conditions better. Those assumptions are rarely met exactly, but the inference is usually still valid, as many posts on this site will tell you! – AdamO, Apr 09 '18 at 17:29

Isabella Ghement · Accepted Answer · 2018-04-06T23:46:59.640

You can convert your dependent variable from percentages to proportions and then check whether all the proportions lie in the open interval (0,1). If so, you can use a GLMM with a beta distribution, which can be fitted in R with the glmmTMB package. If some proportions are equal to 0 but the rest lie in (0,1), you can use a GLMM with a zero-inflated distribution. See "How to fit a mixed model with response variable between 0 and 1?" for more ideas. There is a way to get lsmeans to work with this type of model.
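As a rough sketch of the steps described above, the R code below converts a percentage outcome to proportions, checks which of the two cases applies, and, only if the glmmTMB and emmeans packages happen to be installed, fits a beta GLMM and runs one of the pairwise comparisons from the question. The data frame `dat`, the helper `choose_family`, and the object name `fit_2` are made up for illustration, and the data are simulated. emmeans is the successor to the lsmeans package; whether it accepts a glmmTMB fit depends on the installed versions, so treat the model-fitting part as an assumption to verify rather than a guaranteed recipe.

```r
# Simulated stand-in for the experiment; every name and value here is hypothetical.
set.seed(1)
dat <- expand.grid(
  Subject   = factor(1:20),
  Timepoint = factor(1:3),
  Drug      = factor(c("A", "B", "C", "D", "E"))
)
# Condition varies between subjects: subjects 1-10 diseased, 11-20 healthy.
dat$Condition <- factor(ifelse(as.integer(dat$Subject) <= 10, "diseased", "healthy"))

# Subject-level shifts on the logit scale, then a percentage outcome in (0, 100).
subj_eff <- rnorm(20, sd = 0.5)
mu <- plogis(-1 + subj_eff[as.integer(dat$Subject)])
dat$y <- 100 * rbeta(nrow(dat), shape1 = mu * 10, shape2 = (1 - mu) * 10)

# Step 1: percentages -> proportions.
dat$prop <- dat$y / 100

# Step 2: apply the rule from the answer to pick a candidate family.
choose_family <- function(p) {
  if (all(p > 0 & p < 1, na.rm = TRUE)) {
    "beta"
  } else if (all(p >= 0 & p < 1, na.rm = TRUE)) {
    "zero-inflated"
  } else {
    "other"  # values of 1 or outside [0, 1]: neither case above applies
  }
}
family_choice <- choose_family(dat$prop)
print(family_choice)

# Step 3 (optional): fit the beta GLMM and run one set of comparisons.
# Skipped silently when the packages are not installed.
if (family_choice == "beta" &&
    requireNamespace("glmmTMB", quietly = TRUE) &&
    requireNamespace("emmeans", quietly = TRUE)) {
  library(glmmTMB)
  library(emmeans)
  fit_2 <- glmmTMB(
    prop ~ Condition * Drug * Timepoint + (1 | Subject),
    family = beta_family(link = "logit"),
    data = dat
  )
  # Diseased vs. healthy within each Drug x Timepoint cell, unadjusted.
  emm <- emmeans(fit_2, pairwise ~ Condition | Drug * Timepoint,
                 adjust = "none", type = "response")
  print(emm$contrasts)
}
```

With a logit link and `type = "response"`, the contrasts should come back as odds ratios rather than differences in proportions, so they are not directly comparable to the differences reported by the linear mixed model. For the zero-inflated case, glmmTMB has a `ziformula` argument (for example `ziformula = ~ 1`), which this sketch does not exercise.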

edited Apr 06 '18 at 23:46

answered Apr 06 '18 at 23:32

Thank you, Isabella. I tried the glmmTMB function in R, but lsmeans doesn't support objects from this function. Can you suggest how I can do the post-hoc analysis at each level of the factors after fitting the model? – AMC Apr 12 '18 at 18:05
