
I'm trying to run a linear mixed model in R, but my model either never seems to finish running or (with a simpler random-effects structure) produces a warning about a singular fit. My full model is below (this is the version that runs for ages and never completes):

RT_lme <- lmer(RT ~ Condition * HighLow * Correct + 
                    (Condition * HighLow * Correct | Pt_ID) + 
                      (Condition | SentNumb),
               data = testing_data.df, control = lmerControl(optimizer = "bobyqa"))

The task asks participants (Pt_ID) to judge as quickly as possible whether a sentence is correct or not (so reaction time, RT, is the dependent variable); there are different sentences, indicated by 'SentNumb'. Sentences are classed as either 'high' or 'low' ('HighLow') and appear in 3 different 'Conditions':

Pt_ID  SentNumb  HighLow  Condition
1      1         High     1
1      1         High     2
1      1         High     3
1      2         Low      1
1      2         Low      2
1      2         Low      3

A participant's response accuracy is also recorded ('Correct').

How should I decide which random effects to remove in order to simplify the model?

SilvaC

2 Answers


Your model has an overly complicated random-effects structure. I would suggest first thinking about your research question, i.e., which associations between RT and the variables Condition, HighLow, and Correct you want to study, and translating this into the specific fixed-effects terms to include in the model. For the random effects, you could start with random intercepts only, i.e., (1 | Pt_ID). Note that random effects are used to account for correlations; hence, you need to think about the levels at which you expect correlations and then include additional random-effects terms if required. You can test whether you need these extra random effects using likelihood ratio tests via the anova() function.
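To make this concrete, here is a sketch of that forward-selection approach. The variable names come from the question, but the simulated data frame d is purely a hypothetical stand-in (in your case you would use testing_data.df) so the code runs on its own:

```r
library(lme4)

# Hypothetical stand-in for the question's testing_data.df
set.seed(1)
d <- expand.grid(Pt_ID = factor(1:30), SentNumb = factor(1:20),
                 Condition = factor(1:3))
d$HighLow <- ifelse(as.integer(d$SentNumb) %% 2 == 0, "Low", "High")
d$Correct <- rbinom(nrow(d), 1, 0.9)
pt_int <- rnorm(30, 0, 50)                       # participant intercepts
d$RT <- 600 + pt_int[d$Pt_ID] + rnorm(nrow(d), 0, 80)

# Start with random intercepts only (ML fit, so models are LRT-comparable)
m0 <- lmer(RT ~ Condition * HighLow * Correct +
             (1 | Pt_ID) + (1 | SentNumb),
           data = d, REML = FALSE)

# Add one random slope and test whether it improves the fit
m1 <- lmer(RT ~ Condition * HighLow * Correct +
             (1 + Condition | Pt_ID) + (1 | SentNumb),
           data = d, REML = FALSE)

anova(m0, m1)  # likelihood ratio test for the by-participant Condition slope
```

You would then keep adding (or dropping) one random-effects term at a time, retaining a term only when the likelihood ratio test supports it.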

Also, it was not clear from your question whether the outcome variable RT is dichotomous. If it is, you will need to use glmer(..., family = binomial()) or mixed_model(..., family = binomial()), the latter from the GLMMadaptive package.
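For completeness, a minimal sketch of what such a logistic mixed model would look like (toy simulated data, not the question's; in the question's setup the binary accuracy variable Correct, rather than RT, would be the natural candidate for this family):

```r
library(lme4)

# Toy data: 20 participants x 20 trials each, binary accuracy outcome
set.seed(3)
d <- data.frame(Pt_ID = factor(rep(1:20, each = 20)),
                Condition = factor(rep(1:2, 200)),
                Correct = rbinom(400, 1, 0.8))

# Logistic mixed model: binary outcome, random intercept per participant
acc_glmm <- glmer(Correct ~ Condition + (1 | Pt_ID),
                  data = d, family = binomial())
```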

Dimitris Rizopoulos
  • Thank you for your help. As a follow-up, I've seen some (but very little) use of centering of categorical variables to resolve convergence issues related to multicollinearity caused by interaction terms. I realise my model is far too complex as it is, and this will not resolve the issue, but I would be interested in your thoughts on this? There seems to be very limited literature on how to interpret centred categorical predictors. – SilvaC Mar 10 '23 at 11:27

The singular fit warning is no accident and is fairly common when fitting complicated interaction models (Meteyard & Davies, 2020). It typically arises when the specified random-effects structure doesn't fit the data well: for example, a random effect may have almost no variance, so its estimate collapses to (nearly) zero. The estimated variance-covariance matrix of the random effects is then singular or nearly so, which leads to problems such as a zero determinant or an inability to invert the matrix.

As Dimitris already mentioned, it may help to first fit a random-effects structure that includes only random intercepts. These are typically much easier to fit, and when the data do not support a complicated structure, the simpler model often has more power (Matuschek et al., 2017). From there, you can try an uncorrelated random-effects structure (slopes and intercepts are not correlated) using the || operator between the random effects you specified, such as (Condition || SentNumb) (see Bates et al., 2015, for more details on lme4 syntax). Probably the hardest part of your model to fit is the three-way interaction in both the fixed and the random effects. You may want to consider whether it is even meaningful/interpretable to include, as such terms are in any case a major contributor to convergence issues (Meteyard & Davies, 2020).
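A minimal illustration of the || syntax on simulated toy data (not the question's data; note the caveat about factor predictors in the comment):

```r
library(lme4)

# Toy data: 20 participants x 30 sentences, Condition assigned at random
set.seed(2)
d <- data.frame(Pt_ID = factor(rep(1:20, each = 30)),
                SentNumb = factor(rep(1:30, times = 20)))
d$Condition <- factor(sample(1:3, nrow(d), replace = TRUE))
d$RT <- 600 + rnorm(nrow(d), 0, 80)

# || requests separate variances for intercept and slope with no covariance,
# which removes parameters and often eases convergence. Caveat: in plain
# lme4 the || operator fully decorrelates only numeric predictors; for a
# factor like Condition, use afex::lmer_alt() or dummy-code it first.
m_diag <- lmer(RT ~ Condition + (1 + Condition || SentNumb), data = d)
```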

By the way, I have been part of a project that used GLMMs with reaction time data and learned that you should probably not use lmer unless you transform the data, as reaction times typically follow a heavily right-skewed, roughly inverse Gaussian distribution. There is an excellent paper on this subject (Brysbaert & Stevens, 2018). Basically, it is ideal to first transform the data like so before fitting with lmer:

$$ invRT = \frac{-1000}{RT} $$
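In R this is a one-liner (assuming RT is in milliseconds; invRT is a new column name I'm introducing, not something from the question):

```r
# Inverse transform: -1000/RT maps right-skewed RTs (in ms) onto a roughly
# normal scale while preserving the ordering (larger = slower)
rt <- c(350, 500, 800, 1200)          # example RTs in ms
inv_rt <- -1000 / rt                  # e.g. -1000/500 = -2

# In the question's data frame you would then fit lmer on the new column:
# testing_data.df$invRT <- -1000 / testing_data.df$RT
```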

Alternatively, you can fit the untransformed data with glmer using family = inverse.gaussian(), but this makes interpreting and troubleshooting your model more difficult.

Side note: for full disclosure, a paper preceding Brysbaert & Stevens (2018) advised against transforming reaction time data for mixed models and recommended fitting them with glmer instead (Lo & Andrews, 2015). I find their arguments less compelling than Brysbaert & Stevens's, so I would simply read both and make up your own mind; I have cited it below in case you want to look into their argument.

Citations

  • Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1). https://doi.org/10.18637/jss.v067.i01
  • Brysbaert, M., & Stevens, M. (2018). Power analysis and effect size in mixed effects models: A tutorial. Journal of Cognition, 1(1), 9. https://doi.org/10.5334/joc.10
  • Lo, S., & Andrews, S. (2015). To transform or not to transform: Using generalized linear mixed models to analyse reaction time data. Frontiers in Psychology, 6(1171). https://doi.org/10.3389/fpsyg.2015.01171
  • Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001
  • Meteyard, L., & Davies, R. A. I. (2020). Best practice guidance for linear mixed-effects models in psychological science. Journal of Memory and Language, 112, 104092. https://doi.org/10.1016/j.jml.2020.104092
  • Thank you for your help. As a follow-up, I've seen some (but very little) use of centering of categorical variables to resolve convergence issues related to multicollinearity caused by interaction terms. I realise my model is far too complex as it is, and this will not resolve the issue, but I would be interested in your thoughts on this? There seems to be very limited literature on how to interpret centred categorical predictors. – SilvaC Mar 13 '23 at 10:02
    For your follow up question, this paper may be helpful: https://quantpsy.org/pubs/yaremych_preacher_hedeker_(in.press).pdf. Also FYI you could consider accepting one of the answers here if you feel one of us has answered your original question (by clicking on the checkmark next to an answer). – Shawn Hemelstrand Mar 13 '23 at 10:40