Using a variable as a fixed effect and random effects grouping variable in linear mixed effects model

Question

I am trying to decide on the right random effects structure for my experimental design. I've read quite a few of the other posts on linear mixed effects models and have come across papers like Barr et al., 2013, which recommends using the maximal random effects structure for confirmatory hypothesis testing. However, I have not come across a representative example for the random effects structures I am deciding between, and I would love to get input from others more experienced with LMEs.

For context, I have a study with 15 subjects. Each subject performs a task while wearing a pair of shoes, with a different pair on each day. I have 5 different pairs of shoes (treated as a categorical variable A,B,C,D,E). They perform the task multiple times in the morning and multiple times in the afternoon. I want to understand how the pair of shoes and the time of day ultimately impact their performance on the task.

Since my two factors of interest are the shoe and time of day I have been treating those as fixed effects. I know there is subject variability, so I have also been including a random intercept term for each subject. However, the shoes and time of day might have a different effect on each subject, so I have been thinking I should include a random slope for these terms as well. Then, my formula would be:

Y ~ Shoe + TimeOfDay + (Shoe + TimeOfDay | Subject)

I generally think this makes sense, but the only concern I have is that because the subject wears a different shoe on a different day, there might be some variability from day-to-day. So instead, I have been thinking of the following random effects structure to capture that:

Y ~ Shoe + TimeOfDay + (TimeOfDay | Shoe:Subject)

Now my grouping variable would be every combination of shoe/subject (essentially each day). And I am saying that each day, the time of day could have a different effect.

My core question is: Can I have a random effects structure like the latter? Where I have a fixed effect also used as part of a grouping variable in a random effect? It seems weird to me to look at variation across days as a random effect when part of the variation across days is what I am interested in capturing as part of the "Shoes" fixed effect. Given my design, are there any suggestions on how to decide between the two random effects structures that I am considering?

score 2 · Answer 1 · answered Nov 12 '23 at 15:19

Discussion

It is unfortunate that many come across the Barr et al., 2013 paper before they see discussion of why the advice in that paper is poor (or at least doesn't acknowledge the major flaws of maximal models when they are specified by default). A good paper on the subject is Matuschek et al., 2017, which discusses some of these issues, namely the loss of power that comes with maximal models and, more importantly, the poor convergence rates of many of them. The type of data you have will largely shape which random effects you can actually fit. The guides in the references below cover how to fit these models well.

It would be good to summarize and plot your data with respect to the model you are trying to fit first. If it becomes clear that your data doesn't have meaningful variation with respect to random slopes, a random intercept-only model is often the better and easier choice for fitting. If your data indeed has large variation in slopes and intercepts, then perhaps you can consider a maximal model, but this still requires that you have enough data to even achieve that.
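As a rough illustration of that kind of summary, here is a minimal sketch in plain Python. The records, the column layout, and the helper names (cell_means, per_subject_summary) are all hypothetical and made up for this example; they are not the asker's data. The sketch computes each subject's overall level (what a random intercept would absorb) and each subject's afternoon-minus-morning difference (what a random TimeOfDay slope would absorb), then compares their spread across subjects.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy records: (subject, shoe, time_of_day, score).
# The values are invented purely to show the bookkeeping.
records = [
    ("S1", "A", "AM", 10.0), ("S1", "A", "PM", 12.0),
    ("S1", "B", "AM", 11.0), ("S1", "B", "PM", 13.0),
    ("S2", "A", "AM", 20.0), ("S2", "A", "PM", 20.5),
    ("S2", "B", "AM", 21.0), ("S2", "B", "PM", 21.5),
    ("S3", "A", "AM", 15.0), ("S3", "A", "PM", 14.0),
    ("S3", "B", "AM", 16.0), ("S3", "B", "PM", 15.0),
]

def cell_means(rows):
    """Mean score for each (subject, time_of_day) cell."""
    cells = defaultdict(list)
    for subject, _shoe, tod, score in rows:
        cells[(subject, tod)].append(score)
    return {key: mean(values) for key, values in cells.items()}

def per_subject_summary(rows):
    """Return {subject: (overall level, PM minus AM difference)}."""
    means = cell_means(rows)
    subjects = sorted({subject for subject, _tod in means})
    summary = {}
    for subject in subjects:
        am, pm = means[(subject, "AM")], means[(subject, "PM")]
        summary[subject] = ((am + pm) / 2, pm - am)
    return summary

summary = per_subject_summary(records)
# Spread of subject levels vs. spread of subject-specific time-of-day effects.
intercept_sd = stdev(level for level, _diff in summary.values())
slope_sd = stdev(diff for _level, diff in summary.values())

for subject, (level, diff) in summary.items():
    print(f"{subject}: level={level:.2f}, PM-AM={diff:+.2f}")
print(f"SD of levels: {intercept_sd:.2f}, SD of PM-AM differences: {slope_sd:.2f}")
```

If the spread of the per-subject differences is small relative to the residual noise, that is a hint (not a test) that a random slope for TimeOfDay may be hard to estimate. The same summary can be repeated per subject-by-shoe combination to look at day-to-day variation.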

As for your model specification, there doesn't appear to be anything inherently wrong with the first model, but because you include multiple random slopes in one cluster, you are more prone to the convergence and power issues I noted if your data doesn't match the model. The second model isn't specified as well. As I read it, it assumes that subjects are nested within shoes, whereas it should be shoes within subjects, written in R as (TimeOfDay | Subject/Shoe). This is because each subject tries on every pair of shoes (note that the reverse cannot be true). Usually you want to specify a nested grouping as (1 + x | g1/g2), which expands to (1 + x | g1) + (1 + x | g1:g2), so it includes both the overall random intercepts of g1 and those of g2 within g1.
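To give a feel for why several random slopes in one cluster get expensive, here is a small counting sketch in Python. It assumes treatment (dummy) coding of the factors and an unstructured covariance matrix for the random effects; the helper names are made up for this illustration. The level counts (5 shoes, 2 times of day, 15 subjects) come from the question.

```python
def n_random_columns(factor_levels):
    """Random-effect columns per group: one intercept plus (levels - 1)
    columns per factor, assuming treatment (dummy) coding."""
    return 1 + sum(levels - 1 for levels in factor_levels)

def n_cov_params(q):
    """Free parameters (variances plus covariances) in an unstructured
    q x q covariance matrix."""
    return q * (q + 1) // 2

# Counts taken from the question.
N_SHOES, N_TIMES, N_SUBJECTS = 5, 2, 15

# Each entry: (factors with random slopes, number of groups).
structures = {
    "(1 | Subject)": ([], N_SUBJECTS),
    "(TimeOfDay | Subject)": ([N_TIMES], N_SUBJECTS),
    "(Shoe + TimeOfDay | Subject)": ([N_SHOES, N_TIMES], N_SUBJECTS),
    "(TimeOfDay | Subject:Shoe)": ([N_TIMES], N_SUBJECTS * N_SHOES),
}

for name, (factors, n_groups) in structures.items():
    q = n_random_columns(factors)
    print(f"{name}: {n_groups} groups, {q} random-effect columns, "
          f"{n_cov_params(q)} covariance parameters")
```

Under these assumptions, the first model in the question asks for a 6 x 6 covariance matrix (21 parameters) estimated from only 15 subjects, while a TimeOfDay slope on its own needs 3 parameters. That is one way to see why the larger structure is more likely to run into convergence trouble.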

References

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278. https://doi.org/10.1016/j.jml.2012.11.001
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1). https://doi.org/10.18637/jss.v067.i01
Brown, V. A. (2021). An introduction to linear mixed-effects modeling in R. Advances in Methods and Practices in Psychological Science, 4(1), 1–19. https://doi.org/10.1177/2515245920960351
Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001
Meteyard, L., & Davies, R. A. I. (2020). Best practice guidance for linear mixed-effects models in psychological science. Journal of Memory and Language, 112, 104092. https://doi.org/10.1016/j.jml.2020.104092

Thanks! This is very helpful. As for the comment about the second model not being well specified, I think that might be a notation difference between MATLAB and R. I am using MATLAB's fitlme function, which uses Wilkinson notation for defining the model equation: https://www.mathworks.com/help/stats/fitlme.html. There, Shoe:Subject should be treated as the interaction between Shoe and Subject. However, I am still concerned about whether including a random intercept for each combination of shoe and subject is misguided if I also have a Shoe fixed effect. — rhingo3, Nov 12 '23 at 16:50
You can't have shoe as both a random slope and a random intercept, as shoe doesn't vary within shoe. It's acceptable to have it as a fixed effect and a random slope. — Shawn Hemelstrand, Nov 12 '23 at 23:09
Sorry, it may not have been clear. I am not trying to make shoe both a random slope and random intercept. I am wondering if I can have a random effects structure as follows: (TimeOfDay | Subject:Shoe). In MATLAB, the notation Subject:Shoe means that I will have a random intercept for each combination of subject and shoe. I am not familiar with R notation, but my guess is that the full model structure in R would be something like: Y ~ Shoe + TimeOfDay + (TimeOfDay | Subject/Shoe). As you will see, Shoe is both a fixed effect and part of the random intercept grouping — rhingo3, Nov 13 '23 at 05:33
Your intuition about how this works in R seems correct, with a minor caveat. To clarify, the : operator gives the combination of both random effect clusters in the way you describe for MATLAB (all combinations of the two). The / operator also adds the overall intercept of Subject, so if you are only interested in the Subject by Shoe combinations, you can just specify the grouping with : instead of /. — Shawn Hemelstrand, Nov 13 '23 at 05:54
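
To make the difference between : and / discussed in these comments concrete, here is a minimal Python sketch that only enumerates the grouping levels each operator implies. The subject labels and function names are hypothetical; the counts (15 subjects, 5 shoes) come from the question. It does not fit any model.

```python
from itertools import product

# Hypothetical labels; only the counts come from the question.
subjects = [f"S{i}" for i in range(1, 16)]   # 15 subjects
shoes = list("ABCDE")                        # 5 pairs of shoes

def interaction_groups(g1, g2):
    """Levels implied by g1:g2, one per observed combination.
    Here every subject wears every shoe, so all combinations occur."""
    return [f"{a}:{b}" for a, b in product(g1, g2)]

def nested_groups(g1, g2):
    """Levels implied by g1/g2, which expands to g1 plus g1:g2."""
    return {"g1": list(g1), "g1:g2": interaction_groups(g1, g2)}

by_interaction = interaction_groups(subjects, shoes)
by_nesting = nested_groups(subjects, shoes)

print(len(by_interaction), "groups for Subject:Shoe")
print(len(by_nesting["g1"]), "groups for Subject plus",
      len(by_nesting["g1:g2"]), "groups for Subject:Shoe under Subject/Shoe")
```

With Subject:Shoe there is one group per subject-by-shoe combination (per day, in this design), and nothing ties together the five days that belong to the same subject. Subject/Shoe keeps those same per-day groups and adds a subject-level term on top.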
