Setup:

Assume I have a linear mixed effects model of the form: outcome ~ x + f + (1|ID), where x is a between-subjects predictor variable, f is a within-subject factor with 2 levels and ID is a subject ID. Each subject has 2 outcome values (one for each level of f).

Question 1:

What is the correct way to interpret the effect of x in this model?

My Stab at Answering Question 1:

My current interpretation is as follows:

If we compare two groups of subjects, A and B, in the target population represented by the subjects in our study such that:

  1. Both groups contain subjects having the SAME random subject effect (e.g., random subject effect = 0 for subjects in both groups);

  2. Group A has x = something; Group B has x = something + 1 (so they differ by 1 unit in the value of x);

  3. Groups A and B have the same value of f.

then the slope of x in the model outcome ~ x + f + (1|ID) captures the difference in the mean value of the outcome between groups A and B.

Does this interpretation make sense? If not, what is the proper interpretation of the slope of x? (I should add that, in my case, x is something I have to control for and the effect of f is what I am really interested in.)
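
To make the interpretation above concrete, here is a sketch of the algebra. It assumes treatment coding for f (so $f_{ij} \in \{0, 1\}$ for subject $i$ and measurement $j$), a normally distributed random intercept $b_i$ that is independent of $x_i$, and the symbols $\beta_0, \beta_x, \beta_f, \sigma_b, \sigma$; none of this notation is part of the original setup.

$$
\begin{aligned}
y_{ij} &= \beta_0 + \beta_x x_i + \beta_f f_{ij} + b_i + \varepsilon_{ij}, \qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2), \\
\beta_x &= E[y_{ij} \mid x_i = x + 1,\, f_{ij} = f,\, b_i = b] - E[y_{ij} \mid x_i = x,\, f_{ij} = f,\, b_i = b], \\
\beta_x &= E[y_{ij} \mid x_i = x + 1,\, f_{ij} = f] - E[y_{ij} \mid x_i = x,\, f_{ij} = f].
\end{aligned}
$$

Under these assumptions, the second line is the comparison described in conditions 1 to 3, and the third line shows that, with an identity link, averaging over the random intercepts gives the same slope, because $E[b_i] = 0$.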

Question 2:

Given the above setup, would it make sense to modify the model to allow for a smooth, non-linear effect of x via a smooth of x, so that the modified model looks like outcome ~ s(x) + f + (1|ID)? If yes, what would be the correct way to interpret s(x) in the modified model?
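
One way to write the modified model, as a sketch only: it assumes treatment coding for f ($f_{ij} \in \{0, 1\}$), a random intercept $b_i$ with mean zero that is independent of $x_i$, and illustrative symbols $\beta_0$, $\beta_f$, $x_1$, $x_2$ that are not part of the original setup.

$$
\begin{aligned}
y_{ij} &= \beta_0 + s(x_i) + \beta_f f_{ij} + b_i + \varepsilon_{ij}, \\
s(x_2) - s(x_1) &= E[y_{ij} \mid x_i = x_2,\, f_{ij} = f,\, b_i = b] - E[y_{ij} \mid x_i = x_1,\, f_{ij} = f,\, b_i = b].
\end{aligned}
$$

Read this way, there is no single slope: the expected difference in outcome between two values of x, at the same level of f and the same random effect, depends on where those two values sit on the curve $s$.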

Addendum

Here is what I managed to find in Jon Wakefield's 2013 book on Bayesian and Frequentist Regression Methods (published by Springer):

[Two images of excerpts from the book; not reproduced here]

Isabella Ghement
  • In the case of a model with an identity link, the coefficient of $x$ is interpreted as it would be in a linear regression model, based on this answer. And to question 2: I don't see any reason why not. Personally, my default assumption is that of nonlinear effects, especially for a control variable such as age. I always include those flexibly using splines. Frank Harrell has written a lot about that. – COOLSerdash Aug 03 '22 at 19:20
  • Thank you very much for your comment, @COOLSerdash. I asked Question 2 on Twitter and the feedback I got there was that it doesn't make sense to fit a smooth with just 2 values of x per subject. But my take is similar to yours - that it would make sense, provided we adopt an interpretation similar to what I outlined in my own stab at answering Question 1. I just wanted to check that I am not off in my interpretation - this has been driving me crazy for a while now. – Isabella Ghement Aug 03 '22 at 19:44
  • I might be missing something, but the number of values per subject seems irrelevant to me. The smooth will be fitted over the range of $x$ across all subjects (e.g. for age). Does the value of $x$ differ within a subject? – COOLSerdash Aug 03 '22 at 19:48
  • The value of x is constant within a subject since x is a between-subject predictor. Each subject has 2 values of y (outcome), 2 values of f (factor; within-subject predictor) and 1 value of x (continuous; between-subject predictor). I kept things simple but the additional complication is that y does not have a (conditional) Gaussian distribution - it actually has a zero-and-or-one-inflated beta distribution. But Questions 1 and 2 would be similar. – Isabella Ghement Aug 03 '22 at 20:05
  • Thanks for this information. As Dimitris Rizopoulos explains in the linked answer from my first comment: The effect of $x$ is only unconditional on the random effects in the case of an identity link function. In a model with a nonlinear link function, the coefficient is dependent on the random effect and thus has a subject-specific interpretation. I don't see any changes for question 2, however. – COOLSerdash Aug 03 '22 at 20:12
  • I guess what I am trying to ultimately clarify in my mind is what it means in a model with non-identity link to condition on ID via (1|ID). To me, (1|ID) has the interpretation "with the same subject random effect". That can happen if 1) we consider a single subject BUT ALSO (?) 2) consider multiple subjects in the target population "with the same subject random effect". So "subject-specific interpretation" can mean, for 2), "considering all subjects in the target population with the same subject random effect". Am I off in thinking 2) is possible? – Isabella Ghement Aug 03 '22 at 20:24
  • I have never heard of your second interpretation. I'm not an expert in this but if the random effects are considered coming from a continuous distribution (the most prevalent is the normal distribution, even for GLMMs), the probability that any two subjects have the same random effect is effectively zero, I think. I'm curious about what others have to say on this. – COOLSerdash Aug 03 '22 at 20:44
  • I am curious too! – Isabella Ghement Aug 03 '22 at 22:18
  • I added an excerpt from Jon Wakefield's book in an Addendum to my question, which would seem to indicate that my initial interpretation may be off (?) and that we should focus on 3) comparing a subject A and a subject B, both of which have the same subject random effect. BUT the excerpt DOES talk about two individuals with the SAME random effect! – Isabella Ghement Aug 03 '22 at 23:01
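
The distinction raised in the comments, between an identity link and a non-identity link, can be checked numerically. The following is a minimal sketch rather than a fit of the actual model: it uses a logit link as a stand-in for the zero-and-or-one-inflated beta model, holds f fixed (so it drops out), approximates the random-intercept distribution with an equal-probability grid of normal quantiles, and uses hypothetical values for `beta0`, `beta_x` and `sigma_b`. All function and variable names are illustrative.

```python
import math
from statistics import NormalDist


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))


def random_intercept_grid(sigma_b, n_points=2000):
    """Equal-probability grid of quantiles approximating b ~ N(0, sigma_b^2)."""
    std_normal = NormalDist()
    return [sigma_b * std_normal.inv_cdf((k + 0.5) / n_points)
            for k in range(n_points)]


def marginal_mean(x, beta0, beta_x, b_grid, inverse_link):
    """Average the subject-specific mean over the random-intercept grid."""
    total = sum(inverse_link(beta0 + beta_x * x + b) for b in b_grid)
    return total / len(b_grid)


# Hypothetical parameter values, chosen only for illustration.
beta0, beta_x, sigma_b = 0.0, 1.0, 2.0
b_grid = random_intercept_grid(sigma_b)


def identity(z):
    return z


# Identity link: the population-averaged difference for a 1-unit change in x
# equals the conditional (subject-specific) slope beta_x.
identity_slope = (marginal_mean(1.0, beta0, beta_x, b_grid, identity)
                  - marginal_mean(0.0, beta0, beta_x, b_grid, identity))

# Logit link: the population-averaged log-odds difference is smaller in
# magnitude than the conditional slope beta_x when sigma_b > 0.
p0 = marginal_mean(0.0, beta0, beta_x, b_grid, expit)
p1 = marginal_mean(1.0, beta0, beta_x, b_grid, expit)
logit_marginal_slope = logit(p1) - logit(p0)

print("conditional slope:", beta_x)
print("marginal slope, identity link:", identity_slope)
print("marginal slope, logit link:", logit_marginal_slope)
```

With these assumed values, the identity-link difference reproduces `beta_x`, while the logit-link difference on the log-odds scale is attenuated toward zero. This is why the coefficient of x in a model with a non-identity link is read as a comparison at a fixed value of the random effect rather than as a population-averaged effect.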
