in general, terms shouldn't be in RE if they're missing from FE
(By "terms" here I mean "terms that vary among groups", not "grouping variables"; in formula notation, that means having a term like (f|g) where f is not in the fixed effects)
Because random effects are mean-centered (assumed to have a mean of zero), having f vary among groups but not be included in the fixed effects specifies that it has an effect of exactly zero in the population but nevertheless varies among groups. This is usually not sensible, although it does apply in some cases (e.g., if the response variable itself has been standardized so that its population-level mean is zero, then you could have the intercept vary across groups ((1|group)) but drop the intercept from the fixed effects (+0 or -1)).
(Therefore, models 1 and 3 are not generally a good idea.)
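To spell out the mean-centering argument, here is a sketch for a numeric f with a random intercept and slope. The symbols $y_{ij}$, $\beta_0$, $\beta_f$ and $\epsilon_{ij}$ are notation introduced for illustration only ($i$ = group, $j$ = observation within group):

$$
y_{ij} = (\beta_0 + b_{i,0}) + (\beta_f + b_{i,f})\, f_{ij} + \epsilon_{ij}, \qquad (b_{i,0}, b_{i,f}) \sim \textrm{MVN}(0, \Sigma)
$$

Including f in the fixed effects estimates $\beta_f$; leaving it out fixes $\beta_f = 0$, so the group-specific slopes $\beta_f + b_{i,f}$ are forced to scatter around exactly zero.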
is time a numeric variable or a factor?
If time is numeric, the linear model corresponding to ~time (we'll see why this is important in a minute) consists of an intercept and a slope with respect to time. If it's a factor, then the linear model consists of an intercept and a set of contrasts (i.e., one parameter per distinct time value) or, if we use ~0+time, an indicator/dummy variable for each time value. Thus there are three ways that time could show up in a RE:
- (time is numeric): (time|id) denotes a random-slope model, with among-group variances for the intercepts and slopes and an among-group covariance between intercepts and slopes: the random-effects component is $b_{i,0} + b_{i,1}(t)$ (where $i$ is a group index), $b \sim \textrm{MVN}(0, \Sigma)$, $\Sigma$ is $2 \times 2$
- (time is a factor/categorical): (time|id) denotes a model where the variability across groups is different at every time step, and where there are different correlations between every pair of times (an $n \times n$ covariance matrix if there are $n$ distinct times): RE component $b_{i,j}$ ($i$ = group, $j$ = time), $b \sim \textrm{MVN}(0, \Sigma)$, $\Sigma$ is $n \times n$. The latter model (which I'll call 'unstructured' time variation) tends to be data-hungry ($n(n+1)/2$ covariance parameters to estimate)
- (1|time): here time is treated as a grouping variable rather than an effect (automatically converted to a factor). RE component $b_j$, $b_j \sim N(0, \sigma^2)$
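As a rough illustration of why the unstructured model is data-hungry, here are the numbers of variance-covariance parameters for the three specifications, assuming a hypothetical design with $n = 5$ distinct times (a number chosen only for the example):

$$
\begin{aligned}
\text{random slopes } (2 \times 2\ \Sigma): &\quad 2(2+1)/2 = 3 \\
\text{unstructured } (5 \times 5\ \Sigma): &\quad 5(5+1)/2 = 15 \\
\text{time as a grouping variable}: &\quad 1 \ (\text{a single } \sigma^2)
\end{aligned}
$$

The random-slopes count stays at 3 however many times there are, while the unstructured count grows quadratically with $n$.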
nested random effects
: is an interaction operator, so (1|time:id) corresponds to variation in the intercept across subject-by-time combinations. Thus (1|id) + (1|time:id) (which can also be abbreviated as (1|id/time)) corresponds to variation in the intercept (denoted by the 1 to the left of the bar) across subjects and across times within subjects. The RE is $b_{1,i} + b_{2,ij}$, $b_1 \sim N(0, \sigma^2_1)$, $b_2 \sim N(0,\sigma^2_{2})$. This nested model is also called a (homogeneous) compound symmetric covariance structure because it corresponds to a model where the variation among subjects is the same at every time point and the correlation between observations on the same subject is the same for every pair of time points (e.g. see here, search for "compound symmetry")
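The compound-symmetry claim can be checked directly from the RE definition. This is a sketch that assumes $b_1$ and $b_2$ are independent of each other (as separate random-effects terms are); $\rho$ is notation introduced here for the within-subject correlation, and $j \neq k$ index two different times for the same subject $i$:

$$
\begin{aligned}
\textrm{Var}(b_{1,i} + b_{2,ij}) &= \sigma^2_1 + \sigma^2_2 \\
\textrm{Cov}(b_{1,i} + b_{2,ij},\ b_{1,i} + b_{2,ik}) &= \sigma^2_1 \\
\rho &= \frac{\sigma^2_1}{\sigma^2_1 + \sigma^2_2}
\end{aligned}
$$

Neither the variance nor the correlation depends on which times $j$ and $k$ are chosen, so the whole structure is described by two parameters.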
With all those concepts, we can say (leaving out everything inessential, including pred1):
1. ~(time|id): no time effect at the population/fixed-effect level (probably silly), random slopes or unstructured time variation depending on whether time is numeric or categorical
2. ~ time + (time|id): as above, but with a time effect at the population level
3. ~ (1|id) + (1|time:id): no time effect at the population level. Variation among subjects and among times within subjects
4. ~ time + (1|id) + (1|time:id): ditto, but with a population-level effect of time
Recommendations
- For a moderate-sized data set with many (more than 3 or 4) distinct time points, where the within-group trends are not obviously nonlinear, I would generally recommend the random-slopes model (model 2 with numeric times)
- For a moderate to large data set with lots of times and nonlinear patterns, I would recommend a low-order polynomial model in time (if that's adequate), or a hierarchical GAM (e.g. see the excellent paper by Pedersen et al. 2019)
- With a small number of distinct time points I would suggest the unstructured-time model for a large data set, falling back to the compound-symmetry model if necessary for parsimony
Pedersen, Eric J., David L. Miller, Gavin L. Simpson, and Noam Ross. 2019. “Hierarchical Generalized Additive Models in Ecology: An Introduction with Mgcv.” PeerJ 7 (May): e6876. https://doi.org/10.7717/peerj.6876.