Should the repeated measures variable be coded as a factor or as a numeric variable in lmer function?

Question

I refer to this post

I use the same library and also the same dataset

library(tidyverse)
library(lmer4)
str(sleepstudy)
'data.frame':   180 obs. of  3 variables:
 $ Reaction: num  250 259 251 321 357 ...
 $ Days    : num  0 1 2 3 4 5 6 7 8 9 ...
 $ Subject : Factor w/ 18 levels "308","309","310",..: 1 1 1 1 1 1 1 1 1 1 ...

I want to compute a linear mixed model with a random intercept and a random slope:

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

This is executed without any problems.

However, if I define the within-variable as a factor, I get the familiar error message

sleepstudy <- sleepstudy %>% 
  mutate(Days = factor(Days))
fm2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
Error: number of observations (=180) <= number of random effects (=180) for term (Days | Subject); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable

In my own data I have a treatment variable with 3 values. I measured the dependent variable at 3 measurement points. So far I have coded both treatment and the measurement as factor. I would like to compute the following model, but this did not work due to the error message shown above

model <- lmer(y ~ treatment + measurement + treatment*measurement + (measurement|subject) + (1|class), data = df)

So I wonder if I need to define the within-variable as a factor or as a numeric variable?

Are the three measurements taken at the same time points for all subjects? Also, can you explain the class variable: are subjects sampled from different classes? — dipetkov, Sep 02 '22 at 10:43
Thanks for your comment. The dependent variable was collected at the same time from all subjects. The variable class is the school class. All subjects belong to a single class. — Fabi, Sep 02 '22 at 13:19

score 3 · Answer 1 · answered Sep 02 '22 at 18:52

Since all subjects belong to the same class, you don't need the (1|class) component, which corresponds to a random class intercept.

And since the dependent variable was measured at the same three time points for all subjects, you could consider analyzing the data with Generalized Least Squares (GLS) instead of a linear mixed model (LMM). With GLS, you specify the correlation structure of the three within-subject measurements directly rather than model time as a random effect.

Here is an example using nlme::gls and the sleepstudy dataset.

library("nlme")
library("tidyverse")
data(sleepstudy, package = "lme4")
Filter down the data to the first 3 days to simulate your study data.
sleepstudy3 <- sleepstudy %>%
  filter(
    Days < 3
  ) %>%
  mutate(
    Days = factor(Days)
  )
model <- gls(
  Reaction ~ Days,
Specify a general correlation structure
correlation = corSymm(form = ~ 1 | Subject),
Specify constant variance for all Days
weights = varIdent(form = ~1),
Or let the variance vary by Day
weights = varIdent(form = ~ 1 | Days),
  data = sleepstudy3
)
model
#> Generalized least squares fit by REML
#>   Model: Reaction ~ Days

#> Correlation Structure: General
#>  Formula: ~1 | Subject 
#>  Parameter estimate(s):
#>  Correlation: 
#>   1     2    
#> 2 0.737      
#> 3 0.470 0.770

#> Variance function:
#>  Structure: Different standard deviations per stratum
#>  Formula: ~1 | Days 
#>  Parameter estimates:
#>         0         1         2 
#> 1.0000000 1.0404888 0.9173336

#> Degrees of freedom: 54 total; 51 residual
#> Residual standard error: 32.12945

Should the repeated measures variable be coded as a factor or as a numeric variable in lmer function?

1 Answers1

Filter down the data to the first 3 days to simulate your study data.

Specify a general correlation structure

Specify constant variance for all Days

weights = varIdent(form = ~1),

Or let the variance vary by Day