I have a question on statistical mixed linear model analysis with nlme in R. There are a couple of posts that are related to my question, I think (see here and here), but I'm not sure they give the solution to my problem.
I would like to fit a linear mixed model to data from an experiment with crossed random effects for subjects and stimuli. Here are simplified data simulated based on my data and with great help from this paper and the accompanying R code.
library(tidyverse)
library(lme4)
library(nlme)
set.seed(1234)
parameters for group
beta_0 <- 55.145 # intercept
omega_0 <- 3.644 # by-item random intercept sd
tau_0 <- 7.671 # by-subject random intercept sd
sigma <- 26.683 # residual (error) sd
number of subjects and items
n_subj <- 150
n_items <- 36
items <- data.frame(
item_id = seq_len(n_items),
O_0i = rnorm(n = n_items, mean = 0, sd = omega_0)
)
subjects <- data.frame(
subj_id = seq_len(n_subj),
T_0i = rnorm(n = n_subj, mean = 0, sd = tau_0)
)
add subject IDs
subjects1$subj_id <- seq_len(n_subj)
trials <- crossing(subjects, items) %>%
mutate(e_si = rnorm(nrow(.), mean = 0, sd = sigma))
dat_sim <- trials %>%
mutate(y = beta_0 + T_0i + O_0i + e_si) %>%
select(subj_id, item_id, y)
I would typically use lme4::lmer() to fit a model to this data but I have a between subjects factor in my experiment and the data turned out to be heteroscedastic. As far as I know, I need to use nlme::lme() to fit different residual variances for each between-subjects group, so lme4::lmer() is not an option. Now, I would like to fit a model in nlme::lme() that matches my original model in lme4::lmer().
For demonstration, I fit intercept-only models. (In reality I would have fixed effects, too. But this is not relevant for the question I have.)
My lmer() model would look like this:
m.lmer <- lmer(y ~ 1 + (1|subj_id) + (1|item_id),
data = dat_sim,
REML = T,
control = lmerControl(optimizer="nloptwrap"))
With the help of this post, I pieced this model together for lme():
dat_sim$dummy <- factor(1)
m.lme <- lme(y ~ 1,
random = list(dummy =
pdBlocked(list(pdIdent(~subj_id-1),
pdIdent(~item_id-1)))),
data = dat_sim,
method = "REML")
And here is my problem: The two models give different estimates for fixed and especially random effects:
summary(m.lmer)$coefficients
# Estimate Std. Error df t value Pr(>|t|)
# (Intercept) 54.08182 0.8576796 102.7496 63.05597 5.819007e-84
VarCorr(m.lmer)
Groups Name Std.Dev.
subj_id (Intercept) 7.6215
item_id (Intercept) 2.8036
Residual 26.4984
summary(m.lme)
Fixed effects: y ~ 1
Value Std.Error DF t-value p-value
(Intercept) 53.66397 0.9505988 5399 56.45281 0
VarCorr(m.lme)
dummy = pdIdent(subj_id - 1), pdIdent(item_id - 1)
Variance StdDev
subj_id 6.408743e-04 0.02531550
item_id 6.743363e-03 0.08211798
Residual 7.655888e+02 27.66927555
But if I remove one of the random effects, my results are identical across lmer() and lme():
m.lmer.s <- update(m.lmer, ~ 1 + (1|subj_id))
VarCorr(m.lmer.s)
# Groups Name Std.Dev.
# subj_id (Intercept) 7.6072
# Residual 26.6463
m.lme.s <- update(m.lme, random = ~1|subj_id)
VarCorr(m.lme.s)
subj_id = pdLogChol(1)
Variance StdDev
(Intercept) 57.86926 7.607185
Residual 710.02435 26.646282
m.lmer.i <- update(m.lmer, ~ 1 + (1|item_id))
VarCorr(m.lmer.i)
Groups Name Std.Dev.
item_id (Intercept) 2.7336
Residual 27.5727
m.lme.i <- update(m.lme, random = ~1|item_id)
VarCorr(m.lme.i)
item_id = pdLogChol(1)
Variance StdDev
(Intercept) 7.472733 2.73363
Residual 760.252039 27.57267
Why do the models with both random effects give so such different results and why does this seem to depend on whether only one or both random intercepts are fitted? Is my model specification for lme() wrong? Or is there an issue with a local maximum when fitting two random effects with lmer() (see this post)? Could anyone explain to me what is happening here?
nlmeresults are nowhere near the simulated variances. This seems to suggest that the model you specified withlmedoes not do what you want it to do. I am more familiar withlme4, which to me looks correctly specified. Perhaps you should remove the-1part in thelmeone since you don't have a random intercept like in the linked post? – Frans Rodenburg Dec 03 '21 at 10:30lme()are not at all close to the intended variances. I tried removing the-1but this returns an errorError in pdConstruct.pdBlocked(object, form = form, nam = nam, data = data, : cannot have duplicated column names in a "pdMat" object– Jass Dec 06 '21 at 09:41lmeandlmerare fine. The disparity is becausesubj_idanditem_idare generated as integers. They must be factors forlmeto recognize as grouping levels instead of a continuous variable, unless they appear after|which is forbidden insidepdBlocked(). See more at https://stackoverflow.com/questions/36643713. – DrJerryTAO Feb 20 '24 at 02:55