First:
what is the difference between the following models in terms of their random-effects structure?
the model:
lmer(math ~ ses*sector + (ses+sector|sch.id), data = hsb)
specifies:
- random intercepts for
sch.id
- random slopes for
ses and sector, both varying within sch.id
- the 3 correlations between each of those.
The model:
lmer(math ~ ses*sector + (ses*sector|sch.id), data = hsb)
specifies:
- random intercepts for
sch.id
- random slopes for
ses, sector and the interaction between them, all varying within sch.id
- the 6 correlations between each of those.
The model:
lmer(math ~ ses*sector + (ses:sector|sch.id), data = hsb)
specifies:
- random intercepts for
sch.id
- random slopes for the interaction between
ses and sector - that is, a slope for each unique combination of ses and sector - varying within sch.id. Note that since sector is categorical with 2 levels, this will result in 2 random slopes estimates.
- the 3 correlations between each of those. The correlation between each of the random slopes and the intercept should answer the next question:
How should I define my model so that Corr, the correlation between intercepts and slopes, be separately reported for public & private sectors?
Edit:
So, it appears that
lmer(math ~ ses*sector + (ses:sector|sch.id), data = hsb)
will achieve this.
Here is a monte carlo simulation to demonstrate it:
We simulate two datasets, one for each sector, with 20 schools in each.
We set the standard deviations of the random intercepts and slopes to be the same among all schools, but the correlation between them to be different. Here we use -0.4 and 0.4 but the code can easily be adjusted:
First we set up a helper function to let us know if the models have converged normally and if not, why:
# helper function
# Has the model converged ?
hasConverged <- function (mm) {
if ( !class(mm)[1] == "lmerMod") stop("Error must pass a lmerMod object")
retval <- NULL
if(is.null(unlist(mm@optinfo$conv$lme4))) {
retval = 1
}
else {
if (isSingular(mm)) {
retval = 0
} else {
retval = -1
}
}
return(retval)
}
Now we set up the parameters and data needed:
n.school <- 20
dt0 <- expand.grid(sch.id = 1:n.school, ses = 1:10, sector = c("0"))
dt1 <- expand.grid(sch.id = (n.school+1):(n.school*2), ses = 1:10, sector = c("1"))
dt0$Y <- 1
X <- model.matrix( ~ ses, data = dt0)
myFormula <- "Y ~ ses + (ses | sch.id)"
foo <- lFormula(eval(myFormula), dt0)
Z <- t(as.matrix(foo$reTrms$Zt))
betas_0 <- c(1, 2)
betas_1 <- c(1, 3) # use different fixed effect for ses to create an interaction in the combined model
s1 <- 2 # SD of random intercepts
s2 <- 1 # SD of random slopes
rho_0 <- 0.4 # correlation between intercepts and slopes for sector 0
cormat_0 <- matrix(c(s1, rho_0, rho_0, s2), 2, 2) # correlation matrix for sector 0
covmat_0 <- lme4::sdcor2cov(cormat_0) # covariance matrix (needed for mvrnorm)
rho_1 <- -0.4 # correlation between intercepts and slopes for sector 1
cormat_1 <- matrix(c(s1, rho_1, rho_1, s2), 2, 2) # correlation matrix for sector 1
covmat_1 <- lme4::sdcor2cov(cormat_1) # covariance matrix (needed for mvrnorm)
then perform the simulations:
n.sim <- 250
vec.sim.rho_0.sep <- numeric(n.sim)
vec.sim.rho_1.sep <- numeric(n.sim)
vec.sim.rho_0.comb <- numeric(n.sim)
vec.sim.rho_1.comb <- numeric(n.sim)
for (i in 1:n.sim) {
set.seed(i)
simulate the random effects
umat_0 <- MASS::mvrnorm(n.school, c(0, 0), covmat_0, empirical = TRUE)
u_0 <- c(rbind(umat_0[, 1], umat_0[, 2])) # lme4 needs the random effects in this order interleaved)
e_0 <- rnorm(nrow(dt0), 0, 2) # residual error
dt0$Y <- X %% betas_0 + Z %% u_0 + e_0
fit the model for sector 0
m0 <- lmer(myFormula, dt0)
simulate random effects for sector 1
umat_1 <- MASS::mvrnorm(n.school, c(0, 0), covmat_1, empirical = TRUE)
u_1 <- c(rbind(umat_1[, 1], umat_1[, 2])) # lme4 needs the random effects in this order interleaved)
e_1 <- rnorm(nrow(dt1), 0, 2) # residual error
dt1$Y <- X %% betas_1 + Z %% u_1 + e_1
fit the model for sector 1
m1 <- lmer(myFormula, dt1)
combine data:
dt <- rbind(dt0, dt1)
fit the full model
m <- lmer(Y ~ ses*sector + (ses:sector | sch.id), dt)
If all models converged properly then extract the correlations
if (hasConverged(m0) == 1 && hasConverged(m1) == 1 && hasConverged(m) == 1) {
print(paste(i, "OK", sep = ": "))
vc_0 <- as.data.frame(VarCorr(m0))
vc_1 <- as.data.frame(VarCorr(m1))
vc <- as.data.frame(VarCorr(m))
vec.sim.rho_0.sep[i] <- vc_0<span class="math-container">$sdcor[3]
vec.sim.rho_1.sep[i] <- vc_1$sdcor[3]
vec.sim.rho_0.comb[i] <- vc$sdcor[4]
vec.sim.rho_1.comb[i] <- vc$sdcor[5]
} else {
print(paste(i, "Not converged", sep = ": "))
vec.sim.rho_0.sep[i] <- NA
vec.sim.rho_1.sep[i] <- NA
vec.sim.rho_0.comb[i] <- NA
vec.sim.rho_1.comb[i] <- NA
}
}
The first two vectors, vec.sim.rho_0.sep and vec.sim.rho_1.sep are the estimated correlations from the models run on the seperate datasets.
vec.sim.rho_0.comb and vec.sim.rho_1.comb are the estimated correlations from the models run on the full dataset
> mean(vec.sim.rho_0.sep, na.rm = TRUE) ; mean(vec.sim.rho_0.comb, na.rm = TRUE)
[1] 0.4355072
[1] 0.4320562
> mean(vec.sim.rho_1.sep, na.rm = TRUE) ; mean(vec.sim.rho_1.comb, na.rm = TRUE)
[1] -0.4015803
[1] -0.4039854
sch.id. But as you noted elsewhere, we can't take a cross-level interaction varying across level of the a grouping variable. Am I missing something? – rnorouzian Oct 08 '20 at 05:07sectoris a level-2 predictor that doesn't vary inid, I think the following models are partly incorrect, right?lmer(math ~ ses*sector + (ses+sector|sch.id), data = hsb) ; lmer(math ~ ses*sector + (ses*sector|sch.id), data = hsb)– rnorouzian Jul 22 '21 at 21:08