I'm trying to model regional implicit bias using multilevel regression with poststratification as described in Hoover, J., & Dehghani, M. (2019). The big, the bad, and the ugly: Geographic estimation with flawed psychological data. Psychological Methods. Advance online publication. https://doi.org/10.1037/met0000240.
The authors model implicit bias (imp_bias) as a function of six county-level (contextual) fixed effects (vote_prop.std + ... + latino_prop.std) and several random effects. Three of these random effects are geographical units: counties nested within states nested within divisions. Although they are not explicitly specified as nested factors (as (1 | division/state_fips/county_fips) would do), they should be handled as nested because every FIPS code is unique to its geographical unit.
The other three are demographic variables: education, sex, and age. In the paper, the authors write that the demographic effects were estimated as random intercepts crossed with division, and that they crossed them with division because the model did not converge otherwise.
I am unsure whether the code they report actually achieves this:
    iat.imp.mrp.2 <- lmer(imp_bias ~ 1 +
                            vote_prop.std +
                            ...
                            latino_prop.std +
                            (1 | county_fips) +
                            (1 | state_fips) +
                            (1 | division:educ_3lvl) +
                            (1 | division:sex:age_4lvl),
                          data = iat.estimation.df, verbose=T,
                          control=lmerControl(optimizer="bobyqa",
                                              optCtrl=list(maxfun=2e5)))
To my understanding (1 | division:educ_3lvl) indicates that education is nested within division rather than crossed with division. And how does this model differ from the following?
    iat.imp.mrp.2 <- lmer(imp_bias ~ 1 +
                            vote_prop.std +
                            ...
                            latino_prop.std +
                            (1 | county_fips) +
                            (1 | state_fips) +
                            (1 | division) +
                            (1 | educ_3lvl) +
                            (1 | sex) +
                            (1 | age_4lvl),
                          data = iat.estimation.df, verbose=T,
                          control=lmerControl(optimizer="bobyqa",
                                              optCtrl=list(maxfun=2e5)))
To summarize, I don't really understand what the : does in the authors' code, and I would like to know how to specify which of the three geographical levels the demographic variables are crossed with. Any help is much appreciated!
In lmer models, the : asks for the random effect to be estimated across all levels of two (or more) factors. The authors might have used this because they did not have enough levels in one factor, and combining it with another sensible factor gave them more levels. – Erik Ruzek Mar 13 '20 at 20:13
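
To make the point about : concrete, here is a minimal base R sketch on a toy grid. The data frame d and its level names are hypothetical, not the authors' data. For two factors, a:b builds a single grouping factor with one level per combination, which is what a term like (1 | division:educ_3lvl) groups on. The sketch also checks whether education is crossed with or nested in division in the toy data.

```r
# Toy grid; the level names are made up for illustration only
d <- expand.grid(
  division  = factor(c("Pacific", "Mountain", "New England")),
  educ_3lvl = factor(c("low", "mid", "high"))
)

# For factors, `:` builds the interaction: one level per combination
g <- d$division:d$educ_3lvl
nlevels(g)

# Crossed: every education level occurs in every division
tab <- table(d$division, d$educ_3lvl)
all(tab > 0)

# Nested would require each education level to occur in exactly one division
is_nested <- all(colSums(tab > 0) == 1)
is_nested
```

In this toy grid the combined factor has 9 levels (3 divisions by 3 education levels), whereas separate (1 | division) and (1 | educ_3lvl) terms would group on 3 levels each. Since every education level appears in every division, the two factors are crossed here, and the : term gives each division-by-education cell its own intercept.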