I am using the R package lme4 to build a mixed-effects model. My data is set up in the following way:
set.seed(1)
df = data.frame(group1 = factor(c(rep(1,36),rep(2,36),rep(3,36),rep(4,36),rep(5,36))),
group2 = factor(rep(c(rep(1,12),rep(2,12),rep(3,12)),5)),
uniqueid = seq(from=1,to=180),
value = rnorm(n = 180, mean = 10, sd = 2))
I have two grouping variables, group1 and group2. group1 consists of 5 different categories, and group2 consists of 3 different categories. This creates a total of 15 unique combinations of group1 and group2, and 12 unique observations within each unique combination, like so:
#xtabs(~group1+group2,df)
group2
group1 1 2 3
1 12 12 12
2 12 12 12
3 12 12 12
4 12 12 12
5 12 12 12
My goal is to build a mixed-effects model to get the fixed-effects parameters of being included in group1 and group2, as the 12 samples within each unique combination are not independent.
Intuitively, I thought to build a model like so:
lme4::lmer(data=df,
formula=value ~ group1 + group2 + (1|group1) + (1|group1:group2))
where group2 is nested within group1, however, there is nothing inherent about the data structure that suggests it could not also be:
lme4::lmer(data=df,
formula=value ~ group1 + group2 + (1|group2) + (1|group2:group1))
where group1 is nested within group2. This leads me to believe that I am actually dealing with crossed effects, where the proper model would be built like so:
lme4::lmer(data=df,
formula=value ~ group1 + group2 + (1|group1) + (1|group2))
Other reasons for believing I am dealing with crossed effects are that my group2 categories exist within all levels of the group1 categories, and vice versa. There are not "unique" group2 categories that only exist within certain categories of group1, although the observations within combination are unique.
EDIT (an analogy): An analogous situation would be if there were 5 unique race categories, say, "White", "Black", "American Indian", "Asian/Pacific Islander", and "Other," along with 3 unique ethnicity categories say, "Hispanic", "Non-Hispanic", and "Other." This would allow for 15 unique combinations. Within each unique combination, there are 12 samples that are dependent within the unique combinations, for a total of 180 samples. This way, there are a total of 60 individuals that are "Hispanic", broken up such that 12 fall into each race category.
The confusion I am running into is related to the answer on a related post: Crossed vs nested random effects: how do they differ and how are they specified correctly in lme4? where it uses the image:
to describe a crossed random-effect scenario. Here, it seems that the observations within each "Class" are being shared by the "Schools", but this situation differs from my scenario. In my scenario (using the race/ethnicity example), there are unique observations for each combination of race and ethnicity (60 observations for each ethnicity, split evenly into 12 observations for each race.)
