I am aware that when specifying the random structure for one factor (B) nested within another factor (A), we can use:

(1|A) + (1|A:B)
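A minimal base-R sketch of why the A:B term identifies the nested units. It assumes a hypothetical design, not taken from the book, in which the labels of B are reused within each level of A:

```r
# Hypothetical nested design: 2 levels of A, with B labelled 1..3 inside each A
dat <- expand.grid(B = factor(1:3), A = factor(c("a1", "a2")))

# B on its own pools "B = 1 in a1" with "B = 1 in a2"
n_B <- nlevels(dat$B)

# A:B gives every B-within-A unit its own level
AB <- interaction(dat$A, dat$B, drop = TRUE)
n_AB <- nlevels(AB)

print(c(B = n_B, "A:B" = n_AB))
```

With these made-up labels B has 3 levels but A:B has 6, which is why (1|A:B) rather than (1|B) carries the nested random intercepts.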

I am trying to understand section 2.3.1 of chapter 2 of the online book by Douglas Bates (http://lme4.r-forge.r-project.org/book/Ch2.pdf). It uses the InstEval dataset, a set of evaluations of lecturers by students at the Swiss Federal Institute for Technology–Zurich (ETH–Zurich):

> str(InstEval)
'data.frame': 73421 obs. of 7 variables:
$ s : Factor w/ 2972 levels "1","2","3","4",..: 1 1 1 1 2 2 3 3 3 ..
$ d : Factor w/ 1128 levels "1","6","7","8",..: 525 560 832 1068 6..
$ studage: Ord.factor w/ 4 levels "2"<"4"<"6"<"8": 1 1 1 1 1 1 1 1 1 1 ..
$ lectage: Ord.factor w/ 6 levels "1"<"2"<"3"<"4"<..: 2 1 2 2 1 1 1 1 1..
$ service: Factor w/ 2 levels "0","1": 1 2 1 2 1 1 2 1 1 1 ...
$ dept : Factor w/ 14 levels "15","5","10",..: 14 5 14 12 2 2 13 3 3 ..
$ y : int 5 2 5 3 2 4 4 5 5 4 ...

Factor s designates the student and d the instructor. The dept factor is the department for the course and service indicates whether the course was a service course taught to students from other departments. Thus these data are partially crossed.

The model fitted in the text is:

fm4 <- lmer(y ~ 1 + (1|s) + (1|d) + (1|dept:service), InstEval, REML=0)

My question is: why is the interaction fitted as a random intercept without (or instead of) the main effects in this case? And in general, when would we fit random effects for an interaction but not for either of the main effects? These factors are not nested, so I guess that has something to do with it, but why is dept not specified as a random intercept instead? The text goes on to say:

We could pursue other mixed-effects models here, such as using the dept factor and not the dept:service interaction to define random effects, but we will revisit these data in the next chapter and follow up on some of these variations there.

However, as far as I know, there is no Chapter 3!

Robert Long

1 Answer

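With the notation dept:service, the grouping factor has one level for every combination of dept and service (14 × 2 = 28 cells), so the random intercepts for these cells absorb the variation between them, whether it comes from the main effects or from the interaction. You can see this in the output, where the number of groups for dept:service is reported as 28.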

Loading required package: Matrix
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: y ~ 1 + (1 | s) + (1 | d) + (1 | dept:service)
   Data: InstEval
      AIC       BIC    logLik  deviance  df.resid
 237663.3  237709.3 -118826.6  237653.3     73416

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.9941 -0.7474  0.0400  0.7721  3.1124

Random effects:
 Groups       Name        Variance Std.Dev.
 s            (Intercept) 0.10541  0.3247
 d            (Intercept) 0.26256  0.5124
 dept:service (Intercept) 0.01213  0.1101
 Residual                 1.38495  1.1768
Number of obs: 73421, groups:  s, 2972; d, 1128; dept:service, 28

Fixed effects:
            Estimate Std. Error t value
(Intercept)  3.25521    0.02824   115.3
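The count of 28 can be checked without fitting anything. The sketch below is base R only and uses a synthetic stand-in for the two factors (14 hypothetical department labels and 2 service levels), not the InstEval data themselves:

```r
# Synthetic stand-in: 14 departments (hypothetical labels 1..14) x 2 service levels
cells <- expand.grid(dept = factor(1:14), service = factor(0:1))

# One level per observed dept/service combination, as in the dept:service grouping
cell <- interaction(cells$dept, cells$service, drop = TRUE)

n_dept <- nlevels(cells$dept)   # levels if dept alone were the grouping factor
n_cell <- nlevels(cell)         # levels of the combined grouping factor

print(c(dept = n_dept, "dept:service" = n_cell))
```

A grouping factor built from dept alone would have 14 levels, while the combined factor has 28, matching the "dept:service, 28" entry in the output above. That the real data contain all 28 combinations is read off that output, not from this sketch.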

Using an interaction term without a main effect would make sense if the effect is only expected in the interaction.

For instance, let $x$ be the number of kids in a family and $y$ the number of schools in the area. If the number of schools only matters when a family has kids, then it makes no sense to include the number of schools as a main effect.
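A rough simulation of the kids-and-schools idea, using an ordinary linear model, not a mixed model, to keep it short. The response z, the sample size and all coefficient values are made up purely for illustration:

```r
set.seed(1)

n <- 500
x <- rpois(n, 1.5)   # number of kids (hypothetical distribution)
y <- rpois(n, 3)     # number of schools in the area (hypothetical distribution)

# Made-up response: schools only matter through their product with kids
z <- 1 + 0.5 * x + 0.3 * x * y + rnorm(n)

# Interaction x:y included, but no main effect for y
fit <- lm(z ~ x + x:y)
print(coef(fit))

# For a family without kids (x = 0) the prediction ignores the number of schools
p <- predict(fit, newdata = data.frame(x = 0, y = c(0, 5)))
print(p)
```

Because the model has no y main effect, the two predictions at x = 0 are identical by construction, which is exactly the assumption that the number of schools is irrelevant for families without kids.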