8

I'm considering a mixed-effects model to try to understand factors that influence the number of ticks sampled on wild rodents. My data is nested so that I have one tick count per rodent, multiple rodents per site and multiple sites per year (sites are repeated across years but not every site is present in all years).

So far, without fixed effects, my model looks something like this:

glmer(Ticks~(1|Year)+(1|Site), family=poisson, data=tickdata)

I understand that this will account for random variation between sites and between years. My main concern is that I think the effect of specific sites will change between years, e.g. some sites may recieve more rainfall than others, but the identity of the sites with the highest rainfall differs between years.

So do I need some sort of interaction term? Maybe something like this:

glmer(Ticks~(1|Year)+(1|Site)+(1|Year:Site), family=poisson, data=tickdata) 
Claire
  • 103
  • I wonder if it makes more sense to nest year at the bottom of the hierarchy: observation in year nested in ticks, nested in rodents, nested in sites? – Alexis May 06 '14 at 14:28
  • @Alexis, you don't need to nest in this case (and probably shouldn't) – Ben Bolker May 06 '14 at 21:35
  • Because...? Are there no year-specific determinants of tick prevalence? Are there no site-specific determinants of tick prevalence? Rodent-specific? Ignoring such structural heterogeneity of variation surely guarantees biased estimates of the standard errors? – Alexis May 06 '14 at 21:49
  • 1
    I mean that you can cross random effects rather than nesting: exactly as the OP said, there are site-specific effects, year-specific effects, and site-by-year effects. (You could also consider rodent-specific effects, which would be observation-level effects/characterizing overdispersion if each rodent is measured once. Otherwise (if rodents are recaptured/recounted) you could also include observation-level effects). – Ben Bolker May 07 '14 at 01:43

1 Answers1

11

Have you tried it? That sounds like it should be fine.

set.seed(101)
## generate fully crossed design:
d <- expand.grid(Year=2000:2010,Site=1:30)
## sample 70% of the site/year comb to induce lack of balance
d <- d[sample(1:nrow(d),size=round(0.7*nrow(d))),]
## now get Poisson-distributed number of obs per site/year
library(plyr)
d <- ddply(d,c("Site","Year"),transform,rep=seq(rpois(1,lambda=10)))
library(lme4)
d$ticks <- simulate(~1+(1|Year)+(1|Site)+(1|Year:Site),
                    family=poisson,newdata=d,
                    newparams=list(beta=2, ## mean(log(ticks))=2
                               theta=c(1,1,1)))[[1]]
mm <- glmer(ticks~1+(1|Year)+(1|Site)+(1|Year:Site),
                    family=poisson,data=d)

We get out approximately what we put in -- equal variances at each level:

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: ticks ~ 1 + (1 | Year) + (1 | Site) + (1 | Year:Site)
##    Data: d
## 
##      AIC      BIC   logLik deviance df.resid 
##  12487.3  12510.2  -6239.7  12479.3     2267 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9944 -0.6842 -0.0726  0.6010  3.8532 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  Year:Site (Intercept) 1.0818   1.0401  
##  Site      (Intercept) 1.0490   1.0242  
##  Year      (Intercept) 0.9787   0.9893  
## Number of obs: 2271, groups:  Year:Site, 231 Site, 30 Year, 11
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   2.1952     0.3593   6.109    1e-09 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

You may want to include an observation-level random effect to allow for overdispersion (see the "grouse ticks" example in http://rpubs.com/bbolker/glmmchapter)

Ben Bolker
  • 43,543