4

As a follow-up to this question, I have the following data:

   Site Treatment Survival
1   BED        DN      1.0
2   BED        DN      1.0
3   BED        DN      1.0
4   BED        MB      1.0
5   BED        MB      1.0
6   BED        MB      0.9
7   BED    Forest      0.4
8   BED    Forest      0.5
9   BED    Forest      0.4
10  BRO        DN      0.9
11  BRO        DN      1.0
12  BRO        DN      1.0
13  BRO        MB      1.0
14  BRO        MB      1.0
15  BRO        MB      1.0
16  BRO    Forest      1.0
17  BRO    Forest      1.0
18  BRO    Forest      1.0
19  LAP        DN      0.8
20  LAP        DN      0.4
21  LAP        DN      0.6
22  LAP        MB      0.5
23  LAP        MB      1.0
24  LAP        MB      0.7
25  LAP    Forest      0.2
26  LAP    Forest      0.2
27  LAP    Forest      0.4

on which I ran a binomial glm :

> glm.out <- glm(Survival~Site*Treatment, data=surv,
    family="binomial", weights=rep(10, nrow(surv)))
> anova(glm.out, test="Chisq")

Analysis of Deviance Table
Model: binomial, link: logit
Response: Survival
Terms added sequentially (first to last)

               Df Deviance Resid. Df Resid. Dev P(>|Chi|)    
NULL                              26    138.254              
Site            2   63.098        24     75.155 1.988e-14 ***
Treatment       2   42.991        22     32.164 4.620e-10 ***
Site:Treatment  4   13.874        18     18.290  0.007707 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

All effects are significant, so now I want to do post-hoc comparisons. I have discovered the glht function in the multcomp package, which seems to do what I want. I read somewhere that with 2 independent variables, you should set interaction_average=TRUE to get a result equivalent to a TukeyHSD for a linear model.

> Treat.comp <- glht(glm.out, mcp(Treatment="Tukey", interaction_average=TRUE))
> summary(Treat.comp)

which gives me these strange results:

     Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: glm(formula = Survival ~ Site * Treatment, family = "binomial", 
    data = surv.san, weights = rep(10, nrow(surv.san)))

Linear Hypotheses:
                 Estimate Std. Error z value Pr(>|z|)
DN - MB == 0       -0.202   3316.127   0.000        1
Forest - MB == 0   -1.886   3316.127  -0.001        1
Forest - DN == 0   -1.684   3316.127  -0.001        1
(Adjusted p values reported -- single-step method)

Why are my P-values = 1 for all comparisons even though the Treatment effect was highly significant?

If I try using glht with interaction_average=FALSE I get more reasonable results but with a warning message:

Warning message:
In mcp2matrix(model, linfct = linfct) :
  covariate interactions found -- default contrast might be inappropriate

Can someone help me to understand what I am doing wrong? Thank-you!!!

KTWillow
  • 211
  • 1
  • 3
  • 5

1 Answers1

4

This is because the glht function is only evaluating differences among treatments, as specified. It prints a warning because the interaction term in your model implies that it can't just compare means by level of treatment in a straightforward manner.

I would suggest you add another column to your data set that combines site by treatment. Do this by writing

surv$sxt<- interaction(surv$Site,surv$Treatment)

Now the object surv$sxt within your dataset surv has a unique level for every combination of site and treatment. Write a new model for the purpose of post-hoc comparisons as

glm.posthoc <- glm(Survival~-1 + sxt, data=surv,family="binomial", weights=rep(10, nrow(surv)))

Then, analyze post hoc comparisons as:

Treat.comp<-glht(glm.posthoc,mcp(sxt='Tukey'))

summary(Treat.comp)

which should print all comparisons.

colin
  • 1,232
  • 1
  • 14
  • 31
  • Could you please explain why glm is being used instead of aov? – MAPK Feb 22 '19 at 18:29
  • @MAPK the original asker was using glm() and so the answer is for a glm() type model. Haven't done this in a while, not sure how it would extend to aov(). – colin Feb 22 '19 at 19:54