1

I have a binomial GLM:

all.fit <-  glm(Presence/Total~Season*ToD*Site, family = binomial, weights = Total, data = all.dt)

Call: glm(formula = Presence/Total ~ Season * ToD * Site, family = binomial, data = all.dt, weights = Total)

Coefficients: (Intercept) SeasonSpring
-4.66074 0.60478
SeasonSummer SeasonWinter
-0.82682 1.26868
ToDDay ToDDusk
0.09865 2.48420
ToDNight SiteKawau
2.65294 3.09495
SiteNoises SiteTawharanui
3.48048 2.94694
SiteTiritiri SeasonSpring:ToDDay
3.25976 0.90540
SeasonSummer:ToDDay SeasonWinter:ToDDay
2.16685 -0.12465
SeasonSpring:ToDDusk SeasonSummer:ToDDusk
-2.78750 -1.52114
SeasonWinter:ToDDusk SeasonSpring:ToDNight
-0.90286 -1.66162
SeasonSummer:ToDNight SeasonWinter:ToDNight
-1.85826 -0.21758
SeasonSpring:SiteKawau SeasonSummer:SiteKawau
-0.67217 1.24760
SeasonWinter:SiteKawau SeasonSpring:SiteNoises
-1.68237 -0.79023
SeasonSummer:SiteNoises SeasonWinter:SiteNoises
0.42570 -0.73055
SeasonSpring:SiteTawharanui SeasonSummer:SiteTawharanui
-0.25773 1.32456
SeasonWinter:SiteTawharanui SeasonSpring:SiteTiritiri
-0.51055 -0.51381
SeasonSummer:SiteTiritiri SeasonWinter:SiteTiritiri
0.95804 -1.29898
ToDDay:SiteKawau ToDDusk:SiteKawau
0.70933 -2.35320
ToDNight:SiteKawau ToDDay:SiteNoises
-5.33351 0.27852
ToDDusk:SiteNoises ToDNight:SiteNoises
-2.53814 -4.04432
ToDDay:SiteTawharanui ToDDusk:SiteTawharanui
-0.16455 -2.76975
ToDNight:SiteTawharanui ToDDay:SiteTiritiri
-2.62701 0.81969
ToDDusk:SiteTiritiri ToDNight:SiteTiritiri
-2.24361 -4.14062
SeasonSpring:ToDDay:SiteKawau SeasonSummer:ToDDay:SiteKawau
-0.19831 -1.61316
SeasonWinter:ToDDay:SiteKawau SeasonSpring:ToDDusk:SiteKawau
0.57193 2.48229
SeasonSummer:ToDDusk:SiteKawau SeasonWinter:ToDDusk:SiteKawau
1.37786 0.81702
SeasonSpring:ToDNight:SiteKawau SeasonSummer:ToDNight:SiteKawau
1.41405 2.06227
SeasonWinter:ToDNight:SiteKawau SeasonSpring:ToDDay:SiteNoises
0.08689 -0.28007
SeasonSummer:ToDDay:SiteNoises SeasonWinter:ToDDay:SiteNoises
-1.48112 -0.06044
SeasonSpring:ToDDusk:SiteNoises SeasonSummer:ToDDusk:SiteNoises
1.93847 0.63232
SeasonWinter:ToDDusk:SiteNoises SeasonSpring:ToDNight:SiteNoises
0.83278 0.69621
SeasonSummer:ToDNight:SiteNoises SeasonWinter:ToDNight:SiteNoises
0.06911 0.48690
SeasonSpring:ToDDay:SiteTawharanui SeasonSummer:ToDDay:SiteTawharanui
-0.23150 -1.96624
SeasonWinter:ToDDay:SiteTawharanui SeasonSpring:ToDDusk:SiteTawharanui
0.43487 2.04614
SeasonSummer:ToDDusk:SiteTawharanui SeasonWinter:ToDDusk:SiteTawharanui
0.59597 0.73280
SeasonSpring:ToDNight:SiteTawharanui SeasonSummer:ToDNight:SiteTawharanui
1.02467 1.19957
SeasonWinter:ToDNight:SiteTawharanui SeasonSpring:ToDDay:SiteTiritiri
-0.16571 -0.48960
SeasonSummer:ToDDay:SiteTiritiri SeasonWinter:ToDDay:SiteTiritiri
-1.63190 0.35354
SeasonSpring:ToDDusk:SiteTiritiri SeasonSummer:ToDDusk:SiteTiritiri
1.77115 0.79347
SeasonWinter:ToDDusk:SiteTiritiri SeasonSpring:ToDNight:SiteTiritiri
0.48411 1.51686
SeasonSummer:ToDNight:SiteTiritiri SeasonWinter:ToDNight:SiteTiritiri
1.27621 0.77385

Degrees of Freedom: 22998 Total (i.e. Null); 22919 Residual Null Deviance: 105700 Residual Deviance: 63170 AIC: 87340

And then I do an anova:

> anova(all.fit,test = "Chisq")
Analysis of Deviance Table

Model: binomial, link: logit

Response: Presence/Total

Terms added sequentially (first to last)

            Df Deviance Resid. Df Resid. Dev  Pr(&gt;Chi)    

NULL 22998 105714
Season 3 1005.8 22995 104709 < 2.2e-16 *** ToD 3 16404.9 22992 88304 < 2.2e-16 *** Site 4 7664.5 22988 80639 < 2.2e-16 *** Season:ToD 9 2793.5 22979 77846 < 2.2e-16 *** Season:Site 12 3890.0 22967 73956 < 2.2e-16 *** ToD:Site 12 9804.5 22955 64151 < 2.2e-16 *** Season:ToD:Site 36 981.0 22919 63170 < 2.2e-16 ***


Signif. codes: 0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In this example, all of the interactioms are significant. But, I don't want to report a huge list of significant combinations of factors in my report. Since I guess if the interaction is important, the single effects don't matter? How could I state the significance of my findings more simply?

Could I use the output of anova(all.fit)? If I did so, what would I then be reporting?

  • Your code doesn't make sense. lmer() doesn't take a family= argument. Did you mean to use glmer()? However, you don't indicate any random effects in the formula (eg, something like +(1:id)). Do you have such? Otherwise, it would be better to use glm(). In fact, for the purposes of this question, glm() would be much preferred, since it's simpler & nothing in your Q relates to the greater complexities associated w/ the other functions. – gung - Reinstate Monica Jun 17 '21 at 14:42
  • @gung-ReinstateMonica Sorry, I have corrected my code above! I see that I can use

    anova(all.fit,test="LRT")

    but I am not sure what this is telling me. Or how I should report it. I am guessing I should include the df and the residual df for each factor, the significance, but should I also report the deviance?

    – Louise Wilson Jun 17 '21 at 20:50
  • Thanks. It might help if you post example output. Are you just wondering about which p-values to list in the results section, or how to best convey the meaning of it all to an audience? (Something else? What?) Who is the audience? What is the context? Is this for a poster at a scientific conference, a presentation (to whom?), a submission to a scientific journal, something else? More information will help. – gung - Reinstate Monica Jun 18 '21 at 11:05
  • 1
    @gung-ReinstateMonica Thanks! I have updated the question with example outputs. I want to report the significance of the interaction, and yes, to convey the meaning at a conference and within a report, a presentation to other scientists doing this same research. Aswell as stating the significance of the interaction, I want to be able to look at comparisons between the different levels of each variable, and so for example say, 'summer at one site had a higher proportion of 'presence' than winter'. Does that make sense? – Louise Wilson Jun 20 '21 at 06:19

2 Answers2

1

You have all categorical variables, and they are all nested within the top level, three-way interaction. Make sure you understand how categorical variables are coded by default and what the standard output means (see: here and here). In R, the output from summary(model) will differ from anova(model). The latter will output a sequential (Type I, see: here) analysis of deviance table (since you have a GLM). In general, p-values will differ between these two (see: here), although in your case the summary output doesn't actually include the test of any of the variables when taken as a whole (again, read up on how variables are coded and presented). In general, drop1(model, test="LRT") will better accord with people's intuitions, but since you only have categorical variables all nested within a three-way interaction, drop1 will only show the p-value for the interaction, which will be the same as listed in the output from anova. As it happens, that's really the only p-value you need. Any other p-values wouldn't really mean what people think they mean.

After that, you have 3x3x4=36 combinations of levels. I would simply make a table with the predicted means, or represent the means with a plot (like a dotplot or a barplot). You'll want to order the cells / dots / bars in such a way to allow you to conveniently point out the comparisons of interest.

  • 1
    Thank you! Turns out the p-value and deviance output from anova(all.fit,test = "Chisq") is the same as LRT output of drop1(model, test="LRT"). Should I report this as (LRT 36, 250944 = 976.67, p = < 0.005) ? – Louise Wilson Jun 22 '21 at 03:21
  • 1
    @LouiseWilson, in R "LRT" & "Chisq" are synonyms in drop1() & anova(). I would report it as, ($\chi^2=981.0$, $df=36$, $p<0.0001$. – gung - Reinstate Monica Jun 22 '21 at 11:25
0

If you want to hold on to your current presentation, it is necessary to include at least the third interaction term. If you cannot reduce that, it will not make sense to reduce any of the lower level interaction terms. To test the significance of time, you can remove time completely from from the model, and then test from one model to the other. Similar for season and site.

If you variables time, site, season are all discreate, you can combine them in a new variables and use anova, to show that not all groups can be considered as one. You can then test for reduction to a model without time, without site, and without season, to consider them separately. I am aware that this might not be exactly the answer you are looking for, but I think this is the simplest solution for interpretation.

I hope it answers your question.

Kirsten
  • 469