Using step-wise model reduction I have reduced my full model down (based on AIC and model comparisons using the anova() function in R). This resulted in the following model:
m1_DD4 <- glm(GotoPB ~ Pb_type + ZF_Pb + ZF_NotPb, data = DF_DD_5, family = binomial(link = "logit"))
Pb_type is a factor with 4 levels; ZF_Pb and ZF_NotPb are numeric.
The summary of this model looks as follows:
Call:
glm(formula = GotoPB ~ Pb_type + ZF_Pb + ZF_NotPb, family = binomial(link = "logit"),
data = DF_DD_5)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.15642 -1.07365 -0.00006 1.02744 1.56573
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.013410 0.136495 0.098 0.92174
Pb_typeNG 0.152678 0.163990 0.931 0.35184
Pb_typeSilence -20.152316 452.945271 -0.044 0.96451
Pb_typeSong 0.221061 0.156019 1.417 0.15652
ZF_Pb 0.079849 0.013634 5.857 4.72e-09 ***
ZF_NotPb -0.027021 0.007097 -3.807 0.00014 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 2047.0 on 1559 degrees of freedom
Residual deviance: 1385.9 on 1554 degrees of freedom
AIC: 1397.9
Number of Fisher Scoring iterations: 18
From the model reduction/AIC I conclude that the model fit improves when Pb_type is included, so it should be kept. Yet here all of its levels are shown as non-significant. From what I understand/have read, this is because the p-values reported here come from Wald tests, which test whether each individual coefficient differs from 0, whereas the anova() function uses likelihood ratio tests, which indicate which model has a statistically better fit. Please correct me if I am wrong here.
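For reference, here is a sketch of how I understand the likelihood ratio test for Pb_type can be reproduced (assuming the model object m1_DD4 above; this is the same kind of nested-model comparison that anova() performs):

# Likelihood ratio test for dropping each term from the model
drop1(m1_DD4, test = "Chisq")

# Equivalently, an explicit comparison of nested models
m1_noPb <- update(m1_DD4, . ~ . - Pb_type)
anova(m1_noPb, m1_DD4, test = "Chisq")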
I have also been told not to worry about this too much and to rely on the outcome of anova(). But I would like to understand why.
Under the assumption that Pb_type indeed matters, I now wish to determine how the different levels of Pb_type differ from each other using a post-hoc pairwise comparison. I do this with the emmeans package, via the following call:
emmeans::emmeans(m1_DD4, pairwise ~ Pb_type, adjust = "Tukey")
Which results in the following output:
$emmeans
Pb_type emmean SE df asymp.LCL asymp.UCL
DC 0.128 0.120 Inf -1.72e-01 0.427
NG 0.280 0.115 Inf -5.30e-03 0.566
Silence -20.025 452.945 Inf -1.15e+03 1108.223
Song 0.349 0.104 Inf 8.96e-02 0.608
Results are given on the logit (not the response) scale.
Confidence level used: 0.95
Conf-level adjustment: sidak method for 4 estimates
$contrasts
contrast estimate SE df z.ratio p.value
DC - NG -0.1527 0.164 Inf -0.931 0.7882
DC - Silence 20.1523 452.945 Inf 0.044 1.0000
DC - Song -0.2211 0.156 Inf -1.417 0.4887
NG - Silence 20.3050 452.945 Inf 0.045 1.0000
NG - Song -0.0684 0.152 Inf -0.449 0.9698
Silence - Song -20.3734 452.945 Inf -0.045 1.0000
Results are given on the log odds ratio (not the response) scale.
P value adjustment: tukey method for comparing a family of 4 estimates
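As an aside, I believe the same call with type = "response" added would back-transform the estimated marginal means to probabilities and report the contrasts as odds ratios; I show the logit scale here since that is the scale the model works on:

# Same comparison, back-transformed to the response (probability) scale
emmeans::emmeans(m1_DD4, pairwise ~ Pb_type, adjust = "tukey", type = "response")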
Here a few things stand out:
- My df = Inf. From what I have read, I understand that this presents no issue and is simply how emmeans labels asymptotic results.
- In both my original model summary and here, Silence has a huge standard error.
- None of the comparisons are significant.
This leaves me with 2 questions:
- Why is my standard error so high, and does this present a problem? To answer this it might be necessary to know more about my data, so in short: all Pb_type treatments are different playbacks, and silence served as a control treatment. The response variable GotoPB is either Y/N and represents whether individuals approached the playback (Y) or not (N). In the case of silence, as there never truly was a playback, all values for GotoPB are N (see the cross-tabulation sketch after this list).
- How can none of the comparisons of Pb_type be significant, yet including it in the model result in a significantly better fit (with a much lower AIC)? Is something wrong here? And if not, how should I interpret this?
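For the first question, the all-N pattern for Silence that I describe can be seen with a simple cross-tabulation (a sketch, assuming the data frame and column names above):

# Cross-tabulate treatment against the response; the Silence row should be all N
with(DF_DD_5, table(Pb_type, GotoPB))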
Kind regards.