
I am working with a zero-inflated negative binomial model in R, using the glmmTMB package. My main goal is to investigate whether there is a significant difference in the number of times a grassland field was visited by birds (response variable) between the stages of a mowing procedure (treatment: pre-, during- or post-procedure, each stage directly following the previous one).

A simple model with only these terms (below) showed significant differences:

zinb_simple <- glmmTMB(
  response ~ treatment
  + (1 | field_id)
  + (1 | bird_id),
  data = df_for_analysis,
  family = nbinom2,
  ziformula = ~.,
  offset = offset,
  control = glmmTMBControl(
    rank_check = "adjust"
  )
)

I am now investigating whether any additional parameters influence the birds' response to the treatment. I am using three, and all of them show the behaviour in question: period_cut (when in the year the treatment happened: early, mid or late), field_size (size of the field, in m²) and rain (rainfall in mm during the three stages of treatment). Because the focus of the analysis is on the interaction between a parameter and the treatment, I first wrote a model with parameter : treatment:

zinb_full <- glmmTMB(
  response ~ treatment
  + period_cut: treatment
  + field_size: treatment
  + rain: treatment
  + (1 | field_id),
  data = df_for_analysis,
  family = nbinom2,
  ziformula = ~.
  + (1 | bird_id),
  offset = offset,
  control = glmmTMBControl(
    rank_check = "adjust"
  )
)

bird_id is now only present in the zero-inflation part because the model did not converge when bird_id was also in the conditional part. The variance of bird_id in the conditional part was very small (on the order of 1e-08).

This model gave me significant results for some levels within the parameters (only period_cut is shown, for brevity):

Conditional model:
                                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                                4.255e-01  3.867e-01   1.100 0.271205    
treatmentduring_treatment                  6.215e-01  4.099e-01   1.516 0.129520
treatmentpost_treatment                    1.275e+00  4.209e-01   3.031 0.002441 **
treatmentpre_treatment:period_cutearly     -9.418e-01  6.638e-01  -1.419 0.155952
treatmentduring_treatment:period_cutearly  -1.162e+00  4.043e-01  -2.873 0.004068 ** 
treatmentpost_treatment:period_cutearly    -1.184e+00  4.092e-01  -2.893 0.003813 **
treatmentpre_treatment:period_cutlate      -9.891e-01  1.161e+00  -0.852 0.394145

Zero-inflation model:
                                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                9.951e-01  4.884e-01   2.038  0.04159 *
treatmentduring_treatment                 -3.600e+00  8.301e-01  -4.336 1.45e-05 ***
treatmentpost_treatment                   -3.161e+00  6.779e-01  -4.663 3.12e-06 ***
treatmentpre_treatment:period_cutearly     5.187e-02  7.610e-01   0.068  0.94566
treatmentduring_treatment:period_cutearly  5.120e-01  1.643e+00   0.312  0.75531
treatmentpost_treatment:period_cutearly    5.526e-01  8.171e-01   0.676  0.49886
treatmentpre_treatment:period_cutlate      1.391e+00  1.004e+00   1.386  0.16580

However, all of this significance disappears if I use parameter * treatment. The only term that remains significant is the treatment main effect:

Conditional model:
                                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                 4.255e-01  3.867e-01   1.100  0.27121
treatmentduring_treatment                   6.215e-01  4.099e-01   1.516  0.12952
treatmentpost_treatment                     1.275e+00  4.209e-01   3.031  0.00244 **
period_cutearly                            -9.418e-01  6.638e-01  -1.419  0.15594
period_cutlate                             -9.890e-01  1.161e+00  -0.852  0.39419  
treatmentduring_treatment:period_cutearly  -2.197e-01  7.286e-01  -0.302  0.76296
treatmentpost_treatment:period_cutearly    -2.422e-01  7.575e-01  -0.320  0.74913
treatmentduring_treatment:period_cutlate    1.486e+00  1.186e+00   1.253  0.21027   
treatmentpost_treatment:period_cutlate      1.417e+00  1.196e+00   1.185  0.23618

Zero-inflation model:
                                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                9.951e-01  4.884e-01   2.038   0.0416 *
treatmentduring_treatment                 -3.600e+00  8.301e-01  -4.336 1.45e-05 ***
treatmentpost_treatment                   -3.161e+00  6.779e-01  -4.663 3.12e-06 ***
period_cutearly                            5.181e-02  7.610e-01   0.068   0.9457
period_cutlate                             1.391e+00  1.004e+00   1.386   0.1658
treatmentduring_treatment:period_cutearly  4.604e-01  1.721e+00   0.268   0.7890
treatmentpost_treatment:period_cutearly    5.008e-01  1.042e+00   0.481   0.6308
treatmentduring_treatment:period_cutlate   6.404e-01  1.204e+00   0.532   0.5949
treatmentpost_treatment:period_cutlate    -6.233e-01  1.159e+00  -0.538   0.5908
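
For reference, the columns that each of the two specifications produces can be listed on a toy grid. This is only a sketch: `toy` is a hypothetical data frame that contains the factor levels seen in my output but no real observations, and I am assuming that pre_treatment and mid are the reference levels, since they are the ones missing from the main-effect coefficient names.

```r
# Toy grid: one row per treatment x period_cut combination (no real data).
toy <- expand.grid(
  treatment  = factor(c("pre_treatment", "during_treatment", "post_treatment"),
                      levels = c("pre_treatment", "during_treatment", "post_treatment")),
  period_cut = factor(c("mid", "early", "late"),
                      levels = c("mid", "early", "late"))
)

# Design matrix for the "parameter : treatment" style (no period_cut main effect).
X_colon <- model.matrix(~ treatment + treatment:period_cut, data = toy)

# Design matrix for the "parameter * treatment" style (main effects plus interaction).
X_star <- model.matrix(~ treatment * period_cut, data = toy)

# Coefficient names each specification would estimate.
print(colnames(X_colon))
print(colnames(X_star))
```

The printed names can be compared line by line with the coefficient tables above to see which quantity each row of the output refers to.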

I tried several seeds but have not been able to generate a working minimal reproducible example. It may be that my dataset is already small, so shrinking it further leads to convergence issues. The table below is just an example of what the values look like:

   bird_id  field_id response treatment period_cut   field_size  rain   offset
   <fct>    <fct>    <dbl>   <fct>      <fct>          <dbl>    <dbl>   <dbl>
 1 koemo3   KM142        0   post_tre…  mid            -19241.  -1.58     1.95
 2 sophie   AL839        0   post_tre…  mid            -18716.   5.61     1.39
 3 nume7    AL539        3   during_t…  early           20483.  -0.148    1.95
 4 sophie   AL1267       0   post_tre…  mid            -18479.  -0.0342   1.95
 5 koemo3   KM88        27   during_t…  mid            -13578.  -1.58     1.95
 6 koemo3   KM585        3   post_tre…  mid              2811.  -1.58     1.95
 7 koemo3   KM652        0   during_t…  mid             16366.  -1.58     1.95
 8 koemo3   KM184        0   post_tre…  late            -9154.   0.0230   1.61
 9 koemo4   KM492        6   post_tre…  mid              5483.  -1.58     1.95
10 wiesmet1 AL834        0   post_tre…  mid            -20968.  -1.58     1.95

Continuous variables have been centered and categorical variables have been converted to factors. The offset is already on the log scale.
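
A minimal sketch of that kind of preprocessing, on made-up values. The `exposure` column and all numbers are hypothetical placeholders, not taken from my data:

```r
# Made-up rows standing in for df_for_analysis (values are illustrative only).
toy <- data.frame(
  treatment  = c("pre_treatment", "during_treatment", "post_treatment", "post_treatment"),
  field_size = c(12000, 45000, 30000, 8000),  # hypothetical field sizes in m2
  exposure   = c(7, 7, 4, 5)                  # hypothetical exposure per row
)

# Convert the categorical variable to a factor, with pre_treatment as reference level.
toy$treatment <- factor(
  toy$treatment,
  levels = c("pre_treatment", "during_treatment", "post_treatment")
)

# Centre the continuous covariate: subtract the mean, no rescaling.
toy$field_size <- toy$field_size - mean(toy$field_size)

# Put the offset on the log scale, as expected by a log-link count model.
toy$offset <- log(toy$exposure)

print(toy)
```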

Statistics is not my field. I learnt GLMMs about a month ago and my biostatistics professor doesn't work with GLMMs. His guess is that this may be happening because both treatment and period_cut are temporal variables, but that doesn't explain why field_size * treatment behaves the same way. Furthermore, period_cut refers to a period in the year and the treatment happens "within" period_cut, so in a way these temporalities are different. I cannot see a reason for a relationship between the parameters.

Since I cannot offer a reproducible example, could someone offer a theoretical explanation for the observed behaviour? Why does the significance of parameter : treatment disappear with parameter * treatment?
