WHy is the over dispersion in this poisson and quasi-poisson the same?

Question

I have a zero inflated count data, on which I have run a poisson and quasi poisson reg using glm().

The output from a poisson model is as follows:

Call:
glm(formula = sum_count ~ location * treatment, family = poisson, 
    data = dat)
Deviance Residuals: 
    Min       1Q   Median       3Q      Max

-32.297  -21.360  -12.476    5.448   80.361
Coefficients:
                              Estimate Std. Error z value Pr(>|z|)

(Intercept)                    5.75215    0.01455  395.31   <2e-16 ***
locationNorth                 -0.30685    0.02235  -13.73   <2e-16 ***
treatmentclosed               -1.14865    0.03235  -35.51   <2e-16 ***
treatmentday                  -0.32222    0.02245  -14.35   <2e-16 ***
treatmentnight                 0.34802    0.01901   18.31   <2e-16 ***
locationNorth:treatmentclosed -0.73419    0.06081  -12.07   <2e-16 ***
locationNorth:treatmentday     1.13369    0.03032   37.39   <2e-16 ***
locationNorth:treatmentnight  -1.37528    0.03812  -36.08   <2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 67655  on 113  degrees of freedom

Residual deviance: 54990  on 106  degrees of freedom
AIC: 55474
Number of Fisher Scoring iterations: 7

The output from the quasipoisson model is as follows:

Call:
glm(formula = sum_count ~ location * treatment, family = quasipoisson, 
    data = dat)
Deviance Residuals: 
    Min       1Q   Median       3Q      Max

-32.297  -21.360  -12.476    5.448   80.361
Coefficients:
                              Estimate Std. Error t value Pr(>|t|)

(Intercept)                     5.7521     0.3753  15.327   <2e-16 ***
locationNorth                  -0.3068     0.5764  -0.532    0.596

treatmentclosed                -1.1486     0.8343  -1.377    0.171

treatmentday                   -0.3222     0.5790  -0.557    0.579

treatmentnight                  0.3480     0.4902   0.710    0.479

locationNorth:treatmentclosed  -0.7342     1.5685  -0.468    0.641

locationNorth:treatmentday      1.1337     0.7821   1.450    0.150

locationNorth:treatmentnight   -1.3753     0.9831  -1.399    0.165

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for quasipoisson family taken to be 665.2146)
Null deviance: 67655  on 113  degrees of freedom

Residual deviance: 54990  on 106  degrees of freedom
AIC: NA
Number of Fisher Scoring iterations: 7

My question is: (given my rudimentary understanding of count models), why is the over dispersion parameter for both models the same?

glm1$deviance/glm1$df.residual # estimating the dispersion
[1] 518.7702
glm1.q$deviance/glm1.q$df.residual # estimating the dispersion
[1] 518.7702

Any insight is highly welcome. Thank you!

score 2 · Accepted Answer · answered Sep 16 '22 at 22:43

tl;dr there's no need to check for over- (or under-)dispersion if you're already adjusting for overdispersion by fitting a quasi-likelihood model.

The deviance calculated in a quasi-Poisson model is not adjusted. The only place the quasi-Poisson model differs from the Poisson is in the output of the estimated dispersion and in the standard errors (which are adjusted based on the dispersion).

Here is the relevant code from summary.glm():

 if (is.null(dispersion)) 
        dispersion <- if (object$family$family %in% c("poisson", 
            "binomial")) 
            1
        else if (df.r > 0) {
            est.disp <- TRUE
            if (any(object$weights == 0)) 
                warning("observations with zero weight not used for calculating dispersion")
            sum((object$weights * object$residuals^2)[object$weights > 
                0])/df.r
        }

Ignoring the weights, you can see that the dispersion is defined as sum(object$residuals^2)/object$df.residual. If the stored residuals ($residuals) were deviance residuals, then their sum of squares would be exactly equal to the deviance. They're working residuals instead, which means that this computed dispersion won't be exactly the same as (deviance/df.residual), but for reasonably large data sets it should be very close. (See here for more on residual types.)

WHy is the over dispersion in this poisson and quasi-poisson the same?

1 Answers1