
Related to glm() in R, I saw a few posts recommending modeling underdispersed data with the Conway–Maxwell–Poisson distribution, specifically with the R package CompGLM. However, I have not seen anybody confirm that the quasi-Poisson cannot be used. So I ask: why not use quasi-Poisson in glm for underdispersed data? After all, isn't the idea of quasi-Poisson to go beyond the assumption that the variance and mean are equal (and in the case of underdispersion, they are not equal)?

Basically, I am running a glm(y ~ x, family=poisson) where x is a categorical variable and I am getting

Null deviance: 67.905  on 519  degrees of freedom
Residual deviance: 59.584  on 507  degrees of freedom 

The residual deviance is far below its degrees of freedom, which strongly suggests underdispersion, so I am leaning towards a quasi-Poisson solution.

Al3xEP
    The quasi-Poisson can definitely be used. I don't know enough about COM-Poisson to say whether, or under what conditions, it would be better. – mkt Sep 16 '19 at 05:46

1 Answer


Quasi-likelihood theory is as valid with underdispersed data as it is with overdispersed data, so you could just go that way.
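As a minimal sketch of what "going that way" could look like, the following simulates underdispersed counts and compares a Poisson fit with a quasi-Poisson fit. The data are not the asker's: the binomial generator, the 13 factor levels (chosen only so that the residual degrees of freedom come out at 507), and all object names are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical data: 520 observations, one categorical predictor with 13 levels
n_levels <- 13
n <- 520
x <- factor(rep(seq_len(n_levels), length.out = n))

# Binomial counts have variance/mean = 1 - p < 1, so they are
# underdispersed relative to a Poisson with the same mean
p <- seq(0.5, 0.9, length.out = n_levels)
y <- rbinom(n, size = 10, prob = p[as.integer(x)])

# Plain Poisson fit, then the Pearson estimate of the dispersion parameter
fit_pois <- glm(y ~ x, family = poisson)
phi_hat <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# Quasi-Poisson fit: same coefficients, standard errors scaled by sqrt(phi_hat)
fit_quasi <- glm(y ~ x, family = quasipoisson)

se_pois  <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]
se_ratio <- unname(se_quasi / se_pois)

print(phi_hat)                       # below 1 for underdispersed data
print(summary(fit_quasi)$dispersion) # the same Pearson estimate
```

The point estimates do not change between the two fits. Only the standard errors do, and with an estimated dispersion below 1 they become smaller than the Poisson ones, which is why the cause of the underdispersion deserves some thought before trusting the tighter intervals.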

But I would be careful: context matters a lot. While overdispersion is quite common and is easily explained by simple mechanisms, that is not the case with underdispersion! For instance, extra, unmodeled (or unobserved) variation or inhomogeneity leads to overdispersion, but can never produce underdispersion. Causes of underdispersion are more difficult to come by, and they usually have to do with a lack of independence. For one example, see Causes for Underdispersion in Poisson Regression. One common cause of lack of independence is competition; an example I just came across is counts of territorial birds (that was from my daughter's master's thesis in ecology)!
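The claim about unobserved variation follows from the law of total variance: if Y given λ is Poisson(λ) and λ itself varies, then Var(Y) = E[λ] + Var(λ), which can never be smaller than E[Y] = E[λ]. A small simulation sketch illustrates this. The gamma distribution for the rates and all object names are assumptions chosen for illustration, not something taken from the question.

```r
set.seed(42)
n <- 1e5

# Unobserved heterogeneity: each observation has its own Poisson rate,
# drawn from a gamma distribution with mean 4 and variance 8
lambda <- rgamma(n, shape = 2, rate = 0.5)
y_mixed <- rpois(n, lambda)

# Homogeneous reference: every observation shares the rate 4
y_pure <- rpois(n, 4)

# Variance-to-mean ratios: theory gives 1 + 8/4 = 3 for the mixture
# and 1 for the homogeneous Poisson
ratio_mixed <- var(y_mixed) / mean(y_mixed)
ratio_pure  <- var(y_pure) / mean(y_pure)

print(c(mixed = ratio_mixed, pure = ratio_pure))
```

Whatever distribution is used for the rates, the mixture can only push the ratio above 1, so a ratio clearly below 1 has to come from some other mechanism, such as dependence between the events being counted.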

Some posts dealing with practical matters when modeling underdispersed data are