20

I've often seen the advice that you can check whether a Poisson model fit is over-dispersed by dividing the residual deviance by the degrees of freedom. The resulting ratio should be "approximately 1".

The question is what range counts as "approximately" - what ratio should set off alarms and prompt a look at alternative model forms?

StasK
Fomite
  • Not an answer to this interesting question, but what I will often do is run several models (e.g. Poisson, NB, maybe zero-inflated versions) and compare them - both on AIC-type measures and on predicted values. – Peter Flom Sep 21 '12 at 10:36
  • This link might be of interest, especially the section "Criteria For Assessing Goodness Of Fit". –  Sep 21 '12 at 14:57
  • @Procrastinator The link is a perfect example of what I'm talking about: "Then, if our model fits the data well, the ratio of the Deviance to DF, Value/DF, should be about one. Large ratio values may indicate model misspecification or an over-dispersed response variable; ratios less than one may also indicate model misspecification or an under-dispersed response variable." What's the range of "about 1"? 0.99 to 1.01? 0.75 to 2? – Fomite Sep 21 '12 at 19:41
  • https://www.r-bloggers.com/count-data-and-glms-choosing-among-poisson-negative-binomial-and-zero-inflated-models/ also has some information about how to answer this question, though @StasK's response covers it well enough. – flies Dec 09 '16 at 16:02

2 Answers

14

10 is large... 1.01 is not. Since the variance of a $\chi^2_k$ is $2k$ (see Wikipedia), the standard deviation of a $\chi^2_k$ is $\sqrt{2k}$, and $\chi^2_k/k$ has mean 1 and standard deviation $\sqrt{2/k}$. That's your measuring stick: for $\chi^2_{100}$, 1.01 is not large, but 2 is large (7 s.d.s away from 1). For $\chi^2_{10,000}$, 1.01 is OK, but 1.1 is not (again 7 s.d.s away).
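To make the measuring stick concrete, here is a minimal Python sketch. It assumes the residual deviance is approximately $\chi^2_k$ with $k$ equal to the residual degrees of freedom; the function name `dispersion_z` is just an illustrative choice, not something from any library.

```python
import math


def dispersion_z(deviance: float, df: int) -> float:
    """How many standard deviations deviance/df sits above 1.

    Assumes deviance ~ chi-square with `df` degrees of freedom, so that
    deviance/df has mean 1 and standard deviation sqrt(2/df).
    """
    ratio = deviance / df
    sd = math.sqrt(2.0 / df)  # s.d. of chi2_k / k
    return (ratio - 1.0) / sd


# The four cases discussed above: (degrees of freedom, deviance/df ratio).
for df, ratio in [(100, 1.01), (100, 2.0), (10_000, 1.01), (10_000, 1.1)]:
    z = dispersion_z(ratio * df, df)
    print(f"df={df:>6}  ratio={ratio:<5}  z={z:.2f}")
```

The same ratio of 1.1 that is unremarkable at 100 degrees of freedom is far out in the tail at 10,000, which is why a single cutoff for "about 1" does not exist.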

StasK
  • "so $\chi^2_k/k$ has a standard deviation of $\sqrt{2/k}$" can you direct me to somewhere that demonstrates this please? – baxx Mar 24 '19 at 19:18
  • https://www.amazon.com/Encyclopedia-Statistical-Sciences-Applications-Statistics/dp/0471150444. Sorry to be an asshole, but that's a reference distribution in statistical inference; if you don't understand it, you should not be working with generalized linear models such as Poisson. – StasK Apr 05 '19 at 17:07
  • For future reference you can, instead of the prefix / apology about being an asshole thing, just state the information and a reference. It would probably save you typing, and make you appear less of an asshole, which might be a novel experience. – baxx Apr 05 '19 at 17:11
  • See edit and the wikipedia reference. I have volunteered a few hundred answers over a few years, so I admit it is a bit difficult for me to have a really novel experience. – StasK Apr 05 '19 at 17:28
8

Asymptotically, the deviance should be chi-square distributed with mean equal to the degrees of freedom. So divide it by its degrees of freedom & you should get about 1 if the data are not over-dispersed. To get a proper test, just look up the deviance in chi-square tables - but note (a) that the chi-square distribution is an approximation & (b) that a high value can indicate other kinds of lack of fit (which is perhaps why 'around 1' is considered good enough for government work).
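If you don't have tables to hand, the lookup can be sketched in a few lines of Python. This is only a rough stand-in: it uses the Wilson-Hilferty cube-root normal approximation to the chi-square upper tail instead of an exact table value, and the function name `chi2_upper_tail_wh` is made up for illustration. With SciPy installed, `scipy.stats.chi2.sf(deviance, df)` gives the exact tail probability.

```python
import math


def chi2_upper_tail_wh(x: float, k: int) -> float:
    """Approximate P(chi2_k > x) with the Wilson-Hilferty approximation.

    (X/k)**(1/3) is treated as normal with mean 1 - 2/(9k) and
    variance 2/(9k); this is an approximation, not an exact tail area.
    """
    mean = 1.0 - 2.0 / (9.0 * k)
    sd = math.sqrt(2.0 / (9.0 * k))
    z = ((x / k) ** (1.0 / 3.0) - mean) / sd
    # Upper tail of the standard normal, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical example: residual deviance of 130 on 100 degrees of freedom.
p = chi2_upper_tail_wh(130.0, 100)
print(f"approximate p-value: {p:.4f}")
```

A small p-value says the deviance is larger than the chi-square reference would suggest, but per caveat (b) it does not say whether over-dispersion or some other lack of fit is responsible.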