I have a general question about using deviance as a measure of fit in generalized linear models (e.g. multinomial, Poisson, etc.). I think I've gotten lost in all the equations and have missed the point somewhere.

Deviance measures the difference between the fit of the saturated model and that of the suggested model. It can be thought of as analogous to residuals: larger residuals mean the model is having a harder time fitting the data. As I understand it, high deviance therefore means a bad fit, which is why the null deviance is always higher than the deviance of the model. So high deviance = bad fit.

However, in many cases the deviance also approximately follows a chi-squared distribution with degrees of freedom equal to the df of the model. In that case it seems to me that high deviance = good fit, since it would give a low p-value and lead you to conclude that the model contains information, i.e. is a good fit.

What am I missing here?

  • I don't think the deviance has a $\chi^2$ distribution. All I've read is that the difference in deviance between two nested models has a $\chi^2$ distribution. In that case, your reasoning makes sense. A high value of the difference in deviance means that the more complex model has much less deviance from perfection than the less complex model. Where did you read that the deviance of a model follows a $\chi^2$ distribution? – Dave Feb 28 '20 at 19:31
  • This post may help you understand the precise definitions of deviance vs. deviance difference and clear up the confusion. – Zhanxiong Mar 26 '24 at 20:12

1 Answer

What you are missing here is the hypothesis being tested. We have:

$$H_0:\beta_{p+1} = \beta_{p+2} = \dots = \beta_{q} = 0$$ $$\text{vs}$$ $$H_1:\beta_i \neq 0 \text{ for some } i \in \{p+1,\dots,q\}$$

where $q$ is the number of parameters in the saturated model and $p$ is the number of parameters in the suggested model.
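To make the test concrete, the deviance can be written as a likelihood-ratio statistic. This is only a sketch: the symbols $D$, $\ell_{\text{sat}}$ and $\ell_{\text{sug}}$ (the maximized log-likelihoods of the saturated and suggested models) are notation introduced here for illustration, and the $\chi^2$ reference distribution is a large-sample approximation that holds under $H_0$ only when the usual regularity conditions are met:

$$D = 2\left(\ell_{\text{sat}} - \ell_{\text{sug}}\right) \;\overset{\text{approx.}}{\sim}\; \chi^2_{q-p} \quad \text{under } H_0$$

So the p-value is the upper tail probability $P(\chi^2_{q-p} \geq D)$, and it shrinks as $D$ grows.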

As you rightly point out, high deviance gives a low p-value, so you reject the null hypothesis. That is, you reject the hypothesis that all the parameters you removed from the saturated model to obtain your suggested model have zero coefficients. In other words, you are rejecting the suggested model.

Conversely, low deviance means you are unable to reject the suggested model.

This fits your observation that high deviance means a bad fit for the suggested model, as we are in fact rejecting the suggested model.
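As a rough numerical illustration of the above (not part of any particular library), here is a minimal sketch that takes an intercept-only Poisson model as the "suggested" model, so $p = 1$ and the saturated model has $q = n$ parameters. The data sets and the helper names `poisson_deviance` and `chi2_sf` are made up for this example, and the chi-squared tail is computed with a simple series rather than a statistics package. Keep in mind that the chi-squared approximation for the deviance can be poor when counts are small.

```python
import math

def poisson_deviance(y, mu):
    """Deviance of a Poisson model with fitted means mu against the saturated model."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y*log(y/mu) is taken as 0 when y == 0
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (log_term - (yi - mi))
    return total

def chi2_sf(x, df):
    """Upper tail P(X >= x) for X ~ chi-squared(df), via a series for the
    regularized lower incomplete gamma function (illustrative, not optimized)."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    term, series = 1.0, 1.0
    for n in range(1, 1000):
        term *= z / (a + n)
        series += term
        if term < 1e-15 * series:
            break
    lower = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * series
    return max(0.0, 1.0 - lower)

def intercept_only_test(y):
    """Fit the intercept-only Poisson model (MLE of the mean is the sample mean)
    and test it against the saturated model, with df = q - p = n - 1."""
    mu_hat = sum(y) / len(y)
    dev = poisson_deviance(y, [mu_hat] * len(y))
    return dev, chi2_sf(dev, len(y) - 1)

# Hypothetical counts that a single common mean describes well
dev_good, p_good = intercept_only_test([4, 5, 6, 5, 4, 6, 5, 5])
# Hypothetical counts with a strong trend that a single mean cannot capture
dev_bad, p_bad = intercept_only_test([1, 2, 3, 5, 8, 12, 18, 25])

print(f"good fit: deviance = {dev_good:.2f}, p-value = {p_good:.3g}")
print(f"bad fit:  deviance = {dev_bad:.2f}, p-value = {p_bad:.3g}")
```

In the second data set the deviance is large and the p-value is small, so the intercept-only model is rejected; in the first, the deviance is small and there is no evidence against the suggested model.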

sunnydk