
A GLM where the response is Poisson distributed is tested using analysis of deviance. In R the model looks like this:

glm(Y ~ A + B + C + A:B + A:C + B:C, family = poisson, data = data)

If I use the anova()-function on this model, which part of the output can be used to test the interaction effect of B:C at significance level $\alpha$? And how can I test the goodness-of-fit of the model

A + B + C + A:B + A:C

at significance level $\alpha$?


1 Answer


You have to be careful when you say "anova() function," as even in R that can have different meanings depending on the type of model and package.

For your evaluation of the single B:C interaction coefficient, analysis of deviance would best be a nested comparison of the first model you show, which includes that interaction term, against the second model with the same terms but the B:C interaction omitted. You then evaluate the p-value against your pre-specified $\alpha$ cutoff.
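A minimal sketch of that nested comparison, assuming the variable names from your question (the object names fit_full and fit_reduced are hypothetical); specifying test = "Chisq" requests the likelihood-ratio (deviance) p-value:

fit_full    <- glm(Y ~ A + B + C + A:B + A:C + B:C, family = poisson, data = data)
fit_reduced <- glm(Y ~ A + B + C + A:B + A:C, family = poisson, data = data)
anova(fit_reduced, fit_full, test = "Chisq")  # p-value for the B:C interaction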

According to the help page for anova.glm(), if you instead specify a single model then you get a sequential term-by-term analysis. That might lead to different apparent "significance" results if you change the order of predictors in the model.
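For example, with the hypothetical fit_full object above, a single-model call gives that sequential table; the row for B:C, being the last term in the formula, matches the nested comparison, while the earlier rows depend on the order in which the terms were entered:

anova(fit_full, test = "Chisq")  # sequential analysis of deviance, term by term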

The second part of the question is harder. You can compare different models with respect to goodness of fit, adjusted for the number of predictors in the model. In your example, the nested anova() just described can tell you whether adding the B:C interaction improves the fit enough over the reduced model to justify including that interaction. The p-value serves that purpose "at significance level $\alpha$."

If the models being compared don't involve nested sets of predictors, you can't use anova() for comparisons. Some suggest using measures like the Akaike Information Criterion (AIC) in that case, but that's not universally accepted and there isn't a "significance level" for it.
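A hedged sketch of such a comparison, with purely illustrative (non-nested) model formulas:

fit_ab <- glm(Y ~ A + B, family = poisson, data = data)
fit_ac <- glm(Y ~ A + C, family = poisson, data = data)
AIC(fit_ab, fit_ac)  # lower AIC suggests a better penalized fit; no p-value involved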

A general goodness-of-fit test is to evaluate how well the modeling process works on multiple bootstrapped samples of the data. Again, though, there's no "significance level" for that. You have to gauge, based on your understanding of the subject matter, whether the model is good enough for your purposes.
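One rough sketch of that idea, refitting the model on bootstrap resamples and checking how well each refit predicts the original data (the pois_dev helper and the number of resamples are illustrative assumptions, not a standard recipe):

set.seed(1)
pois_dev <- function(y, mu) 2 * sum(ifelse(y == 0, 0, y * log(y / mu)) - (y - mu))  # Poisson deviance
boot_dev <- replicate(200, {
  idx   <- sample(nrow(data), replace = TRUE)          # bootstrap resample of rows
  fit_b <- glm(Y ~ A + B + C + A:B + A:C + B:C, family = poisson, data = data[idx, ])
  pois_dev(data$Y, predict(fit_b, newdata = data, type = "response"))  # deviance on original data
})
summary(boot_dev)  # the spread indicates how stable the modeling process is across resamples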
