Suppose I run bidirectional stepwise selection in R on the model:
step(glm(y ~ a + b + c + d, family = poisson), direction = "both")
And suppose the result is:
y ~ d + c
null deviance: 263.6
100 residual degrees of freedom
residual deviance: 132.9
AIC: 648.3
I read that if you run the line:
1-pchisq(residual deviance, residual df)
and the result is "significant" (below 0.05), you need a better model.
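As a minimal sketch, here is that check with the numbers from the output above plugged in. The variable names are only for illustration, and the test assumes the residual deviance is approximately chi-squared distributed, which may not hold well for small counts.

```r
# Values copied from the stepwise output quoted above
resid_dev <- 132.9   # residual deviance
resid_df  <- 100     # residual degrees of freedom

# Upper-tail chi-squared probability: a small value suggests lack of fit
p_value <- 1 - pchisq(resid_dev, resid_df)
print(p_value)

# TRUE here, so by the rule I read, this model would "need improvement"
print(p_value < 0.05)
```

For these numbers the p-value falls between 0.01 and 0.05, so the selected model would count as "significant" lack of fit under that rule.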
But if step() chooses the best model using the Akaike criterion, does that mean I can't get a better model? What if I don't have any other variables or combinations of variables?
Is the "best" model chosen by stepwise selection not necessarily a good model? How can I know this?
Maybe this is a very basic question, but I don't get it. Can anyone help me understand the basics of this?
Stepwise does not necessarily give the "best" model for any definition of "best"; all subsets regression will give it for some definition of "best". But substantive knowledge is always key, and there may not be any one best model.
– Peter Flom Aug 28 '12 at 10:21
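A rough sketch of the distinction discussed here, using simulated data (the data frame `dat`, the coefficients, and the negative binomial setup are all assumptions made for illustration, not anything from the question). AIC only ranks the candidate models against each other; it says nothing about whether the winner fits well in absolute terms. Here the counts are overdispersed, so every Poisson candidate is misspecified, yet step() still returns a "best" one.

```r
set.seed(1)
n <- 200
dat <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))

# Overdispersed counts (negative binomial), so the Poisson assumption
# is wrong for every candidate model, whatever predictors it contains
dat$y <- rnbinom(n, mu = exp(0.5 + 0.4 * dat$c + 0.6 * dat$d), size = 1)

full <- glm(y ~ a + b + c + d, family = poisson, data = dat)

# With no scope argument, step() defaults to backward elimination by AIC
sel <- step(full, trace = 0)

# Relative comparison: the selected model has AIC no higher than the full one
print(c(full = AIC(full), selected = AIC(sel)))

# Absolute check: residual deviance against its degrees of freedom
p_gof <- pchisq(deviance(sel), df.residual(sel), lower.tail = FALSE)
print(p_gof)
```

In a case like this, the remedy would not be a different subset of predictors but a different model family (for example quasi-Poisson or negative binomial), which stepwise selection over a, b, c and d cannot find.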