Bonferroni correction in generalised linear model

Question

I am running a generalised linear model in R. I have a single response variable and a maximum of 4 possible explanatory variables. I am adding each explanatory variable to the model sequentially, based on whether the coefficient is statistically significant.

If the coefficient for an explanatory variable is statistically significant at 0.05, the explanatory variable remains in the model. If the coefficient for an explanatory variable is NOT statistically significant at 0.05, the explanatory variable is removed from the model.

I am wondering if instead of using 0.05, I should be using a Bonferroni corrected P value? Should I use a threshold of 0.05/4 = 0.0125?

What is the goal of your analysis and why are you doing forward selection? — Macro, Jul 10 '12 at 18:01
If you are doing multiple testing and want to control the overall (so-called familywise) significance level, you should adjust. Bonferroni is one conservative way, but it is not the only way to make the adjustment. If you are just using an informal procedure for deciding when to remove a variable from the model, then you do not need to. But you don't have to use 0.05 either; people sometimes use 0.1 or 0.2. — Michael R. Chernick, Jul 10 '12 at 19:10
Related question: Is adjusting p-values in a multiple regression for multiple comparisons a good idea?. See this thread to get a feel of why stepwise variable selection might not necessarily be a good idea. — chl, Jul 10 '12 at 20:21

score 7 · Answer 1 · answered Dec 31 '23 at 10:11

When you test multiple hypotheses, the chance of a type I error increases. The Bonferroni correction is a conservative method to address this by adjusting the significance level. For 4 tests, you’d use 0.0125 (0.05/4) as your threshold instead of 0.05.
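As a minimal sketch of that arithmetic in R: the four p-values below are made up for illustration (they are not from the question), and the variable names are hypothetical. Comparing raw p-values with 0.05/4 and comparing Bonferroni-adjusted p-values (via base R's `p.adjust`) with 0.05 give the same decisions. Holm's method is shown only as one example of a less conservative alternative.

```r
# Hypothetical p-values for the four candidate coefficients (illustrative only)
p_raw <- c(x1 = 0.004, x2 = 0.020, x3 = 0.030, x4 = 0.400)
alpha <- 0.05
m <- length(p_raw)

# Option 1: compare the raw p-values with the corrected threshold alpha / m
keep_threshold <- p_raw < alpha / m

# Option 2: multiply the p-values by m (capped at 1) and compare with alpha
p_bonf <- p.adjust(p_raw, method = "bonferroni")
keep_adjusted <- p_bonf < alpha

# Holm's step-down adjustment, never more conservative than Bonferroni
p_holm <- p.adjust(p_raw, method = "holm")

# Chance of at least one type I error across m uncorrected tests,
# under the simplifying assumption that the tests are independent
fwer_uncorrected <- 1 - (1 - alpha)^m

print(rbind(p_raw, p_bonf, p_holm))
print(keep_adjusted)
print(fwer_uncorrected)
```

With these made-up values only `x1` passes the corrected threshold, whereas three of the four would pass an uncorrected 0.05.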

Regarding (forward) stepwise selection, this is a very bad idea: it can lead to models that overfit the data, biased estimates, and unstable variable selection. It also doesn't account for the possibility that the best model might include variables that are not individually significant. See these threads on our site for further details:

Algorithms for automatic model selection

Understanding why stepwise selection based on p-values is bad

Main Drawbacks of stepwise regression

(Why) Are stepwise regression coefficients biased?

this from Andrew Gelman's site:

Why we hate stepwise regression

and this article by Peter Flom:

Stopping stepwise: Why stepwise selection is bad and what you should use instead
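To illustrate the point that variables can matter jointly without each being individually significant, here is a rough sketch that fits all four candidate predictors at once and tests them together against the intercept-only model, rather than adding them one at a time. Everything in it is an assumption made for illustration: the data are simulated, the names `x1` to `x4` and `y` are hypothetical, and a binomial family is used only because the question does not say which family the model has.

```r
set.seed(1)  # simulated data, purely illustrative
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
# Hypothetical binary response that depends on x1 and, weakly, on x2
d$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * d$x1 + 0.3 * d$x2))

# Intercept-only model and the model with all four prespecified predictors
fit_null <- glm(y ~ 1, family = binomial, data = d)
fit_full <- glm(y ~ x1 + x2 + x3 + x4, family = binomial, data = d)

# A single likelihood-ratio test of the four terms jointly (4 degrees of freedom)
lrt <- anova(fit_null, fit_full, test = "Chisq")
print(lrt)

# Individual Wald tests for comparison with the joint test
print(summary(fit_full)$coefficients)
```

Because the joint comparison is one test, it does not raise the multiple-testing question that the sequential procedure does.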
