When selecting variables for a multiple regression model using forward selection, should we add variables to the model according to the adjusted $R^2$ or according to the $t$ statistics / significance?
asked by whoisit
- Neither. You should add variables by model testing using likelihood ratio tests. You could also use information criteria such as AIC or BIC. – Greenparker May 03 '16 at 13:06
- What are you trying to achieve? Note that significance testing will be invalid after stepwise model selection, whether done through $R^2$, $t$ statistics, likelihood ratio tests or information criteria. – Stephan Kolassa May 03 '16 at 14:57
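For concreteness, here is a minimal sketch of the comparisons suggested in the comments, using the built-in mtcars data as an illustrative data set (not part of the question); `lrtest()` comes from the lmtest package.

```r
# Compare two nested linear models with a likelihood ratio test and with AIC/BIC.
library(lmtest)  # provides lrtest()

m1 <- lm(mpg ~ wt, data = mtcars)       # smaller model
m2 <- lm(mpg ~ wt + hp, data = mtcars)  # smaller model plus one extra predictor

lrtest(m1, m2)  # likelihood ratio test of m1 (null) against m2
AIC(m1, m2)     # lower AIC is preferred
BIC(m1, m2)     # BIC penalises additional parameters more heavily
```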
1 Answer
You can apply whatever rule you want. If you can validate that it works (e.g., with a holdout set, the bootstrap, or cross-validation), then this kind of stepwise regression can be competitive with other predictive modeling techniques. More typical approaches are based on the Akaike or Bayesian information criterion, however, such as what the stepAIC function in R's MASS package performs.
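As a rough sketch of that approach, assuming the MASS package, the built-in mtcars data, and an illustrative set of candidate predictors, AIC-driven forward selection might look like this:

```r
library(MASS)  # provides stepAIC()

# Start from the intercept-only model and let stepAIC() add predictors
# one at a time as long as AIC improves.
null_fit <- lm(mpg ~ 1, data = mtcars)

forward_fit <- stepAIC(null_fit,
                       scope = list(lower = mpg ~ 1,
                                    upper = mpg ~ wt + hp + disp + drat + qsec),
                       direction = "forward",
                       trace = FALSE)

summary(forward_fit)  # note: the reported p-values do not account for the selection step
```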
Keep in mind, though, that extreme care is required for the resulting inference to have the desired properties. Software might still report values like adjusted $R^2$ and coefficient confidence intervals, but after variable selection these are misleading. This is related to points 1, 2, 3, 4, and 7 here.
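One way to do the validation mentioned above is sketched here with a single holdout split on the same illustrative data; the split ratio and seed are arbitrary choices, not part of the answer.

```r
library(MASS)
set.seed(1)  # arbitrary seed for a reproducible split

idx   <- sample(nrow(mtcars), size = round(0.7 * nrow(mtcars)))
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

# Run the entire selection procedure on the training data only.
fit <- stepAIC(lm(mpg ~ 1, data = train),
               scope = list(lower = mpg ~ 1,
                            upper = mpg ~ wt + hp + disp + drat + qsec),
               direction = "forward",
               trace = FALSE)

# Judge the selected model on data it never saw.
pred <- predict(fit, newdata = test)
sqrt(mean((test$mpg - pred)^2))  # holdout root mean squared error
```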
answered by Dave