
I am working on a linear regression model.

  • The complete model, with 11 variables in total, has quite a low adjusted R-squared ($R^2_{adj}$) of 0.11.

  • 4 variables have a significant influence on the dependent variable (DV).

  • Alternative model: using stepwise model selection based on the AIC, a model with 7 variables turns out to have the best AIC. Furthermore, $R^2_{adj}$ improves considerably and equals 0.16 in the alternative model.

  • In the alternative model, the same 4 variables have a significant influence on the DV, and a further, previously non-significant variable becomes significant.

This finding is somewhat counterintuitive. What I know so far about $R^2_{adj}$ is that it indicates how much of the variance is explained by the model.
It usually increases when adding more variables, as more variables are likely to explain more variance.
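For reference, the standard definition with $n$ observations and $p$ predictors is

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n-1}{n-p-1},$$

so, as I understand it, removing a variable can raise $R^2_{adj}$ only if the loss in $R^2$ is outweighed by the smaller penalty on $p$.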

Conceptually, the complete model is deduced from theory.

Does anyone have an idea why $R^2_{adj}$ increases when removing variables?

After hours of research, I still have not found an appropriate explanation for this result.

urmelf

2 Answers


You are using $R^2_{adj}$ incorrectly. It is intended to debias $R^2$ for a single pre-specified model. When used in the context of multiple models, the number of degrees of freedom that needs to be inserted into the formula is the maximum number of parameters entertained across models. Then your problem also vanishes.
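To make this concrete, here is a minimal sketch (not from the answer itself; hypothetical synthetic data, statsmodels for the fit): the reduced model is charged for all 11 candidate parameters that were entertained during the search, not just the 7 it retains.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)   # hypothetical synthetic data
n, p_max = 100, 11               # 11 candidate predictors entertained
X = rng.normal(size=(n, p_max))
y = X[:, :4] @ np.array([0.5, 0.4, 0.3, 0.2]) + rng.normal(size=n)

def adj_r2(r2, n, p):
    """Adjusted R^2 charging p parameters as degrees of freedom."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Suppose stepwise selection kept 7 of the 11 candidates.
reduced = sm.OLS(y, sm.add_constant(X[:, :7])).fit()

# Naive: charge only the 7 retained predictors.
print(adj_r2(reduced.rsquared, n, 7))
# Honest: charge the maximum number entertained across models.
print(adj_r2(reduced.rsquared, n, p_max))
```

The honest version is necessarily lower, because the penalty term uses the larger parameter count.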

Frank Harrell

It is the plain $R^2$ that indicates how much of the variance is explained by the model. You are confusing $R^2$ and $R^2_{adj}$; to fully understand the difference between the two, read https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/adjusted-r2/

$R^2_{adj}$ can be used as a criterion for stepwise selection, like the AIC or BIC: it penalizes the addition of variables.
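A quick illustration of that penalty (a sketch with simulated data, not from the answer): adding a pure-noise predictor can only increase $R^2$, but will usually decrease $R^2_{adj}$.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)       # hypothetical simulated data
n = 50
x = rng.normal(size=(n, 1))          # one real predictor
noise = rng.normal(size=(n, 1))      # one irrelevant predictor
y = 0.8 * x[:, 0] + rng.normal(size=n)

small = sm.OLS(y, sm.add_constant(x)).fit()
big = sm.OLS(y, sm.add_constant(np.hstack([x, noise]))).fit()

print(small.rsquared, "->", big.rsquared)          # never decreases
print(small.rsquared_adj, "->", big.rsquared_adj)  # typically decreases
```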

Romain