
Consider the case of a multiple regression model with about 10 regressors and very few observations (about 15). I have to choose 10 of 20 available regressors to include in the model. In many cases I obtain a model in which all regressors are significant and the adjusted R-squared is close to 1. If I take out 1 or 2 regressors, the new model has almost no significant coefficients.

I'm aware that such a small sample is not suitable for building a regression model at all. I'm just curious to know what is driving the adjusted R-squared to be close to 1. Can you explain this effect?
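To illustrate, here is a quick simulation sketch (in Python with numpy and statsmodels, which is not what I actually used; greedy forward selection stands in for my "choose 10 of 20" step): the response is pure noise, yet the selected 10-regressor model ends up with a large adjusted R-squared and several "significant" coefficients.

```python
# Sketch: y is pure noise, yet greedily selecting 10 of 20 noise regressors
# by in-sample R^2 yields a model with a large adjusted R^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p, k = 15, 20, 10                 # observations, candidates, regressors kept
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)           # unrelated to every regressor

selected = []
for _ in range(k):
    remaining = [j for j in range(p) if j not in selected]
    # add whichever candidate most improves the in-sample fit
    best = max(remaining, key=lambda j: sm.OLS(
        y, sm.add_constant(X[:, selected + [j]])).fit().rsquared)
    selected.append(best)

fit = sm.OLS(y, sm.add_constant(X[:, selected])).fit()
print(fit.rsquared_adj)              # often large even though y is noise
print(fit.pvalues.round(3))          # several small p-values
```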

1 Answer


Adding a regressor will increase the adjusted R² if the absolute value of the t-statistic associated with that regressor is greater than one. You don't provide details about your data: is there collinearity among the predictors (which would explain a low R² until so many predictors are added that they use up all the degrees of freedom of the model)? Which adjusted R² formulation is used? How are the models chosen? Perhaps this discussion of over-fitting will be useful: Explanation of minimum observations for multiple regression
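To make that one-line rule concrete, here is a minimal numerical check (my own sketch with made-up data, using numpy/statsmodels): the adjusted R² rises when the added regressor's t-statistic in the larger model exceeds 1 in absolute value, and falls otherwise.

```python
# Check: adding a regressor raises adjusted R^2 exactly when its
# t-statistic in the larger model satisfies |t| > 1.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 30
X = rng.standard_normal((n, 3))
y = X[:, 0] + 0.1 * X[:, 1] + rng.standard_normal(n)

small = sm.OLS(y, sm.add_constant(X[:, :2])).fit()   # without the 3rd regressor
large = sm.OLS(y, sm.add_constant(X)).fit()          # with it

print(abs(large.tvalues[-1]) > 1)                    # is |t| of the new term > 1?
print(large.rsquared_adj > small.rsquared_adj)       # same truth value
```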

katya
  • Thanks katya. Yes, there is multicollinearity. The adjusted R² is the one calculated by Stata or SPSS. – Forinstance Oct 12 '14 at 21:09
  • Also, could you explain in more detail what you mean by "using up all the degrees of freedom of the model"? – Forinstance Oct 12 '14 at 21:18
  • As mentioned in the comment linked above, with p candidate variables (and an optional intercept) you have 2^(p+1)−1 possible models to fit. It is a multi-dimensional version of regression on 2 points: you can always get a perfect linear fit to just 2 points. So you need to reduce the number of regressors for this model a priori. – katya Oct 13 '14 at 16:47
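To make the "regression on 2 points" analogy concrete, a minimal sketch (mine, with simulated data): once the design matrix has as many columns, intercept included, as there are observations, the fit is exact for any response.

```python
# Saturated model: n observations, intercept + (n-1) regressors => exact fit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 15
X = rng.standard_normal((n, n - 1))    # n-1 noise regressors
y = rng.standard_normal(n)             # arbitrary response

fit = sm.OLS(y, sm.add_constant(X)).fit()
print(fit.rsquared)                    # 1.0 (up to floating point)
```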