
I stumbled on this while doing multiple linear regression (MLR), and was curious as to why it happens. The adjusted R-squared is (if I understand it correctly) supposed to be a way of comparing the predictive quality of models with different numbers of explanatory variables. In the second model, I've added a statistically insignificant variable (weight), which has apparently improved the model.

My only thought is that this is because the point estimate for this variable is not 0, so there may be a 'significant' effect at a less stringent significance level. Is that right?

Model 1:

    Model 1
                                         Sum of           Mean
     Source                   DF        Squares         Square    F Value    Pr > F

     Model                     2     6017.30007     3008.65004      12.36    <.0001
     Error                   424         103221      243.44647
     Corrected Total         426         109239


                  Root MSE             15.60277    R-Square     0.0551
                  Dependent Mean      120.03044    Adj R-Sq     0.0506
                  Coeff Var            12.99901


                                      Parameter Estimates

                  Parameter     Standard                      Variance
Variable    DF     Estimate        Error  t Value  Pr > |t|  Inflation    95% Confidence Limits

Intercept    1     75.85363      8.91793     8.51    <.0001          0     58.32478     93.38249
age          1      0.66112      0.15314     4.32    <.0001    1.00204      0.36011      0.96212
chol         1      1.86495      0.82213     2.27    0.0238    1.00204      0.24900       3.4809

Model 2 (with the statistically insignificant variable weight added):

                  Root MSE             15.58705    R-Square     0.0592
                  Dependent Mean      120.03044    Adj R-Sq     0.0525
                  Coeff Var            12.98591


                                      Parameter Estimates

                  Parameter     Standard                      Variance
Variable    DF     Estimate        Error  t Value  Pr > |t|  Inflation    95% Confidence Limits

Intercept    1     57.69446     16.03325     3.60    0.0004          0     26.17970     89.20922
age          1      0.66180      0.15299     4.33    <.0001    1.00205      0.36110      0.96251
chol         1      2.02756      0.82993     2.44    0.0150    1.02320      0.39626      3.65885
weight       1      0.09687      0.07111     1.36    0.1738    1.02122     -0.04290       0.2366
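
For reference, the adjusted $R^2$ values above follow from the standard formula, with $n = 427$ (the Corrected Total DF of 426 plus one) and $k$ the number of estimated parameters including the intercept:

$$\bar{R}^2 = 1 - \frac{n-1}{n-k}\,\bigl(1 - R^2\bigr)$$

Model 1: $1 - \frac{426}{424}(1 - 0.0551) \approx 0.0506$; Model 2: $1 - \frac{426}{423}(1 - 0.0592) \approx 0.0525$, matching the output.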

2 Answers


Citing Greene, Econometric Analysis, Theorem 3.7 (theorem numbering from the 5th edition):

In a multiple regression, the adjusted $R^2$ will fall (rise) when the variable $x$ is deleted from the regression if the $t$-ratio associated with this variable is greater (less) than 1.

Since the $t$-ratio of weight is $1.36>1$, this had to happen.
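
The theorem can be checked against the output in the question: the partial $F$ for adding weight equals $t^2$, and

$$t^2 = \frac{(R_2^2 - R_1^2)/1}{(1 - R_2^2)/(n - k_2)} = \frac{0.0592 - 0.0551}{0.9408/423} \approx 1.84 \approx 1.36^2.$$

Here is a minimal NumPy sketch of the theorem at work (the data are simulated, loosely mimicking the question's setup; the `fit` helper is illustrative, not from any particular library): the adjusted $R^2$ rises when the added variable's $|t|$-ratio exceeds 1 and falls otherwise.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 427                                  # same n as in the question's output
    age = rng.normal(50, 10, n)
    chol = rng.normal(5, 1, n)
    weight = rng.normal(80, 12, n)
    # the true weight effect is weak, so its t-ratio lands near 1
    y = 58 + 0.66*age + 2.0*chol + 0.08*weight + rng.normal(0, 15, n)

    def fit(X, y):
        """OLS via least squares; returns R^2, adjusted R^2, and t-ratios."""
        n, k = X.shape
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sse = resid @ resid
        sst = ((y - y.mean()) ** 2).sum()
        r2 = 1 - sse / sst
        adj = 1 - (n - 1) / (n - k) * (1 - r2)
        se = np.sqrt(sse / (n - k) * np.diag(np.linalg.inv(X.T @ X)))
        return r2, adj, beta / se

    X1 = np.column_stack([np.ones(n), age, chol])
    X2 = np.column_stack([X1, weight])
    _, adj1, _ = fit(X1, y)
    _, adj2, t = fit(X2, y)
    print(f"adj R^2: {adj1:.4f} -> {adj2:.4f}   t(weight) = {t[-1]:.2f}")
    # per the theorem, these two booleans always agree
    print("adj R^2 rose:", adj2 > adj1, "  |t| > 1:", abs(t[-1]) > 1)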


I suggest you check the correlations among the predictors and compute semi-partial correlations. My guess is that weight is only weakly correlated with age and chol, so it picks up part of the variance left unaccounted for by those two, hence the increase in the (adjusted) $R^2$. Yet the variance that weight accounts for above and beyond age and chol ($r^2_{y(weight.age,chol)}$) is not large enough to be significant, hence the p-value > 0.05 for weight.
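
A rough sketch of these checks in NumPy (the data are simulated stand-ins for the question's variables; the `semipartial` helper is illustrative, not a library function):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 427
    age = rng.normal(50, 10, n)
    chol = rng.normal(5, 1, n)
    weight = rng.normal(80, 12, n)        # stand-ins for the question's data
    y = 58 + 0.66*age + 2.0*chol + 0.08*weight + rng.normal(0, 15, n)

    def semipartial(y, x, controls):
        """corr(y, the part of x orthogonal to an intercept and the controls)."""
        Z = np.column_stack([np.ones(len(x))] + list(controls))
        beta, *_ = np.linalg.lstsq(Z, x, rcond=None)
        return np.corrcoef(y, x - Z @ beta)[0, 1]

    # correlations among the predictors (rows/cols: age, chol, weight)
    print(np.corrcoef([age, chol, weight]).round(3))
    r = semipartial(y, weight, [age, chol])
    print(f"semi-partial r = {r:.3f}, unique variance r^2 = {r**2:.4f}")

Note that the squared semi-partial correlation of weight is exactly the increment in $R^2$ from adding it to the model, which in the question's output is $0.0592 - 0.0551 = 0.0041$.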