
So I know that if the new predictor lies in the subspace spanned by the existing predictors, then $R^2$ will not increase. However, I recall reading that this is a sufficient condition, but not a necessary one, for $R^2$ to remain the same when adding a new predictor, yet I can't think of a situation in which $R^2$ remains the same with a new predictor that lies outside that subspace.

What are the other cases in which $R^2$ will remain the same?

24n8
    What happens when the new predictor is orthogonal to the response? – whuber Nov 24 '23 at 22:16
  • @whuber hmm, in simple linear regression this would give an $R^2$ of 0, but in multiple linear regression I'm not sure; I don't think it necessarily keeps $R^2$ constant? Even if the new predictor is orthogonal to the response, the subspace spanned by the new predictor together with the existing predictors will change. – 24n8 Nov 24 '23 at 23:02
  • Your logic is strange! Don't think in terms of orthogonality of predictors: think in terms of the relationship between the subspace spanned by the predictors and the space generated by the response. – whuber Nov 25 '23 at 00:39
  • @whuber I'm not thinking about orthogonality of predictors? My comment is referring to the orthogonality between the predictors and the response. Suppose the existing predictors are $x_1, x_2$ and, when regressing $y$ on $x_1, x_2,$ we get $R^2=a$ and $\hat{\beta}_1 = c_1, \hat{\beta}_2 = c_2.$ Now suppose we want to add $x_3$ and that $x_3$ is orthogonal to $y.$ Are you suggesting that when $y$ is regressed on $x_1, x_2, x_3,$ we would still get $R^2=a,$ that $\hat{\beta}_1 = c_1, \hat{\beta}_2 = c_2$ remain the same, and that $\hat{\beta}_3 = 0$? – 24n8 Nov 25 '23 at 01:32
  • Your question is about $R^2.$ It therefore is not only germane but essential to consider the relationships between the predictors and the response. – whuber Nov 25 '23 at 15:46

1 Answer


This diagram from the answer at https://stats.stackexchange.com/a/113207/919 abstractly (but accurately) depicts the space $X_1$ spanned by all the (current) explanatory variables, the response variable $y$ (as a vector), and its residual $y_{\cdot 1}.$

[Diagram: the response vector $y,$ its projection into the subspace $X_1,$ and the residual $y_{\cdot 1}$]

In particular (provided the residual is nonzero), within the vector space spanned by $X_1$ and the response $y,$ a basis for the orthogonal complement of $X_1$ is $y_{\cdot 1}.$ (Equivalently, the space spanned by $X_1$ and $y$ is the same as the space spanned by $X_1$ and $y_{\cdot 1}.$)

The space $X_1$ and the vector $y$ are subsets of the space of all possible vectors (of length $n$ for $n$ data points). Because $1-R^2$ is defined as the squared length of $y_{\cdot 1}$ compared to the squared length of $y$ ($1-R^2 = ||y_{\cdot 1}||^2\,/\,||y||^2$),

$R^2$ increases if and only if introducing a new regressor $z$ shrinks the length of the residual.
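
As a quick numerical check of that definition, here is a minimal sketch (assuming numpy, with a made-up design `X1` and response `y`; everything is centered, which plays the role of including an intercept):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Hypothetical data: two centered predictors and a centered response.
X1 = rng.normal(size=(n, 2))
X1 -= X1.mean(axis=0)
y = X1 @ np.array([1.5, -0.5]) + rng.normal(size=n)
y -= y.mean()

# Regress y on X1; the residual vector is y_{.1}.
beta1, *_ = np.linalg.lstsq(X1, y, rcond=None)
resid = y - X1 @ beta1

# 1 - R^2 = ||y_{.1}||^2 / ||y||^2 for centered data.
print(1 - resid @ resid / (y @ y))
```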

Equivalently, upon projecting $z$ into $X_1$ (which means regressing $z$ against $X_1$), there must be a nonzero residual $z_{\cdot 1}$ which also has a nonzero component in the $y_{\cdot 1}$ direction. Another way to put this is

Introducing another explanatory variable $z$ will increase $R^2$ if and only if $z$ can be expressed as a linear combination $z = x + \beta y_{\cdot 1}$ where $x \in X_1$ and $\beta y_{\cdot 1} \ne 0.$

If we let $y_{\cdot 1}^\perp$ denote the set of vectors orthogonal to $y_{\cdot 1}$ (which includes $X_1$ as a subspace), then (provided $y_{\cdot 1}$ is nonzero) the set of regressors that increase $R^2$ is $\{x + \beta y_{\cdot 1} : x \in y_{\cdot 1}^\perp,\ \beta \ne 0\}.$ Accordingly, to answer the question directly,

The set of regressors that do not increase $R^2$ is $y_{\cdot 1}^\perp,$ the vectors orthogonal to the residual vector.
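
To see both directions of this numerically, we can continue the hypothetical sketch above: `z1` is constructed to lie in $y_{\cdot 1}^\perp$ (a column of `X1` plus noise with its residual component projected out), while `z2` is given a deliberate component along $y_{\cdot 1}$:

```python
def r_squared(X, y):
    """R^2 of the least-squares fit of (centered) y on the columns of X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return 1 - r @ r / (y @ y)

# w: centered noise with its y_{.1} component projected out, so that
# w (and hence z1 below) is orthogonal to the residual.
w = rng.normal(size=n)
w -= w.mean()
w -= (w @ resid) / (resid @ resid) * resid

z1 = X1[:, 0] + w            # lies in y_{.1}^perp
z2 = X1[:, 0] + 0.3 * resid  # nonzero component along y_{.1}

print(r_squared(X1, y))                         # baseline R^2
print(r_squared(np.column_stack([X1, z1]), y))  # matches the baseline (up to rounding)
print(r_squared(np.column_stack([X1, z2]), y))  # strictly larger
```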

whuber
  • "Because 2 is defined as the squared length of ⋅1 compared to the squared length of ". isn't this $1 - R^2$? if the residual and $y$ are the same length, then that implies $y$ is orthogonal to $X_1$, which implies $R^2 = 0$? – 24n8 Nov 28 '23 at 03:14
  • "The set of regressors that do not increase 2 is ⊥⋅1, the vectors orthogonal to the residual vector." Are you saying this is a necessary condition? If so, this doesn't seem to preclude the case in your original comment where the new predictor is orthogonal to the response? From what I can see $y_{\cdot 1}^\perp$ is defined to be orthogonal to the residual vector, but it isn't necessarily orthogonal to the response, unless the response is orthogonal to the residual vector – 24n8 Nov 28 '23 at 03:19
  • It's a necessary and sufficient condition. Thank you for correcting the confusion of $R^2$ and $1-R^2.$ – whuber Nov 28 '23 at 03:49
  • Got it, but I wanted to go back to the first comment you made on November 24th: "What happens when the new predictor is orthogonal to the response?" If the new predictor ($z$ in your notation) is orthogonal to $y,$ then it is not necessarily true that adding it leaves $R^2$ unchanged? – 24n8 Nov 28 '23 at 04:06
  • Another follow-up: suppose $z$ lies in $y_{\cdot 1}^{\perp}.$ What is the regression coefficient $\hat{\beta}_z$ for $z$? Is it necessarily 0? – 24n8 Nov 28 '23 at 04:07
  • Actually not, because $z$ might be a linear combination of other regressors. Take for $z,$ for instance, any of the original regressors, say $x,$ and suppose its original coefficient is $b.$ Now you can set the coefficient of $z$ to any number you desire -- call it $a$ -- and change $b$ to $b-a,$ because $(b-a)x + az = bx,$ to obtain a solution. What one can say is that there exists a solution for which the coefficient is $0.$ When $z$ is not in the space generated by the existing regressors but is orthogonal to $y_{\cdot 1},$ then its coefficient will be zero. – whuber Nov 28 '23 at 04:18
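
A quick check of that last claim, continuing the hypothetical sketch from the answer (`w` there lies outside the span of `X1` but is orthogonal to the residual, so it adds nothing to the fit):

```python
# The fitted coefficient on w in the enlarged regression is zero (up to
# rounding), because w is orthogonal to y_{.1} and not in span(X1).
bz, *_ = np.linalg.lstsq(np.column_stack([X1, w]), y, rcond=None)
print(bz[-1])  # ~0
```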