Somewhere I saw that the coefficient of determination for multiple linear regression is given by the following quadratic form:

$$R^2 = \boldsymbol{r}_{\mathbf{y},\mathbf{x}}^\text{T} \boldsymbol{r}_{\mathbf{x},\mathbf{x}}^{-1} \boldsymbol{r}_{\mathbf{y},\mathbf{x}}$$

where $\boldsymbol{r}_{\mathbf{y},\mathbf{x}}$ is the correlation matrix of x with y and $\boldsymbol{r}_{\mathbf{x},\mathbf{x}}$ is the correlation matrix of x.
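
A quick numerical check suggests the identity does hold. This is only a sketch, assuming NumPy, synthetic data, and an OLS fit that includes an intercept; the variable names, seed, and coefficients are made up for illustration. It compares the usual sums-of-squares $R^2$ with the quadratic form, reading $\boldsymbol{r}_{\mathbf{y},\mathbf{x}}$ as the vector of correlations between y and each column of x.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n, p = 200, 3

# Synthetic predictors, made correlated on purpose so r_xx is not the identity
X = rng.normal(size=(n, p))
X[:, 1] += 0.5 * X[:, 0]
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)

# R^2 from the ratio of sums of squares (OLS with an intercept column)
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta
r2_ss = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# R^2 from the quadratic form r_yx^T r_xx^{-1} r_yx
C = np.corrcoef(np.column_stack([X, y]), rowvar=False)  # (p+1) x (p+1)
r_xx = C[:p, :p]  # correlations among the predictors
r_yx = C[:p, p]   # correlation of each predictor with y
r2_quad = r_yx @ np.linalg.solve(r_xx, r_yx)

print(r2_ss, r2_quad)  # the two values should agree up to rounding error
```

Using `np.linalg.solve` instead of forming the inverse explicitly is just a numerical-stability habit; it computes the same quantity.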

I don't see why this is the case and I haven't found a proof anywhere. Can someone show me why this is? It doesn't seem immediately related to the ratio of sums of squares.

Kashif
  • What does the correlation matrix of x with y mean when there are multiple features, as [tag:multiple-regression] implies? // If the situation is just a simple linear regression with only one feature, then $r_{x, x}^{-1} = 1$, and there is no need to include that middle term. – Dave Jul 13 '22 at 18:35
  • There's a detailed answer in another thread: [https://stats.stackexchange.com/questions/314926/can-you-calculate-r2-from-correlation-coefficents-in-multiple-linear-regressi/364630#364630] – Eli Jul 13 '22 at 18:56
  • @Eli Yes, but anent this point all that answer states is "it can be shown." The demonstration would follow the one I gave for the covariance-based solution at https://stats.stackexchange.com/questions/107597: after standardizing the variables, all covariances become correlations and you're most of the way there. – whuber Jul 13 '22 at 18:58
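
The standardization argument mentioned in the last comment can be sketched as follows. This is a reconstruction, not taken from the linked answers. It assumes the model includes an intercept and that every variable has been centered and scaled to unit sample variance, which leaves $R^2$ unchanged. The symbols $Z$ (the $n \times p$ matrix of standardized predictors), $\mathbf{z}_y$ (the standardized response), and $\hat{\boldsymbol{\beta}}$ are introduced here for illustration.

$$\frac{Z^\text{T} Z}{n-1} = \boldsymbol{r}_{\mathbf{x},\mathbf{x}}, \qquad \frac{Z^\text{T} \mathbf{z}_y}{n-1} = \boldsymbol{r}_{\mathbf{y},\mathbf{x}}, \qquad \frac{\mathbf{z}_y^\text{T} \mathbf{z}_y}{n-1} = 1$$

$$\hat{\boldsymbol{\beta}} = (Z^\text{T} Z)^{-1} Z^\text{T} \mathbf{z}_y = \boldsymbol{r}_{\mathbf{x},\mathbf{x}}^{-1} \boldsymbol{r}_{\mathbf{y},\mathbf{x}}$$

$$R^2 = \frac{\lVert Z \hat{\boldsymbol{\beta}} \rVert^2}{\lVert \mathbf{z}_y \rVert^2} = \hat{\boldsymbol{\beta}}^\text{T} \boldsymbol{r}_{\mathbf{x},\mathbf{x}} \hat{\boldsymbol{\beta}} = \boldsymbol{r}_{\mathbf{y},\mathbf{x}}^\text{T} \boldsymbol{r}_{\mathbf{x},\mathbf{x}}^{-1} \boldsymbol{r}_{\mathbf{y},\mathbf{x}}$$

The first equality in the last line is the ratio of the regression sum of squares to the total sum of squares. Because the variables are centered, the fitted intercept is zero and both sums are plain squared norms.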

0 Answers