Given:

$$A\sim \mathrm N(\mu_1, \sigma_1)$$

$$B \sim \mathrm N(\mu_2, \sigma_2)$$

$$C \sim \mathrm N(\mu_3, \sigma_3)$$

$$X_1 = \alpha_1 \cdot A + \beta_1 \cdot B + \gamma_1 \cdot C$$

$$X_2 = \alpha_2 \cdot A + \beta_2 \cdot B + \gamma_2 \cdot C$$

$$Y = A$$

$A$, $B$, and $C$ have known pairwise covariances / correlations.

How can I calculate the percent of the variance of $Y$ that is explained by $X_1$ and $X_2$? I can currently generate random samples for A, B, and C and then run a regression to find the r-squared, but I was hoping to find a closed-form solution.

Or alternatively, what amount of variance of $Y$ is unique, that is, what amount of variance is not explained by $X_1$ and $X_2$?

Edit: Simplified my problem too much, and jbowman correctly pointed out all the variance is explained if X1 and X2 are linear combinations of just two random variables, so added a third random variable.

Albeit
  • This general question is usually answered by regressing $Y$ against $(X_1,X_2).$ You have supplied all the information needed to do that. BTW, the distributional shapes are irrelevant: all that matters is the covariance matrix of $(X_1,X_2,Y).$ See https://stats.stackexchange.com/questions/107597 for details. – whuber Mar 04 '23 at 13:30
  • As whuber said, regressing seems to be the way to do it. Given the covariance or correlation matrix, the regression coefficients and the coefficient of determination (r-squared) can be calculated directly. In the case of the two explanatory variables, there are relatively simple equations to calculate the desired values as described here: https://psychometroscar.com/r-squared-in-terms-of-basic-correlations/ – Albeit Mar 05 '23 at 19:26
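The closed form that the comments point to can be sketched in a few lines. Write $Z = (A, B, C)^\top$ with covariance matrix $\Sigma$, $X = WZ$ with $W$ holding the coefficients of $X_1$ and $X_2$ as rows, and $Y = e^\top Z$ with $e = (1, 0, 0)^\top$. The population r-squared is then $R^2 = \dfrac{e^\top \Sigma W^\top (W \Sigma W^\top)^{-1} W \Sigma e}{e^\top \Sigma e}$, and the unexplained share is $1 - R^2$. The function name `r_squared_from_cov` and all numbers below are made up for illustration; only the formula follows from the setup in the question.

```python
import numpy as np

def r_squared_from_cov(sigma, weights, target):
    """Population R^2 of Y = target . Z regressed on X = weights @ Z, with Cov(Z) = sigma."""
    sigma = np.asarray(sigma, dtype=float)
    weights = np.asarray(weights, dtype=float)
    target = np.asarray(target, dtype=float)
    cov_xx = weights @ sigma @ weights.T   # Cov(X), shape (2, 2)
    cov_xy = weights @ sigma @ target      # Cov(X, Y), shape (2,)
    var_y = target @ sigma @ target        # Var(Y), scalar
    # Explained variance: Cov(Y, X) Cov(X)^{-1} Cov(X, Y)
    explained = cov_xy @ np.linalg.solve(cov_xx, cov_xy)
    return explained / var_y

# Hypothetical covariance matrix of (A, B, C); not taken from the question.
sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 2.0, 0.5],
                  [0.2, 0.5, 1.5]])
# Hypothetical coefficients: rows are (alpha_i, beta_i, gamma_i).
weights = np.array([[1.0, 0.5, 0.2],
                    [0.3, 1.0, 0.7]])
target = np.array([1.0, 0.0, 0.0])         # Y = A

r2 = r_squared_from_cov(sigma, weights, target)
print("explained share:", r2)
print("unique (unexplained) share:", 1.0 - r2)
```

Only the covariance matrix enters the calculation, so the means and the normality assumption play no role, which matches whuber's remark.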

1 Answer

All of it is.

  1. Multiply $X_1$ by $c = \beta_2 / \beta_1$ to get:

$$cX_1 = c\alpha_1A + \beta_2B$$

  2. Now subtract $X_2$ from this:

$$cX_1 - X_2 = (c\alpha_1-\alpha_2)A$$

  3. Now divide both sides by $c\alpha_1-\alpha_2$ to get:

$${cX_1-X_2 \over c\alpha_1-\alpha_2} = A$$

So...

$$Y = {c\over c\alpha_1-\alpha_2}X_1 -{1\over c\alpha_1 - \alpha_2}X_2$$

with no error left over.
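A quick numerical check of this identity, for the two-variable version of the question ($X_i = \alpha_i A + \beta_i B$, no $C$ term). The coefficients, distribution parameters, and sample size are arbitrary illustrative choices, picked so that $\beta_1 \neq 0$ and $c\alpha_1 - \alpha_2 \neq 0$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative coefficients (assumed, not from the question).
alpha1, beta1 = 2.0, 3.0
alpha2, beta2 = 1.0, 4.0

# Any samples work here; the identity is purely algebraic.
A = rng.normal(loc=1.0, scale=2.0, size=1000)
B = rng.normal(loc=-1.0, scale=0.5, size=1000)

X1 = alpha1 * A + beta1 * B
X2 = alpha2 * A + beta2 * B

c = beta2 / beta1
denom = c * alpha1 - alpha2            # must be nonzero for the recovery to work

# Y = A rebuilt from X1 and X2 alone, following the steps above.
A_recovered = (c / denom) * X1 - (1.0 / denom) * X2

max_err = np.max(np.abs(A_recovered - A))
print("largest reconstruction error:", max_err)
```

The error should sit at floating-point rounding level, which is what "no error left over" means in practice.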

jbowman
  • Of course. I simplified my actual problem too much, unfortunately. I have edited the question to add in a third random variable. Sorry! – Albeit Mar 04 '23 at 06:02
  • +1. This answer assumes that $\beta_1\neq 0$ and $c\alpha_1-\alpha_2\neq 0$. If $\beta_1=0$, then the problem is trivial. $c\alpha_1-\alpha_2=0$ means that $X_1$ and $X_2$ are proportional, and $A$ and $B$ cannot be recovered from $X_1$ and $X_2$. – Taladris Mar 04 '23 at 07:03