
I know that the variance of a linear combination of correlated random variables can be generalized (as in Variance of linear combinations of correlated random variables). My question has to do with the variance of an additional linear combination of two or more such linear combinations. This is a different question from the one linked above because applying the variance-covariance matrix for a single linear combination is trivial, but in this extended case, you would need to apply the same variance-covariance matrix twice to different strings of constants, so as a practical matter, being able to apply the terms in the correct order is essential.

Say you have a regression model with an output variance-covariance matrix. The variance for a given prediction would be $$ \operatorname{var}(Y_1) = \sum_{i=1}^n a_i^2\operatorname{var}(X_i) + 2\sum_{i=1}^n \sum_{j\colon j > i}^n a_ia_j\operatorname{cov}(X_i,X_j) $$ where $a_i$ is the constant assigned to the random variable $X_i$. So for the regression, $a$ would be a vector of specific covariate values that multiply the vector of model coefficients $X$ (the $\beta$s), and summing the products gives your prediction $Y_1 = a_1X_1 + \cdots + a_nX_n$.
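As a minimal sketch of the formula above, the double-sum form can be checked against the equivalent quadratic form $a^\top \Sigma a$. The matrix `sigma` and the profile `a` below are hypothetical values chosen only for illustration; they do not come from any fitted model.

```python
# Hypothetical 3x3 variance-covariance matrix of the model coefficients
# (illustrative values only, not taken from a real regression).
sigma = [
    [4.0, 1.0, 0.5],
    [1.0, 3.0, 0.2],
    [0.5, 0.2, 2.0],
]

# Hypothetical covariate profile (the constants a_i).
a = [1.0, 2.0, -1.0]


def var_double_sum(w, cov):
    """Variance of sum_i w_i X_i using the variance terms plus twice the upper-triangle covariances."""
    n = len(w)
    variance_part = sum(w[i] ** 2 * cov[i][i] for i in range(n))
    covariance_part = sum(
        w[i] * w[j] * cov[i][j] for i in range(n) for j in range(i + 1, n)
    )
    return variance_part + 2 * covariance_part


def var_quadratic_form(w, cov):
    """Same variance written as the quadratic form w' * cov * w."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


var_y1 = var_double_sum(a, sigma)
print(var_y1, var_quadratic_form(a, sigma))
```

Both functions should agree for any symmetric `sigma`, which is why the variance-covariance matrix can be applied directly to a vector of constants.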

Now suppose you want to compare this prediction to a second prediction, $Y_2$, that comes from the same regression model but uses a different collection of values, $b_i$. For example, you might want to compare individuals with a certain profile of covariate attributes against individuals with a different profile of covariate attributes. The variance for this new prediction would be $$ \operatorname{var}(Y_2) = \sum_{i=1}^n b_i^2\operatorname{var}(X_i) + 2\sum_{i=1}^n \sum_{j\colon j > i}^n b_ib_j\operatorname{cov}(X_i,X_j). $$

It is clear that the prediction of the linear combination $Y_1 - Y_2$ is simply an extended linear combination of the values for the first profile, the negative values for the second profile, and the model coefficients (repeated for the second profile). But as I'm doing the math, it appears that the variance of $Y_1 - Y_2$ is also simply a linear combination of $\operatorname{var}(Y_1)$ and the negative of $\operatorname{var}(Y_2)$ $$ \operatorname{var}(Y_1 - Y_2) = \sum_{i=1}^n a_i^2\operatorname{var}(X_i) + 2\sum_{i=1}^n \sum_{j\colon j > i}^n a_ia_j\operatorname{cov}(X_i,X_j) + \sum_{i=1}^n (-b_i)^2\operatorname{var}(X_i) + 2\sum_{i=1}^n \sum_{j\colon j > i}^n (-b_i)(-b_j)\operatorname{cov}(X_i,X_j) $$ which would be weird, because the predictions are clearly correlated, but I don't see it accounting for any additional covariance. Perhaps it is already accounted for in the individual profile covariances?

Is this accurate? Am I missing something? A proof of this (or disproof) would be welcome.

cgrafe
  • It seems that your usage of the term "estimate" is not very accurate. The equation "$Y_1 = a_1X_1 + \cdots + a_nX_n$" in your post is not an "estimate" of $Y_1$ but just a specification or definition of the random variable $Y_1$. – Zhanxiong Feb 16 '23 at 20:13
  • You can read this similar question to get some idea on how to reformulate your question. – Zhanxiong Feb 16 '23 at 20:14
  • Your algebra is incorrect. You have written the equivalent of "$\operatorname{var}(Y_1-Y_2)=\operatorname{var}(Y_1)-\operatorname{var}(Y_2),$" which is obviously wrong (consider the case where the latter variance exceeds the former: you will obtain a negative value). The variance of $Y_1-Y_2$ is obtained by applying the original formula you quote to the linear combination with coefficients $a_i-b_i.$ – whuber Feb 16 '23 at 21:01
  • @whuber Can you elaborate on your comment? I'm not seeing the step from $Y_1 - Y_2$ to $a_i - b_i$. – cgrafe Feb 16 '23 at 21:11
  • @whuber Your simplification to $\operatorname{var}(Y_1-Y_2)=\operatorname{var}(Y_1)-\operatorname{var}(Y_2)$ is not correct. The negatives cancel and you end up with $\operatorname{var}(Y_1-Y_2)=\operatorname{var}(Y_1)+\operatorname{var}(Y_2)$. – cgrafe Feb 16 '23 at 21:15
  • Whatever -- the algebra is still incorrect. Because $Y_1 = \sum a_i X_i$ and $Y_2= \sum b_i X_i,$ $Y_1 - Y_2 = \sum (a_i-b_i)X_i.$ – whuber Feb 16 '23 at 21:21
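
The point made in the comments can be illustrated numerically. This is only a sketch under stated assumptions: the matrix `sigma` and the profiles `a` and `b` are hypothetical values, and `bilinear` is a helper name introduced here. It compares the variance obtained from the coefficients $a_i - b_i$ with the expansion $\operatorname{var}(Y_1) + \operatorname{var}(Y_2) - 2\operatorname{cov}(Y_1, Y_2)$ and with the sum $\operatorname{var}(Y_1) + \operatorname{var}(Y_2)$ that omits the covariance term.

```python
# Hypothetical variance-covariance matrix of the model coefficients
# (illustrative values only).
sigma = [
    [4.0, 1.0, 0.5],
    [1.0, 3.0, 0.2],
    [0.5, 0.2, 2.0],
]

# Two hypothetical covariate profiles.
a = [1.0, 2.0, -1.0]
b = [1.0, 0.0, 1.0]


def bilinear(u, cov, v):
    """Return u' * cov * v, i.e. cov(sum_i u_i X_i, sum_j v_j X_j)."""
    n = len(u)
    return sum(u[i] * cov[i][j] * v[j] for i in range(n) for j in range(n))


var_y1 = bilinear(a, sigma, a)
var_y2 = bilinear(b, sigma, b)
cov_y1_y2 = bilinear(a, sigma, b)

# Y1 - Y2 is the single linear combination with coefficients a_i - b_i.
d = [a_i - b_i for a_i, b_i in zip(a, b)]
var_diff = bilinear(d, sigma, d)

# The same quantity, expanded into the two variances and their covariance.
var_diff_expanded = var_y1 + var_y2 - 2 * cov_y1_y2

# The sum of the two variances alone, which leaves out the covariance term.
naive_sum = var_y1 + var_y2

print(var_diff, var_diff_expanded, naive_sum)
```

With these illustrative numbers the first two quantities agree and the third differs, because the two predictions share the same coefficients and are therefore correlated.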
