19

I understand the proof that $$Var(aX+bY) = a^2Var(X) +b^2Var(Y) + 2abCov(X,Y), $$ but I don't understand how to prove the generalization to arbitrary linear combinations.

Let $a_i$ be scalars for $i\in \{1,\dots ,n\}$, so we have a vector $\underline a$, and let $\underline X = (X_1,\dots ,X_n)$ be a vector of correlated random variables. Then $$ Var(a_1X_1 + \dots +a_nX_n) = \sum_{i=1}^n a_i^2 \sigma_i^2 + 2 \sum_{i=1}^n \sum_{j>i}^n a_i a_j \text{ Cov}(X_i,X_j)$$ How do we prove this? I imagine there are proofs in summation notation and in vector notation.

utobi
  • 11,726
Hatshepsut
  • 1,699

5 Answers

28

This is just an exercise in applying basic properties of sums, the linearity of expectation, and the definitions of variance and covariance:

\begin{align} \operatorname{var}\left(\sum_{i=1}^n a_i X_i\right) &= E\left[\left(\sum_{i=1}^n a_i X_i\right)^2\right] - \left(E\left[\sum_{i=1}^n a_i X_i\right]\right)^2 &\scriptstyle{\text{one definition of variance}}\\ &= E\left[\sum_{i=1}^n\sum_{j=1}^n a_i a_j X_iX_j\right] - \left(E\left[\sum_{i=1}^n a_i X_i\right]\right)^2 &\scriptstyle{\text{basic properties of sums}}\\ &= \sum_{i=1}^n\sum_{j=1}^n a_i a_j E[X_iX_j] - \left(\sum_{i=1}^n a_i E[X_i]\right)^2 &\scriptstyle{\text{linearity of expectation}}\\ &= \sum_{i=1}^n\sum_{j=1}^n a_i a_j E[X_iX_j] - \sum_{i=1}^n \sum_{j=1}^n a_ia_j E[X_i]E[X_j] &\scriptstyle{\text{basic properties of sums}}\\ &= \sum_{i=1}^n\sum_{j=1}^n a_i a_j \left(E[X_iX_j] - E[X_i]E[X_j]\right)&\scriptstyle{\text{combine the sums}}\\ &= \sum_{i=1}^n\sum_{j=1}^n a_i a_j\operatorname{cov}(X_i,X_j) &\scriptstyle{\text{apply a definition of covariance}}\\ &= \sum_{i=1}^n a_i^2\operatorname{var}(X_i) + 2\sum_{i=1}^n \sum_{j\colon j > i}^n a_ia_j\operatorname{cov}(X_i,X_j) &\scriptstyle{\text{re-arrange sum}}\\ \end{align} Note that in that last step, we have also identified $\operatorname{cov}(X_i,X_i)$ as the variance $\operatorname{var}(X_i)$.
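
For a quick numerical sanity check of that identity, here is a minimal sketch (the coefficients, sample size, and covariance matrix below are arbitrary choices for illustration, not anything from the derivation). Because the sample covariance is bilinear in exactly the same way, the sample variance of $\sum_i a_i X_i$ matches the double sum built from sample covariances up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(42)
n, m = 4, 10_000                          # 4 correlated variables, 10k observations
a = rng.normal(size=n)                    # arbitrary coefficients a_1, ..., a_n
L = rng.normal(size=(n, n))
X = rng.multivariate_normal(np.zeros(n), L @ L.T, size=m)   # rows are observations

S = np.cov(X, rowvar=False)               # sample covariance matrix [cov(X_i, X_j)]
lhs = np.var(X @ a, ddof=1)               # sample variance of a_1 X_1 + ... + a_n X_n

# sum_i a_i^2 var(X_i) + 2 sum_{i<j} a_i a_j cov(X_i, X_j)
rhs = sum(a[i]**2 * S[i, i] for i in range(n)) \
      + 2 * sum(a[i] * a[j] * S[i, j] for i in range(n) for j in range(i + 1, n))

print(np.isclose(lhs, rhs))               # True
```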

Dilip Sarwate
  • 46,658
9

You can actually do it by recursion without using matrices:

Take the result for $\text{Var}(a_1X_1+Y_1)$ and let $Y_1=a_2X_2+Y_2$.

$\text{Var}(a_1X_1+Y_1)$

$\qquad=a_1^2\text{Var}(X_1)+2a_1\text{Cov}(X_1,Y_1)+\text{Var}(Y_1)$

$\qquad=a_1^2\text{Var}(X_1)+2a_1\text{Cov}(X_1,a_2X_2+Y_2)+\text{Var}(a_2X_2+Y_2)$

$\qquad=a_1^2\text{Var}(X_1)+2a_1a_2\text{Cov}(X_1,X_2)+2a_1\text{Cov}(X_1,Y_2)+\text{Var}(a_2X_2+Y_2)$

Then keep substituting $Y_{i-1}=a_iX_i+Y_i$ and applying the same basic results; at the last step use $Y_{n-1}=a_nX_n$.

With vectors (so the result must be scalar):

$\text{Var}(a'\,X)=a'\,\text{Var}(X)\,a$

Or with a matrix (the result will be a variance-covariance matrix):

$\text{Var}(A\,X)=A\,\text{Var}(X)\,A'$

This has the advantage that the off-diagonal elements of the result give the covariances between the various linear combinations whose coefficients are the rows of $A$.

Even if you only know the univariate results, you can confirm these by checking element-by-element.
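
In the same spirit, here is a small numerical sketch of that element-by-element check (the matrices and seed below are made up for illustration): each diagonal entry of $A\,\text{Var}(X)\,A'$ reproduces the univariate formula for one linear combination, and each off-diagonal entry is a covariance between two of them.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.normal(size=(2, n))          # two linear combinations: the rows of A
L = rng.normal(size=(n, n))
Sigma = L @ L.T                      # Var(X): any symmetric positive-definite matrix

result = A @ Sigma @ A.T             # Var(AX), a 2x2 variance-covariance matrix

# entry (0, 0): the univariate formula for Var(a'X) with a = first row of A
row0 = A[0]
var_manual = sum(row0[i]**2 * Sigma[i, i] for i in range(n)) \
             + 2 * sum(row0[i] * row0[j] * Sigma[i, j]
                       for i in range(n) for j in range(i + 1, n))

# entry (0, 1): Cov(row0 . X, row1 . X) = sum_{i,j} A[0,i] A[1,j] Sigma[i,j]
cov_manual = sum(A[0, i] * A[1, j] * Sigma[i, j]
                 for i in range(n) for j in range(n))

print(np.isclose(result[0, 0], var_manual), np.isclose(result[0, 1], cov_manual))
```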

Glen_b
  • 282,281
9

Here is a slightly different proof based on matrix algebra.

Convention: a vector of the kind $(m,y,v,e,c,t,o,r)$ is a column vector unless otherwise stated.

Let $a = (a_1,\ldots,a_n)$, $\mu = (\mu_1,\ldots,\mu_n) = E(X)$ and set $Y = a_1X_1+\ldots+a_nX_n = a^\top X$. Note first that, by the linearity of the integral (or sum) $$E(Y) = E(a_1X_1+\ldots+a_nX_n) = a_1\mu_1+\cdots +a_n\mu_n = a^\top \mu.$$

Then \begin{align} \text{var}(Y) &= E(Y-E(Y))^2 = E\left(a_1X_1+\ldots+a_nX_n-E(a_1X_1+\ldots+a_nX_n)\right)^2\\ & = E\left[(a^\top X - a^\top\mu)(a^\top X - a^\top\mu)\right]\\ & = E\left[(a^\top X - a^\top\mu)(a^\top X - a^\top\mu)^\top\right]\\ & = E\left[a^\top(X - \mu)(a^\top(X - \mu))^\top\right]\\ & = a^\top E\left[(X - \mu)(X - \mu)^\top a\right] \\ & = a^\top E\left[(X - \mu)(X - \mu)^\top\right]a \\\tag{*} & = a^\top \operatorname{cov}(X)a. \end{align}

Here $\operatorname{cov}(X) = [\operatorname{cov}(X_i,X_j)]$ is the covariance matrix of $X$, with entries $\operatorname{cov}(X_i,X_j)$ and diagonal entries $\operatorname{cov}(X_i,X_i) = \operatorname{var}(X_i)$. Note the trick of inserting a $^\top$ symbol in the third line of the last display, which is valid since $r^\top = r$ for any real $r$. In passing from the 4th equality to the 5th, and from the 5th to the 6th, I have again used the linearity of expectation.

Straightforward matrix multiplication will reveal that the desired result is nothing but the expanded version of the quadratic form (*).
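
As a check on that last remark, here is a small symbolic sketch (using sympy, with $n=3$ and made-up symbol names): expanding the quadratic form $a^\top \operatorname{cov}(X)a$ recovers exactly the sum formula from the question.

```python
import sympy as sp

a1, a2, a3 = sp.symbols("a1 a2 a3")
s11, s22, s33, s12, s13, s23 = sp.symbols("s11 s22 s33 s12 s13 s23")

a = sp.Matrix([a1, a2, a3])
Sigma = sp.Matrix([[s11, s12, s13],
                   [s12, s22, s23],
                   [s13, s23, s33]])     # symmetric covariance matrix, s_ij = cov(X_i, X_j)

quadratic_form = sp.expand((a.T * Sigma * a)[0, 0])
sum_formula = sp.expand(a1**2*s11 + a2**2*s22 + a3**2*s33
                        + 2*(a1*a2*s12 + a1*a3*s13 + a2*a3*s23))

print(sp.simplify(quadratic_form - sum_formula))   # 0
```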

User1865345
  • 8,202
utobi
  • 11,726
  • 2
    Utobi, hope you don't mind the minor edits. Of late, I am reading these simple yet intuitive posts and enjoying. +1. – User1865345 Feb 28 '23 at 21:41
4

Just for fun, proof by induction!

Let $P(k)$ be the statement that $Var[\sum_{i=1}^k a_iX_i] = \sum_{i=1}^k a_i^2\sigma_i^2 + 2\sum_{i=1}^k \sum _{j>i}^k a_ia_jCov[X_i, X_j]$

Then $P(2)$ is (trivially) true (you said you're happy with that in the question).

Let's assume $P(k)$ is true. Thus,

$Var[\sum_{i=1}^{k+1} a_iX_i] = Var[\sum_{i=1}^{k} a_iX_i + a_{k+1}X_{k+1}]$

$=Var[\sum_{i=1}^{k} a_iX_i] + Var[a_{k+1}X_{k+1}] + 2 Cov[\sum_{i=1}^{k} a_iX_i,a_{k+1}X_{k+1}]$

$=\sum_{i=1}^k a_i^2\sigma_i^2 + 2\sum_{i=1}^k \sum _{j>i}^k a_ia_jCov[X_i, X_j]+ a_{k+1}^2\sigma_{k+1}^2 + 2Cov[\sum_{i=1}^{k} a_iX_i, a_{k+1}X_{k+1}]$

$=\sum_{i=1}^{k+1} a_i^2\sigma_i^2 + 2\sum_{i=1}^k \sum _{j>i}^k a_ia_jCov[X_i, X_j] + 2\sum_{i=1}^ka_ia_{k+1}Cov[X_i, X_{k+1}]$

$=\sum_{i=1}^{k+1} a_i^2\sigma_i^2 + 2\sum_{i=1}^{k+1} \sum _{j>i}^{k+1} a_ia_jCov[X_i, X_j]$

Thus $P(k+1)$ is true. (In the second-to-last step, the covariance term was expanded using the bilinearity of covariance: $Cov[\sum_{i=1}^{k} a_iX_i, a_{k+1}X_{k+1}] = \sum_{i=1}^k a_ia_{k+1}Cov[X_i, X_{k+1}]$.)

So, by induction,

$Var[\sum_{i=1}^n a_iX_i] = \sum_{i=1}^n a_i^2\sigma_i^2 + 2\sum_{i=1}^n \sum _{j>i}^n a_ia_jCov[X_i, X_j]$ for all integer $n \geq 2$.
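
For what it's worth, the induction step translates directly into a loop. Below is a minimal numerical sketch (the coefficients and covariance matrix are arbitrary, made up for illustration) that builds up the variance one variable at a time and compares the result with the closed form $a^\top\Sigma a$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
a = rng.normal(size=n)
L = rng.normal(size=(n, n))
Sigma = L @ L.T                      # an arbitrary valid covariance matrix Cov[X_i, X_j]

# base case: Var[a_1 X_1]
var = a[0]**2 * Sigma[0, 0]

# inductive step: Var[S_k + a_{k+1} X_{k+1}]
#   = Var[S_k] + a_{k+1}^2 Var[X_{k+1}] + 2 a_{k+1} sum_{i<=k} a_i Cov[X_i, X_{k+1}]
for k in range(1, n):
    cov_Sk_Xk = a[:k] @ Sigma[:k, k]
    var += a[k]**2 * Sigma[k, k] + 2 * a[k] * cov_Sk_Xk

print(np.isclose(var, a @ Sigma @ a))   # True
```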

3

Basically, the proof is the same as for the first formula. I will prove it using a very brute-force method.

$Var(a_1X_1+...+a_nX_n)=E[(a_1X_1+...+a_nX_n)^2]-[E(a_1X_1+...+a_nX_n)]^2 =E[(a_1X_1)^2+...+(a_nX_n)^2+2a_1a_2X_1X_2+2a_1a_3X_1X_3+...+2a_1a_nX_1X_n+...+2a_{n-1}a_nX_{n-1}X_n]-[a_1E(X_1)+...+a_nE(X_n)]^2 $

$=a_1^2E(X_1^2)+...+a_n^2E(X_n^2)+2a_1a_2E(X_1X_2)+...+2a_{n-1}a_nE(X_{n-1}X_n)-a_1^2[E(X_1)]^2-...-a_n^2[E(X_n)]^2-2a_1a_2E(X_1)E(X_2)-...-2a_{n-1}a_nE(X_{n-1})E(X_n) $

$=a_1^2E(X_1^2)-a_1^2[E(X_1)]^2+...+a_n^2E(X_n^2)-a_n^2[E(X_n)]^2+2a_1a_2E(X_1X_2)-2a_1a_2E(X_1)E(X_2)+...+2a_{n-1}a_nE(X_{n-1}X_n)-2a_{n-1}a_nE(X_{n-1})E(X_n)$

Next just note:

$a_n^2E(X_n^2)-a_n^2[E(X_n)]^2=a_n^2\sigma_n^2$

and

$2a_{n-1}a_nE(X_{n-1}X_n)-2a_{n-1}a_nE(X_{n-1})E(X_n)=2a_{n-1}a_nCov(X_{n-1},X_n)$

Deep North
  • 4,746