9

I have what I'm afraid is a simple stats problem that is stumping me. I have two random variables, X and Y, independently normally distributed:

 X ~ N(0, sigmaX)
 Y ~ N(0, sigmaY).

I observe the sum of these two variables, Z = X+Y, and want to develop a conditional expectation on X given the sum. A colleague said, "ah, yes, classic signal-extraction problem. Solution is:"

 E[X|X+Y] = (X + Y) * sigmaX / (sigmaX + sigmaY)

This looked about right, so I thanked him and figured I'd work it out at home. It appears I'm a little rusty here, though: I can give a verbal argument for why this should be true, but I can't write down the math. What is the mathematical reason this is true?

Thanks all!

CompEcon
  • 305
  • related: https://stats.stackexchange.com/questions/9071/intuitive-explanation-of-contribution-to-sum-of-two-normally-distributed-random – Henry Apr 24 '17 at 06:51

2 Answers

8

If $X$ and $Y$ are zero-mean independent normal random variables with variances $\sigma_X^2$ and $\sigma_Y^2$ respectively, then $X$ and $Z = X+Y$ are zero-mean jointly normal random variables with $\sigma_Z^2 = \sigma_X^2 + \sigma_Y^2$ and $\text{cov}(X, Z) = \text{cov}(X,X+Y) = E[X^2] + E[XY] = \sigma_X^2$ (the cross term vanishes because $X$ and $Y$ are independent with zero means), so the correlation coefficient is $$\rho_{X,Z} = \frac{\sigma_X^2}{\sigma_X\sqrt{\sigma_X^2 + \sigma_Y^2}} = \frac{\sigma_X}{\sqrt{\sigma_X^2 + \sigma_Y^2}}.$$ The conditional distribution of $X$ given $Z = z$ is normal with mean $\mu_X + \rho_{X,Z}\frac{\sigma_X}{\sigma_Z}(z - \mu_Z)$, and since all the means here are zero, the conditional mean simplifies to
$$E[X \mid Z = z] = z\cdot \rho_{X,Z}\frac{\sigma_X}{\sigma_Z} = z\cdot \frac{\sigma_X^2}{\sigma_X^2 + \sigma_Y^2}.$$
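
As a quick numerical sanity check, here is a minimal NumPy simulation sketch (the parameter values are purely illustrative) comparing the empirical conditional mean of $X$ near a chosen $z$ with $z\cdot\sigma_X^2/(\sigma_X^2+\sigma_Y^2)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters: sigma_x and sigma_y are standard deviations here.
sigma_x, sigma_y = 2.0, 1.0
n = 1_000_000

x = rng.normal(0.0, sigma_x, n)
y = rng.normal(0.0, sigma_y, n)
z = x + y

# Empirical E[X | Z = z0], approximated by averaging X over a narrow bin around z0.
z0, half_width = 1.5, 0.05
in_bin = np.abs(z - z0) < half_width
empirical = x[in_bin].mean()

# Theoretical conditional mean: z0 * sigma_x^2 / (sigma_x^2 + sigma_y^2)
theoretical = z0 * sigma_x**2 / (sigma_x**2 + sigma_y**2)

print(empirical, theoretical)  # the two values should be close
```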

Dilip Sarwate
  • 46,658
  • 1
    I really like this answer! Could you please explain why X and Z are jointly normal, or direct me to a source/answer that explains it? I can't seem to find any, unfortunately. Thanks in advance! @Dilip Sarwate – Bob Mar 09 '16 at 23:58
  • 1
    It is a standard result that if $X$ and $Y$ are jointly normal random variables, then so are $aX+bY$ and $cX+dY$ jointly normal random variables. You certainly haven't looked very hard for this result: it is there even in the top hit on Google if you search for "jointly normal random variables." – Dilip Sarwate Mar 10 '16 at 02:54
3

It is just like the linear model $Z = X + \epsilon$ (take $\epsilon \equiv Y$), but where you regress $X$ on $Z$ rather than $Z$ on $X$.

The slope of the regression of $X$ on $Z$ is $\text{cor}(X,Z) \sigma_X / \sigma_Z$, and $\sigma^2_Z = \sigma^2_X + \sigma^2_Y$, while $\text{cor}(X,Z) = \text{cov}(X,Z)/[\sigma_X \sigma_Z] = \sigma_X^2/[\sigma_X \sigma_Z] = \sigma_X / \sqrt{\sigma_X^2+\sigma_Y^2}$.

Thus I get the slope to be $\sigma_X^2/(\sigma_X^2+\sigma_Y^2)$.

So it is the same as your formula, provided that sigmaX and sigmaY are interpreted as variances.
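
The regression view can also be checked by simulation; here is a minimal NumPy sketch (with made-up variances) that regresses simulated $X$ on $Z$ without an intercept and compares the fitted slope to $\sigma_X^2/(\sigma_X^2+\sigma_Y^2)$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative variances (sigma2_x, sigma2_y are variances, as in the answer).
sigma2_x, sigma2_y = 4.0, 1.0
n = 500_000

x = rng.normal(0.0, np.sqrt(sigma2_x), n)
z = x + rng.normal(0.0, np.sqrt(sigma2_y), n)

# Least-squares slope of X on Z, no intercept needed since both means are zero:
# slope = sum(x*z) / sum(z*z), which estimates cov(X, Z) / var(Z).
slope_hat = np.dot(x, z) / np.dot(z, z)
slope_theory = sigma2_x / (sigma2_x + sigma2_y)

print(slope_hat, slope_theory)  # the two should agree to a few decimals
```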

Karl
  • 6,197