
I know that if $X$ and $Y$ are jointly Gaussian 1-dimensional r.v.s, we can find coefficients $a,b$ so that $$X=aY+bZ+E[X-aY],$$ where $Z$ is a standard Gaussian independent of $Y$.
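A quick numerical check of this 1-D identity (a sketch in NumPy; the specific moment values below are my own illustration): with $a=\operatorname{Cov}(X,Y)/\operatorname{Var}(Y)$ and $b=\sqrt{\operatorname{Var}(X)-a^2\operatorname{Var}(Y)}$, the right-hand side $aY+bZ+(E[X]-aE[Y])$ matches the first two moments of $X$.

```python
import numpy as np

# 1-D case: pick moments for X and Y and a cross-covariance (my own toy values).
var_X, var_Y = 4.0, 9.0
cov_XY = 3.0  # must satisfy |cov_XY| <= sqrt(var_X * var_Y)

# Coefficients of the decomposition X = a*Y + b*Z + (E[X] - a*E[Y]).
a = cov_XY / var_Y
b = np.sqrt(var_X - a**2 * var_Y)

# Second-moment checks: Var(a*Y + b*Z) = a^2 Var(Y) + b^2 must equal Var(X),
# and Cov(a*Y + b*Z, Y) = a*Var(Y) must equal Cov(X, Y), because Z is
# independent of Y and has unit variance.
assert np.isclose(a**2 * var_Y + b**2, var_X)
assert np.isclose(a * var_Y, cov_XY)
```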

Can we do something similar if $X$ and $Y$ are Gaussian vectors in $\mathbb{R}^n$ (for simplicity, assume their covariance matrices are non-singular)?

Assuming for simplicity that $X$ and $Y$ are centered, I tried finding a matrix $A$ so that $$E[(X-AY)Y^{T}]=0,$$ but I am not sure this guarantees the independence of $X-AY$ and $Y$.

Ilya
    You can do more than that: $X$ and $Y$ do not even have to have the same dimensions. Use the technique I describe at the end of the post at https://stats.stackexchange.com/a/313138/919. In fact, a simple rephrasing of your question exposes the idea: given any $n\times m$ matrix $A$ and a zero-mean multivariate Gaussian $(X,Y)$ ($X$ of length $n,$ $Y$ of length $m$), find an $n\times n$ matrix $B$ for which $X-AY$ and $BZ$ have the same covariance matrices where $Z$ is $n$-variate standard Gaussian. (Dealing with nonzero means is trivial.) It takes one line if you use the right definitions! – whuber Jun 29 '23 at 16:45
  • @whuber Thank you for your answer! I am still not quite sure how to guarantee the independence of $Z$ and $Y$. Since any multivariate Gaussian is of the form $BZ$ and $X-AY$ is Gaussian, this part is clear. But how do you find A so that the independence condition holds? – Ilya Jun 29 '23 at 17:22
  • Sorry -- I overlooked the requirement of independence between $Y$ and $Z.$ That will determine $A$ and also imposes some constraints on $Y.$ – whuber Jun 29 '23 at 17:56

1 Answer


Let $X$ be $n$-dimensional, $Y$ be $m$-dimensional, and both with zero means. (It's simple to deal with nonzero means later because they just get added in.) We seek an $n$-dimensional zero-mean random vector $Z$ with unit covariance matrix and uncorrelated with $Y$ (that is, $E[ZY^\prime]=0$), an $n\times m$ matrix $A,$ and an $n\times n$ matrix $B$ for which

$$X = AY + BZ.$$

Rewriting this as

$$X - AY = BZ,$$

right-multiplying by the row vector $Y^\prime$ and taking expectations gives

$$\operatorname{Cov}(X,Y) - A\operatorname{Cov}(Y) = E[XY^\prime] - AE[YY^\prime] = BE[ZY^\prime] = 0.\tag{*}$$

When $\operatorname{Cov}(Y)$ is nonsingular this has a unique solution

$$A = \operatorname{Cov}(X,Y)\operatorname{Cov}(Y)^{-1}.$$

Take covariances of the $n$-vectors in the rewritten equation to find

$$\operatorname{Cov}(X) - A\operatorname{Cov}(Y,X) - \operatorname{Cov}(X,Y)A^\prime + A\operatorname{Cov}(Y)A^\prime = BB^\prime.$$

This exhibits $B$ as a(ny) matrix square root of the (symmetric, positive semidefinite) matrix on the left hand side.

This solution assumed nothing about the actual distribution of $(X,Y,Z).$ But when it is multivariate Normal, the vanishing covariance of $Z$ with $Y$ assures that $Z$ and $Y$ are also independent; and because $AY+BZ$ is $n$-variate Normal with the same second moments as $X,$ it has the intended distribution.
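The whole construction can be sketched in NumPy (the joint covariance below is randomly generated for illustration; I take the symmetric square root via an eigendecomposition, though any square root of $BB^\prime$ works):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

# Build a random positive-definite covariance for the joint vector (X, Y).
M = rng.standard_normal((n + m, n + m))
S = M @ M.T + (n + m) * np.eye(n + m)
Sxx, Sxy = S[:n, :n], S[:n, n:]   # Cov(X), Cov(X,Y)
Syx, Syy = S[n:, :n], S[n:, n:]   # Cov(Y,X), Cov(Y)

# A = Cov(X,Y) Cov(Y)^{-1}, solved without forming the inverse explicitly.
A = np.linalg.solve(Syy.T, Sxy.T).T

# BB' = Cov(X - AY); take the symmetric square root via an eigendecomposition
# (clipping tiny negative eigenvalues caused by rounding).
R = Sxx - A @ Syx - Sxy @ A.T + A @ Syy @ A.T
w, V = np.linalg.eigh(R)
B = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T

# Verify the second moments: since Z has unit covariance and is uncorrelated
# with Y, Cov(AY + BZ) = A Cov(Y) A' + BB' should equal Cov(X), and
# Cov(AY + BZ, Y) = A Cov(Y) should equal Cov(X, Y).
assert np.allclose(A @ Syy @ A.T + B @ B.T, Sxx)
assert np.allclose(A @ Syy, Sxy)
```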

When the covariance of $Y$ is singular, there might still be a solution. In such circumstances you will have to examine $(*)$ more closely, for instance by row-reducing the $m\times(n+m)$ augmented matrix $[\operatorname{Cov}(Y) \mid \operatorname{Cov}(Y,X)].$
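A minimal NumPy sketch of that consistency check (the covariances here are my own toy example): transpose $(*)$ to $\operatorname{Cov}(Y)A^\prime = \operatorname{Cov}(Y,X),$ solve it by least squares, and test whether the residual vanishes.

```python
import numpy as np

# A singular Cov(Y): here Y2 = Y1 a.s., so rank(Cov(Y)) = 1.
Syy = np.array([[1.0, 1.0],
                [1.0, 1.0]])

# A consistent cross-covariance for a 1-dimensional X: each column of
# Cov(Y,X) lies in the column space of Cov(Y), so (*) has a (non-unique)
# solution A.
Syx = np.array([[0.5],
                [0.5]])

# Solve Cov(Y) A' = Cov(Y,X) in the least-squares sense; a zero residual
# means the system is solvable and a valid A exists.
At, *_ = np.linalg.lstsq(Syy, Syx, rcond=None)
consistent = np.allclose(Syy @ At, Syx)
print(consistent)  # → True
```

Replacing `Syx` with a vector outside the column space of `Syy` (say, `[[0.5], [-0.5]]`) makes the residual nonzero, in which case no decomposition $X = AY + BZ$ with $Z$ uncorrelated with $Y$ exists.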

whuber