
In a paper I've written I model the random variables $X+Y$ and $X-Y$ rather than $X$ and $Y$ to effectively remove the problems that arise when $X$ and $Y$ are highly correlated and have equal variance (as they are in my application). The referees want me to give a reference. I could easily prove it, but being an application journal they prefer a reference to a simple mathematical derivation.
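(For concreteness, the fact I need a citation for is the one-line bilinearity computation
$$\operatorname{cov}(X+Y,\,X-Y)=\operatorname{var}(X)-\operatorname{cov}(X,Y)+\operatorname{cov}(Y,X)-\operatorname{var}(Y)=\operatorname{var}(X)-\operatorname{var}(Y),$$
which vanishes when the two variances are equal, whatever the correlation between $X$ and $Y$.)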

Does anyone have any suggestions for a suitable reference? I thought there was something in Tukey's EDA book (1977) on sums and differences, but I can't find it.

Rob Hyndman

1 Answer


I would refer to Seber, G. A. F. (1977). Linear Regression Analysis. Wiley, New York, Theorem 1.4.

This says $\text{cov}(AX, BY) = A \text{cov}(X,Y) B'$.

Take $A = (1 \;\; 1)$ and $B = (1 \;\; {-1})$, and let both the $X$ and $Y$ of the theorem be the vector $(X, Y)'$ containing your two variables.
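Spelled out, with $Z = (X, Y)'$ and $\Sigma = \operatorname{cov}(Z)$, the theorem gives
$$\operatorname{cov}(X+Y,\,X-Y) = (1 \;\; 1)\,\Sigma \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \operatorname{var}(X) - \operatorname{var}(Y),$$
since the two cross-covariance terms cancel.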

Note that, to have $\text{cov}(X+Y, X-Y) \approx 0$, it's critical that $X$ and $Y$ have similar variances: since $\text{cov}(X+Y, X-Y) = \text{var}(X) - \text{var}(Y)$, if $\text{var}(X) \gg \text{var}(Y)$ the covariance will be far from zero.
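As a quick numerical illustration (a minimal sketch; the correlation of 0.95, the sample size, and the seed below are arbitrary choices, not anything from the paper in question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (arbitrary choices): highly correlated X, Y
# with equal unit variances.
n, rho = 100_000, 0.95
sigma = np.array([[1.0, rho],
                  [rho, 1.0]])
X, Y = rng.multivariate_normal([0.0, 0.0], sigma, size=n).T

# Equal variances: the sum and difference are (nearly) uncorrelated.
print(np.corrcoef(X + Y, X - Y)[0, 1])     # ~ 0

# Unequal variances: the decorrelation breaks down.
Y2 = 2.0 * Y                               # var(Y2) = 4, not 1
print(np.corrcoef(X + Y2, X - Y2)[0, 1])   # far from 0
```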

Karl
    For $W$ and $Z$ to be uncorrelated (or nearly uncorrelated), we don't need $\operatorname{cov}(W,Z)$ to be $0$ or nearly $0$: we need the Pearson correlation coefficient $\rho_{W,Z}$ to be $0$ or nearly $0$. – Dilip Sarwate Aug 22 '14 at 15:52