3

Let's say I have three vectors (or random variables) $A, B,$ and $C.$ I can of course calculate the correlation between any two of them (and have these numbers). However, what I'm interested in is whether there is a bound on the product of the correlations, $r_{A,C} \cdot r_{B,C}$, given $r_{A,B}$.

For example, let's say the correlation between $A$ and $B$ is 0.5. It's obvious that one couldn't get $r_{A,C}\,r_{B,C}=1$, since $C$ cannot be perfectly correlated with both $A$ and $B$. Does anyone know how to calculate this? I have found similar questions on StackExchange but nothing exactly like this.

whuber
user2551700
  • Hi: go to dsp.stackexchange and do a search for Sylvester's criterion. – mlofton Feb 17 '22 at 00:21
  • For more about this subject, search our site. – whuber Feb 17 '22 at 00:24
  • Hello, I do not see any questions that match this one exactly. I see one that's different but related, pertaining to the case when all correlations are equal. That is not my question; my question is about the bounds on the product of the correlations. – user2551700 Feb 17 '22 at 17:20
  • Hi: This is the dsp.stackexchange link that I was trying to direct you to: https://dsp.stackexchange.com/questions/81057/is-there-relationship-between-autocorrelations-rx2-and-rx1/81064#81064 – mlofton Feb 17 '22 at 20:08
  • I see you are asking about the product of two of the correlations, which indeed is a slightly different question, so although the answer is readily obtained from the related threads, I have reopened this one. Could you explain, though, what the meaning of this product would be? The related threads are https://stats.stackexchange.com/questions/5747, https://stats.stackexchange.com/questions/72790, https://stats.stackexchange.com/questions/254282, and https://stats.stackexchange.com/questions/305441. – whuber Feb 17 '22 at 20:38
  • 2
    BTW, your title asks a different question than the text. It would be good to fix up that discrepancy. – whuber Feb 17 '22 at 21:43

3 Answers

2

Because the correlation matrix must be positive semidefinite, the three correlations $\tau=\rho_{AB},$ $\sigma=\rho_{AC},$ and $\rho=\rho_{BC}$ must, as shown at https://stats.stackexchange.com/a/5753/919, satisfy the inequality

$$1 + 2(\rho\sigma)\tau - (\rho^2 + \sigma^2 + \tau^2) \ge 0\tag{*}$$

and, of course, all three values must be in the range $[-1,1].$
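As a quick check (my own sketch, not part of the original answer), a candidate triple $(\tau, \sigma, \rho)$ can be tested by verifying that the corresponding $3\times 3$ correlation matrix is positive semidefinite; for a matrix with unit diagonal this amounts to $(*)$ together with the range condition.

```r
# Test whether tau = rho_AB, sigma = rho_AC, rho = rho_BC can arise as the
# correlations of three random variables, i.e. whether the 3x3 correlation
# matrix is positive semidefinite.
is.valid.correlations <- function(tau, sigma, rho) {
  R <- matrix(c(1,     tau,   sigma,
                tau,   1,     rho,
                sigma, rho,   1), nrow = 3)
  all(eigen(R, symmetric = TRUE, only.values = TRUE)$values >= -1e-12)
}

# Equivalent determinant condition (*):
satisfies.star <- function(tau, sigma, rho) {
  1 + 2 * rho * sigma * tau - (rho^2 + sigma^2 + tau^2) >= 0
}

is.valid.correlations(0.5, 0.8, 0.8)   # TRUE
is.valid.correlations(0.5, 0.9, -0.9)  # FALSE: (*) fails
```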

The question asks how to maximize and minimize $\rho\sigma,$ given a value of $\tau$ and the constraint $(*).$ The following solution is elementary; alternatively, it can be derived using Lagrange multipliers.

Setting $\phi = (\rho+\sigma)/2$ and $\delta = \rho-\phi$ (whence $\sigma-\phi = -\delta$) we find, with easy algebra, that

$$\rho\sigma=\phi^2 - \delta^2 \le \frac{1 - \tau^2 - 4\delta^2}{2(1-\tau)}$$

(provided only that $\tau \ne \pm 1$). The right hand side is largest when $\delta=0,$ so $\rho\sigma$ is maximized when $\delta=0;$ that is, when $\rho=\sigma.$ In this case the problem of maximizing $\rho\sigma$ is very simple (because $(*)$ becomes linear in $\phi^2$), with solution

$$\rho\sigma \le \frac{1+\tau}{2}.$$

To minimize $\rho\sigma$ we deduce, in the same manner, that we wish to make $\delta$ as large as possible. This leads to $\phi=0,$ whence $\rho\sigma=-\delta^2.$ Again the problem becomes simple, with solution

$$\rho\sigma \ge \frac{\tau-1}{2}.$$
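These two bounds can be checked by brute force (a sketch of my own, not from the answer): fix $\tau,$ scan a fine grid of $(\rho,\sigma)$ pairs satisfying $(*),$ and record the extreme products.

```r
# Given tau = rho_AB, the product rho * sigma over all valid correlation
# matrices should range over [(tau - 1)/2, (tau + 1)/2].
check.bounds <- function(tau, n = 801) {
  g <- seq(-1, 1, length.out = n)
  grid <- expand.grid(rho = g, sigma = g)
  ok <- with(grid, 1 + 2 * rho * sigma * tau - (rho^2 + sigma^2 + tau^2) >= 0)
  p <- with(grid[ok, ], rho * sigma)
  c(min = min(p), lower.bound = (tau - 1) / 2,
    max = max(p), upper.bound = (tau + 1) / 2)
}

check.bounds(0.5)  # min/max close to -0.25 and 0.75, matching the bounds
```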


One way to visualize (and check via brute force) these results is to pick $\tau=\rho_{AB},$ plot the region in the $(\rho,\sigma) = (\rho_{AC}, \rho_{BC})$ plane where $(*)$ holds, and overplot contour lines of the function $f(\rho,\sigma)=\rho\sigma.$ The best bounds for $\rho\sigma$ are the lowest and highest contour levels that intersect the region plot.

Here are three examples for typical values of $\tau=\rho_{AB},$ including the case posited by the question, $\rho_{AB}=0.5,$ at the right. The solid regions denote the loci of solutions of $(*):$ that is, the mathematically possible values of the two other correlation coefficients.

Figure showing the allowed regions of $(\rho_{AC}, \rho_{BC})$ with contours of their product, for $\rho_{AB} = -0.9, 0, 0.5$

The contour lines at levels $(\rho_{AB}-1)/2$ and $(\rho_{AB}+1)/2$ are highlighted: you can see how they just barely skim the vertices of the ellipses determined by $(*),$ osculating precisely at the points $\rho_{AC}=\pm\rho_{BC}$ as deduced in this solution.

It should now be clear that although the cases $\tau=\pm 1$ might have caused algebraic difficulties in the derivation, they aren't really special: the same formulas for the bounds still apply.

whuber
  • Insightful as always. I'm curious how you drew the transparent ellipses in the plots? Did you use polygon? – COOLSerdash Feb 19 '22 at 08:36
  • 1
    @COOLSerdash I was in a hurry, so I used a quick but inefficient method: these are made with image and contour. Here's the entire code: f <- function(x, y, z) 1 + 2*x*y*z - (x^2 + y^2 + z^2); z <- c(-0.9, 0, 0.5); x <- y <- seq(-1, 1, length.out=501); for (z in z) { image(x, y, outer(x, y, f, z=z), breaks=c(-10,0,10), bty="n", col=hsv(c(0,(z+1)/2), c(0, 0.35), c(0.98,.9)), main=bquote(rho[AB]==.(z))); w <- outer(x,y); contour(x, y, w, add=TRUE, col=gray(.5)); contour(x, y, w, add=TRUE, levels=c((1+z)/2, (z-1)/2), lwd=2) } – whuber Feb 19 '22 at 18:57
  • Ah yes, I forgot about image. Thanks for that. I tried to recreate the plots by constructing the ellipses by calculating the eigenvalues of the corresponding matrix of the quadratic form. Your solution is much simpler, however. – COOLSerdash Feb 19 '22 at 21:27
  • 1
    @COOLSerdash My calculation was also brute force in the sense that it evaluated criterion $(*)$ on a fine grid. Thus, to prove it correct requires only testing the function f that implements that criterion but otherwise uses no analysis. That makes it an effective check of the analytical derivations. – whuber Feb 19 '22 at 21:45
1

A geometric perspective makes this surprisingly easy and provides detailed information about how the extreme products are reached.

Abstractly, $A, B,$ and $C$ are Euclidean vectors and their correlations are the cosines of the angles between them or, equivalently, their inner products. We may simplify the question with a few observations:

  1. $C$ can be expressed as the sum of its component in the $AB$ plane and the orthogonal component (this is just regression of $C$ on $A$ and $B$). Because the orthogonal component contributes nothing to the inner products, its presence can only decrease the magnitudes of those products. Thus, we may with no loss of generality assume $C$ lies in the $AB$ plane.

  2. Because only angles are under consideration, normalize all vectors to have unit length. (The possibility that any vector can be zero is excluded from the question because the correlation with it is undefined.)

  3. We may therefore adopt a polar coordinate system in which $A=(1,0)$ and $B$ lies in the upper half plane at, say, angle $\alpha$ with $A.$ Thus, $\rho_{AB} = \cos(\alpha)$ is the correlation between $A$ and $B.$
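As a side check (my own sketch, not part of the answer), the opening claim that correlations are cosines can be verified numerically: after centering, the Pearson correlation of two vectors equals the cosine of the angle between them.

```r
# Compare the Pearson correlation with the cosine of the angle between
# the centered vectors.
set.seed(42)
x <- rnorm(50)
y <- 0.6 * x + rnorm(50)

xc <- x - mean(x)
yc <- y - mean(y)
cosine <- sum(xc * yc) / sqrt(sum(xc^2) * sum(yc^2))

c(correlation = cor(x, y), cosine = cosine)  # agree up to rounding error
```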

Figure showing A, B, and C in the plane

This figure depicts unit vectors with arrows. Dotted lines show their projections onto each other. The projections themselves are highlighted in black. The correlations $\rho_{**}$ are signed: in this image, $\rho_{AB}$ is negative (approximately $-0.5$).

As a result, $C$ is a unit vector in the plane having some polar coordinate $\theta$ and its correlations with $A$ and $B$ are the cosines of the angles it makes with them: that is,

$$\rho_{AC}\rho_{BC} = \cos(\theta)\cos(\theta-\alpha) = \frac{1}{2}\left(\cos(2\theta-\alpha) + \cos(\alpha)\right).$$

The part of the right hand side that varies with $\theta$ is a multiple of a cosine wave, which has maxima when its argument is an even multiple of $\pi,$ that is, $2n\pi=2\theta-\alpha$ for integral $n,$ and minima at odd multiples of $\pi,$ that is, $(2n+1)\pi=2\theta-\alpha.$ Solving for $\theta$ gives the answer:

$\rho_{AC}\rho_{BC}$ is maximized when $2\theta=\alpha+2n\pi;$ that is, when $C$ bisects the angle between $A$ and $B,$ and therefore the maximum is $$\cos(\alpha/2)\cos(-\alpha/2)=\cos^2(\alpha/2) = \frac{1+\cos(\alpha)}{2} = \frac{1 + \rho_{AB}}{2}.$$ $\rho_{AC}\rho_{BC}$ is minimized when $2\theta=\alpha+(2n+1)\pi,$ with minimum $$\cos(\pi/2+\alpha/2)\cos(\pi/2-\alpha/2) = -\sin^2(\alpha/2) = \cos^2(\alpha/2)-1 = \frac{-1 + \rho_{AB}}{2}.$$
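As a small numeric illustration (my own sketch, using the question's value $\rho_{AB}=0.5$ as an example), scanning $\theta$ directly reproduces these extremes:

```r
# Scan the product cos(theta) * cos(theta - alpha) over theta and compare
# the observed extremes with (1 + rho_AB)/2 and (rho_AB - 1)/2.
rho.AB <- 0.5
alpha  <- acos(rho.AB)
theta  <- seq(0, 2 * pi, length.out = 100001)
product <- cos(theta) * cos(theta - alpha)

c(max = max(product), predicted = (1 + rho.AB) / 2)  # both 0.75
c(min = min(product), predicted = (rho.AB - 1) / 2)  # both -0.25
```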

Furthermore, observation $(1)$ shows that whenever this product can attain a given value, it can also attain any fraction of that value down to zero, simply by pulling $C$ sufficiently far out of the $AB$ plane. Thus, every number in the unit-length interval $[(-1+\rho_{AB})/2, (1+\rho_{AB})/2]$ is a possible value of the product $\rho_{AC}\rho_{BC}.$

whuber
  • Thank you for your great comment, but I have one clarifying question. I've heard many times that "correlations are cosines between the vectors," but strictly speaking, aren't correlations something other than cosines? Isn't that cosine similarity, which is similar but where the denominator is the length of the vector instead of the standard deviation? – user2551700 Mar 02 '22 at 01:02
  • @user2551700 Correlations are cosines. One definition of the cosine is identical to a standard definition of the correlation. In your cosine similarity formula, the vectors merely have been standardized to unit length rather than unit SD, but that factor cancels in numerator and denominator. – whuber Mar 02 '22 at 14:57
0

It may be relevant to consider that $r^2_{AB} + r^2_{BC} > 1 \implies r^2_{AC} > 0$, while Young's inequality gives us $2 r_{AB} r_{BC} \leq r^2_{AB} + r^2_{BC}$.

Thus, if $2 r_{AB} r_{BC} > 1$, then $r^2_{AB} + r^2_{BC} > 1$ and hence $r^2_{AC} > 0$. The same conclusion holds with squares: if $2 r^2_{AB} r^2_{BC} > 1$, then $r^2_{AB} + r^2_{BC} \geq 2 |r_{AB} r_{BC}| > \sqrt{2} > 1$, so again $r^2_{AC} > 0$.

If you have $r_{AC} = 0$ (and hence $r^2_{AC} = 0$), then you know that $\lnot (2 r^2_{AB} r^2_{BC} > 1)$. This is equivalent to $2 r^2_{AB} r^2_{BC} \leq 1$, which rearranges to $-\frac{1}{\sqrt{2}} \leq r_{AB} r_{BC} \leq \frac{1}{\sqrt{2}}$.

Result $$r_{AC} = 0 \implies -\frac{1}{\sqrt{2}} \leq r_{AB} r_{BC} \leq \frac{1}{\sqrt{2}}$$
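A quick brute-force check of this result (my own sketch, not part of the answer): sample pairs $(r_{AB}, r_{BC})$ for which the correlation matrix with $r_{AC}=0$ is positive semidefinite, and confirm that the product stays within the stated bounds.

```r
# With r_AC = 0, the 3x3 correlation matrix is positive semidefinite
# exactly when r_AB^2 + r_BC^2 <= 1.  Sample many such pairs and check
# that |r_AB * r_BC| never exceeds 1/sqrt(2).
set.seed(1)
r.AB <- runif(1e5, -1, 1)
r.BC <- runif(1e5, -1, 1)
ok   <- r.AB^2 + r.BC^2 <= 1

max(abs(r.AB[ok] * r.BC[ok]))  # about 0.5, comfortably within the bound
1 / sqrt(2)                    # stated bound, about 0.707
```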

This might be a good tool for certain theoretical considerations, but when estimating correlations from data it is very difficult to distinguish a small correlation from one that is actually zero.

Note: In my own work I tend to use the additive identity that I started with above. When $r^2_{AB} + r^2_{BC} > 1$ I often say and write that the correlation between $A$ and $C$ is 'vouched for', and I often make it part of my analysis of correlation networks to determine which correlations are vouched for in this sense.

Galen
  • This is a very special case of the question that was asked. Your general solution, $r^2_{AC}\gt 0,$ is essentially useless, because it is always the case that $r^2$ is non-negative. – whuber Feb 18 '22 at 18:53