8

Continuing this question, I want to ask whether it is possible for a multivariate distribution to have all of its random variables simultaneously negatively (reverse) correlated with one another. I think it is impossible, because when one pair of variables is negatively correlated, a third variable will be positively correlated with one of the two and negatively correlated with the other, so there will always be a pair with a positive correlation. This question stems from a university project in which the professor asks us to randomly create anti-correlated data on an $n$-dimensional grid. This is impossible if $n > 2$.

  • 1
    Welcome to Cross Validated! I see an easy bivariate example. Do you require at least three margins? – Dave Nov 16 '23 at 21:45
  • 5
    What's wrong with $$\mathcal{N} \left( \vec 0, \begin{bmatrix} 1 & -\frac{1}{10} & -\frac{1}{10} \\ -\frac{1}{10} & 1 & -\frac{1}{10} \\ -\frac{1}{10} & -\frac{1}{10} & 1 \end{bmatrix} \right)?$$ – Galen Nov 16 '23 at 21:49
  • import numpy as np; np.random.multivariate_normal([0,0,0], [[1,-0.1, -0.1],[-0.1,1,-0.1],[-0.1, -0.1, 1]]) – Galen Nov 16 '23 at 21:50
  • @Galen I see that you have specified a 3×3 variance-covariance matrix with negative values on the covariance terms, but that does not clearly answer tasos_koukos' question of whether, if $\text{cor}(x_1,x_2)<0$ and $\text{cor}(x_1,x_3)<0$, it follows that $\text{cor}(x_2,x_3)>0$. – Alexis Nov 16 '23 at 21:56
  • 1
    @Alexis Shouldn't all of those be "less than"? – Dave Nov 16 '23 at 21:58
  • @Dave I was trying to speak directly to tasos_koukos' thought that "when a pair of distribution is reverse correlated the third distribution will be correlated with one of the two and reverse correlated with the other." – Alexis Nov 16 '23 at 22:10
  • @Alexis I follow it now. Thanks! – Dave Nov 16 '23 at 22:11
  • 4
    @Alexis how is Galen's comment not right? Doesn't his example show a counterexample? – Sextus Empiricus Nov 16 '23 at 22:13
  • 1
    @SextusEmpiricus It presents a mathematical artifact which does not sufficiently explain, not to me anyway. The 'rightness' of Galen's math was not in question. See also my comment to Dave. – Alexis Nov 16 '23 at 22:17
  • 3
    To support @Galen's counterexample: in this answer, it has been shown that for any dimension $n$, such an equi-correlation matrix is a valid covariance matrix for all $\rho > -(n - 1)^{-1}$; pick any negative value in this range. – Zhanxiong Nov 16 '23 at 22:17
  • 2
    I answered this question at https://stats.stackexchange.com/a/72795/919 by providing an example of multivariate distributions (of any dimension) with negative correlations among all pairs of variables. – whuber Nov 17 '23 at 02:16
  • 2
    Just to add to the excellent answers below: we must have $\rho\ge-\tfrac{1}{n-1}$, see eg https://statisticaloddsandends.wordpress.com/2022/09/23/lower-bound-of-the-correlation-parameter-for-an-equicorrelation-matrix/ – Daniel Robert-Nicoud Nov 17 '23 at 11:54
  • @DanielRobert-Nicoud That is precisely what I demonstrated in the thread I linked to. But I went beyond that by exhibiting distributions that attain this lower bound -- a logical step that is often missing. – whuber Nov 17 '23 at 15:07

4 Answers

14

If $X \mathrel{:=} \left(X_1, \ldots, X_k\right)^\top \sim \mathop{\mathrm{Multinomial}}\left(n, \left(p_1, \ldots, p_k\right)^\top\right)$ with $n \in \mathbb N_{\geq 1}, k \in \mathbb N_{\geq 2},$ we have $\mathrm{Cov}(X_i, X_j) = - np_ip_j$ for all $i,j \in \{1, \ldots , k\} :i \neq j$.

Thus, if $p_i \in (0,1)$ for all $i \in \{1, \ldots , k\}$, then all components of $X$ are negatively correlated.
This is not particularly surprising as an increase in the value of one component of $X$ must result in a decrease in the value of another component for fixed $n$.
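
As a quick numerical check of this claim (not part of the original answer; the sample size, $n = 20$ trials, $k = 4$ cells, and equal probabilities are arbitrary choices), one can simulate multinomial draws in R and inspect the sample correlation matrix:

set.seed(1)
# 10000 multinomial draws with n = 20 trials and k = 4 equally likely cells, one draw per row
X <- t(rmultinom(10000, size = 20, prob = rep(1/4, 4)))
cor(X)   # all off-diagonal sample correlations are negative (about -1/3 here)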

statmerkur
  • 3
    This is a natural, concise and classical example! +1 – Zhanxiong Nov 17 '23 at 05:27
  • 1
    Indeed, you can consider the case $n=1$. Let $A_1, A_2, \dots, A_k$ be any collection of disjoint events each with positive probability. Then the indicator functions $I(A_1), \dots, I(A_k)$ are pairwise negatively correlated. – James Martin Nov 17 '23 at 15:41
13

Mathematically, as I commented, you can use the equi-correlation matrix

\begin{align*} \Sigma = \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix} = (1 - \rho)I_{(n)} + \rho ee^\top \tag{1}\label{1} \end{align*} with $\rho \in (-(n - 1)^{-1}, 0)$ as a valid example. A proof that $\Sigma$ is positive-definite under this condition, and hence a valid covariance matrix, is contained in this answer. Here $e$ stands for an $n$-long column vector whose entries are all ones.
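
As a quick illustration (not part of the original answer; $n = 5$ is an arbitrary choice), the smallest eigenvalue of $\Sigma$ is positive for $\rho$ just above $-(n - 1)^{-1}$ and negative just below it:

n <- 5
Sigma <- function(rho) (1 - rho) * diag(n) + rho * matrix(1, n, n)   # equi-correlation matrix (1)
min(eigen(Sigma(-1/(n - 1) + 0.01))$values)   # positive: a valid covariance matrix
min(eigen(Sigma(-1/(n - 1) - 0.01))$values)   # negative: not a covariance matrix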

Statistically, a natural follow-up question is: for what specific random vector $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ is the correlation matrix exactly the $\Sigma$ in $\eqref{1}$? A quick (and good) answer is to draw $\mathbf{X}$ directly from an $n$-dimensional multivariate normal distribution with covariance matrix $\eqref{1}$. This answer tries to give the $X_i$s a more explicit and granular construction.

When $\rho \geq 0$, it is easy to verify that \begin{align*} X_i = \sqrt{\rho}Z + \sqrt{1 - \rho}Y_i, \quad i = 1, 2, \ldots, n, \tag{2}\label{2} \end{align*} where $Z, Y_1, \ldots, Y_n \text{ i.i.d.} \sim N(0, 1)$, satisfies $\operatorname{Corr}(X_i, X_j) = \rho$, $1 \leq i \neq j \leq n$. In mathematical finance, $\eqref{2}$ is often used to model the default risk of $n$ companies, where $Z$ is interpreted as a systemic factor and $Y_1, \ldots, Y_n$ as idiosyncratic factors.
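
A minimal simulation sketch of $\eqref{2}$ (not part of the original answer; $\rho = 0.3$, $n = 4$, and the sample size are arbitrary choices):

set.seed(1)
rho <- 0.3; n <- 4; m <- 1e5
Z <- rnorm(m)
Y <- matrix(rnorm(m * n), m, n)
X <- sqrt(rho) * Z + sqrt(1 - rho) * Y   # Z is recycled across the n columns
cor(X)                                   # off-diagonal entries close to rho = 0.3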

When $\rho < 0$, it is relatively more difficult to construct equi-correlated $X_i$s. However, the idea behind $\eqref{2}$ can still be reused: we can again set $X_i$ as a linear combination of $Z$ and $Y_i$, but we must also impose a common correlation between each $Y_i$ and $Z$. Specifically, let $Y_1, \ldots, Y_n \text{ i.i.d. } \sim N(0, 1)$ and $Z \sim N(0, 1)$, but with $\operatorname{Cov}(Y_1, Z) = \cdots = \operatorname{Cov}(Y_n, Z) = c$ for some constant $c \in (-1, 1)$ to be determined by $\rho$, and then set \begin{align*} X_i = Z + Y_i, \quad i = 1, 2, \ldots, n. \tag{3}\label{3} \end{align*} Note that $\eqref{3}$ may be viewed as $n$ draws from a random effects model with $Z$ as the main effect and $Y_i$ as the error. The constant $c$ can be determined by solving, for $1 \leq i \neq j \leq n$, the equation \begin{align*} \rho = \frac{\operatorname{Cov}(X_i, X_j)}{\sqrt{\operatorname{Var}(X_i)\operatorname{Var}(X_j)}} = \frac{1 + 2c}{2 + 2c}, \end{align*} i.e., $c = \frac{2\rho - 1}{2(1 - \rho)}$. Note that as long as $\rho < \frac{3}{4}$, we have $|c| < 1$.

In summary, by choosing $Z, Y_1, \ldots, Y_n$ such that $Y_1, \ldots, Y_n \text{ i.i.d. } \sim N(0, 1)$, $Z \sim N(0, 1)$, $\operatorname{Cov}(Y_i, Z) = \frac{2\rho - 1}{2(1 - \rho)}$ and setting $X_i = Y_i + Z$, $i = 1, \ldots, n$, we can render $\operatorname{Corr}(\mathbf{X}) = \Sigma$ in $\eqref{1}$ for any $\rho < \frac{3}{4}$.

Alert readers may now be curious: since the above construction seems to impose no constraint on $\rho$ (except $\rho < \frac{3}{4}$), where does the aforementioned constraint $\rho > -(n - 1)^{-1}$ come in? This is because the above calculation concerns only pairwise correlations, whereas for an order-$n$ symmetric matrix to be a (non-degenerate) covariance matrix, we must ensure that $\operatorname{Var}(\beta^\top\mathbf{X}) > 0$ for any nonzero vector $\beta$, which results in the $\rho > -(n - 1)^{-1}$ constraint. In other words, simply specifying $\eqref{3}$ without imposing any constraint on $\rho$ might lead to lurking inconsistencies. Therefore, our construction must be placed under the umbrella condition $\rho \in (-(n - 1)^{-1}, 0)$.


Addendum 1

One last legitimate question is: how do we make sure that $Z, Y_1, \ldots, Y_n$ with the structure specified above actually exist?

One obvious construction of $Z, Y_1, \ldots, Y_n$ is: \begin{align*} \begin{bmatrix} Z \\ \mathbf{Y} \end{bmatrix} \sim N_{n + 1}\left(0, \begin{bmatrix} 1 & c e^\top \\ ce & I_{(n)} \end{bmatrix}\right), \end{align*} where $\mathbf{Y} = (Y_1, \ldots, Y_n)^\top$. In order for the matrix $\begin{bmatrix} 1 & c e^\top \\ ce & I_{(n)} \end{bmatrix}$ to be a (non-degenerate) covariance matrix, a necessary condition is that its determinant is positive, which requires \begin{align*} \det\left(\begin{bmatrix} 1 & c e^\top \\ ce & I_{(n)} \end{bmatrix}\right) = \det(I_{(n)}) \times (1 - c^2e^\top e) = 1 - nc^2 > 0, \end{align*} i.e., $|c| < \frac{1}{\sqrt{n}}$ (one can show this is also a sufficient condition). Since $c = \frac{2\rho - 1}{2(1 - \rho)}$ in our setting, this in turn requires $-\frac{1}{\sqrt{n}} < \frac{2\rho - 1}{2(1 - \rho)} < \frac{1}{\sqrt{n}}$, i.e., $\frac{\sqrt{n} - 2}{2\sqrt{n} - 2} < \rho < \frac{\sqrt{n} + 2}{2\sqrt{n} + 2}$. The upper bound does not conflict with the base constraint $\rho > -(n - 1)^{-1}$; the lower bound, however, means that construction $\eqref{3}$ with independent $Y_i$s only attains negative equi-correlations with $\rho > \frac{\sqrt{n} - 2}{2\sqrt{n} - 2}$, a strict subset of the admissible range $(-(n - 1)^{-1}, 0)$ (and empty once $n \geq 4$, since the lower bound is then non-negative).
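
A simulation sketch of this construction (not part of the original answer; $n = 3$ and $\rho = -0.1$ are arbitrary choices satisfying the constraint $|c| < 1/\sqrt{n}$):

library(MASS)
set.seed(1)
n <- 3; rho <- -0.1; m <- 1e5
c0 <- (2 * rho - 1) / (2 * (1 - rho))              # Cov(Y_i, Z); here about -0.545
e <- rep(1, n)
S <- rbind(c(1, c0 * e), cbind(c0 * e, diag(n)))   # covariance matrix of (Z, Y_1, ..., Y_n)
ZY <- MASS::mvrnorm(m, rep(0, n + 1), S)
X <- ZY[, 1] + ZY[, -1]                            # X_i = Z + Y_i, column by column
cor(X)                                             # off-diagonal entries close to rho = -0.1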


Addendum 2

Inspired by statmerkur's excellent multinomial example, the Dirichlet distribution provides another classical multivariate distribution which has all negative pairwise correlations.
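
As a quick check (not part of the original answer; the parameters $(2, 3, 4)$ are arbitrary), one can draw from a Dirichlet distribution via the standard construction of independent Gamma variables divided by their sum, without any extra packages:

set.seed(1)
m <- 1e4
alpha <- c(2, 3, 4)
G <- sapply(alpha, function(a) rgamma(m, shape = a))   # m x 3 matrix of independent Gamma draws
W <- G / rowSums(G)                                    # each row is a Dirichlet(2, 3, 4) draw
cor(W)                                                 # all off-diagonal correlations are negative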

Zhanxiong
6

Let $X, Y$ have a negative covariance/correlation, and define a third variable as a linear combination of those two plus independent noise $\epsilon$:

$$Z = -X -aY + \epsilon$$

Now, this will be inversely correlated with both of them unless the negative covariance between $X$ and $Y$ cancels one of the negative terms.

$$Cov(Z,X) = Cov(-X-aY, X) = - Var(X) - a Cov(Y,X)$$ $$Cov(Z,Y) = Cov(-X-aY, Y) = - a Var(Y) - Cov(Y,X)$$

The first terms on the right-hand side of these equations, $-Var(X)$ and $-a Var(Y)$, are negative (assuming $a > 0$).

It is thus no surprise that a variable like $Z$ is negatively correlated with both $X$ and $Y$ when $Cov(Y,X)$ is small in magnitude: both covariances above are then clearly negative.

Only when $-Cov(Y,X)$ is large might $Z$ end up positively correlated with $X$ and/or $Y$.

Such a positive correlation, with $X$ or with $Y$ respectively, happens when

$$ -a\,Cov(Y,X) > Var(X) \qquad\text{or}\qquad -Cov(Y,X) > a\,Var(Y), $$

or, in terms of the correlation $\rho = Cov(Y,X)/\sqrt{Var(X)Var(Y)}$,

$$-\rho > \frac{1}{a} \sqrt{\frac{Var(X)}{Var(Y)}} \qquad\text{or}\qquad -\rho > a \sqrt{\frac{Var(Y)}{Var(X)}}.$$

Multiplying the two conditions shows that they cannot both hold (that would require $\rho^2 > 1$), so $Z$ can be positively correlated with at most one of $X$ and $Y$.


Code example

set.seed(1)
n = 100

X = rnorm(n)
Y = -0.5 * X + rnorm(n)
Z = -X - 0.5*Y + rnorm(n)

M = cbind(X,Y,Z)
cov(M)

gives

           X          Y          Z
X  0.8067621 -0.4042365 -0.5875672
Y -0.4042365  1.1200783 -0.2134165
Z -0.5875672 -0.2134165  1.7757109
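
To also illustrate the flip case described above (this addition is not from the original answer; $Corr(X, Y) \approx -0.9$ and $a = 0.5$ are arbitrary choices), a strong negative correlation between $X$ and $Y$ can make $Z$ positively correlated with one of them:

set.seed(2)
n <- 10000
X <- rnorm(n)
Y <- -0.9 * X + sqrt(1 - 0.9^2) * rnorm(n)   # Corr(X, Y) about -0.9, Var(Y) about 1
Z <- -X - 0.5 * Y + rnorm(n)                 # here -Cov(Y, X) > a * Var(Y)
cor(cbind(X, Y, Z))                          # Corr(Z, Y) comes out positive, Corr(Z, X) negative
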
3

YES

(I find this fact surprising, too.)

library(MASS)
set.seed(2023)
R <- 1000
# Draw R candidate triples of negative off-diagonal correlations
r1 <- runif(R, -1, 0)
r2 <- runif(R, -1, 0)
r3 <- runif(R, -1, 0)
signs <- rep(NA, R)
for (i in 1:R){
  # Candidate 3x3 correlation matrix with all off-diagonal entries negative
  S <- matrix(
    c(
      1, r1[i], r3[i],
      r1[i], 1, r2[i],
      r3[i], r2[i], 1
    ), 3, 3
  )
  # Record whether the smallest eigenvalue is positive (i.e., S is positive definite)
  signs[i] <- sign(min(eigen(S)$values))
  # print(i)
}
# Keep the first candidate that is a valid correlation matrix
i <- which(signs == 1)[1]
S <- matrix(
  c(
    1, r1[i], r3[i],
    r1[i], 1, r2[i],
    r3[i], r2[i], 1
  ), 3, 3
)
# Simulate from a multivariate normal with that matrix and check the sample correlations
X <- MASS::mvrnorm(1000, rep(0, 3), S)
cor(X)

Here, I generate many candidate correlation matrices with off-diagonal elements less than zero and pick the first one whose smallest eigenvalue is greater than zero, meaning that it is a valid correlation matrix with all margins negatively correlated with each other. I also simulate from a multivariate normal distribution with that matrix as its covariance matrix and show that the empirical correlations are all less than zero.

(This can be seen as a linear algebra problem asking whether a symmetric matrix can have positive diagonal elements, negative off-diagonal elements, and positive eigenvalues. If you can show such a matrix exists (the above code gives one), then you get the desired multivariate distribution as a multivariate normal with that matrix as the covariance matrix.)

Dave
  • 2
    +1 *Very cool!* I share your surprise. :) – Alexis Nov 16 '23 at 21:58
  • 5
    I don't see the surprise and why you need to do these simulations. Isn't the comment by Galen sufficient? Or else just do X = rnorm(n);Y = -0.5 * X + rnorm(n);Z = -X - 0.5*Y + rnorm(n) – Sextus Empiricus Nov 16 '23 at 22:13
  • 1
    A few moments contemplating a regular tetrahedron should remove all sense of surprise, because all examples of this phenomenon concern vectors that approximate the vertices of a tetrahedron (in any finite number of dimensions). For two variables that "tetrahedron" is a zero-centered line segment in $\mathbb R^1;$ for three variables it is a zero-centered equilateral triangle in $\mathbb R^2;$ for four variables it is the usual (zero-centered) Platonic solid in $\mathbb R^3;$ and so on. We're concerned with the angles made between the rays through the vertices: something you can literally see – whuber Nov 21 '23 at 17:18