8

For any given covariance matrix, will the sum of the diagonal elements always be bigger than the sum of the off-diagonal elements?

Let $\sigma_i$ be the standard deviation of the $i^\text{th}$ variable of an $n\times n$ covariance matrix and $\rho_{ij}$ the correlation between the $i^\text{th}$ and $j^\text{th}$ variables. Is the following statement always true? $$ \sum_{i=1}^n\sigma_i^2 \ge 2\sum_{i<j}\rho_{ij}\sigma_i\sigma_j $$

Richard Hardy
  • 67,272
krenova
  • 175
  • 7
  • 5
    As Michael showed, no. There is somewhat of a converse, though: check out the notion of diagonal dominance: https://en.wikipedia.org/wiki/Diagonally_dominant_matrix – John Madden May 06 '23 at 18:01

3 Answers

16

Consider the general equi-correlation covariance matrix: \begin{align} \Sigma = \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{n \times n}. \tag{1} \end{align} The sum of all the diagonal elements is $S_1 = n$, while the sum of all the off-diagonal elements is $S_2 = \rho(n^2 - n)$. Hence $S_2 > S_1$ if and only if $\rho(n - 1) > 1$, i.e., $n > 1 + 1/\rho$. In particular, for any fixed $\rho \in (0, 1]$, the opposite inequality $S_2 > S_1$ holds for all sufficiently large $n$.
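A quick numerical sanity check (a sketch in NumPy; the values $n = 10$ and $\rho = 0.5$ are arbitrary choices satisfying $n > 1 + 1/\rho$):

import numpy as np

n, rho = 10, 0.5
Sigma = rho * np.ones((n, n)) + (1 - rho) * np.eye(n)  # equi-correlation matrix (1)
S1 = np.trace(Sigma)     # sum of diagonal elements: n = 10
S2 = Sigma.sum() - S1    # sum of off-diagonal elements: rho * (n^2 - n) = 45
print(S1, S2, S2 > S1)   # 10.0 45.0 True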


Note that $\Sigma$ in $(1)$ is positive semi-definite (PSD) for $\rho \in (0, 1]$. A classical proof of this goes as follows.

It is straightforward to verify that $\Sigma$ can be rewritten as $\Sigma = \rho ee' + (1 - \rho)I_n$, where $e$ is the $n$-dimensional column vector of all ones. As the eigenvalues of the rank-$1$ matrix $ee'$ are $\{n, 0, \ldots, 0\}$, the eigenvalues of $\rho ee' + (1 - \rho)I_n$ are \begin{align} n\rho + (1 - \rho) = 1 + (n - 1)\rho, \quad 1 - \rho, \quad \ldots, \quad 1 - \rho, \end{align} which are all nonnegative provided $\rho \in [-(n - 1)^{-1}, 1]$. This shows that $\Sigma$ is PSD for $\rho \in (0, 1]$, hence a valid covariance matrix.
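As a numerical cross-check of the eigenvalue computation (a sketch in NumPy; $n = 5$ and $\rho = 0.3$ are arbitrary example values):

import numpy as np

n, rho = 5, 0.3
e = np.ones((n, 1))
Sigma = rho * (e @ e.T) + (1 - rho) * np.eye(n)
print(np.linalg.eigvalsh(Sigma))
# [0.7 0.7 0.7 0.7 2.2], i.e. 1 - rho (multiplicity n - 1) and 1 + (n - 1) * rho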

Zhanxiong
  • 18,524
  • 1
  • 40
  • 73
  • It's very obvious now. Thank you. To make the argument tight, can I say that all that is left to prove is that such a matrix is always positive semi-definite and therefore a valid covariance matrix? Or is there a simpler and more obvious argument? – krenova May 06 '23 at 19:28
7

No. Highly correlated variables will violate this rule (note that the correlation matrix computed below is itself a valid covariance matrix, namely that of the standardized variables):

x <- seq(0, 1, len = 100)                   # a grid on [0, 1]
X <- data.frame(x = x, x2 = x^2, x3 = x^3)  # three highly correlated variables
X_cor <- cor(X)                             # correlation matrix of X
sum(X_cor[col(X_cor) != row(X_cor)])        # sum of off-diagonal elements: 5.73822
sum(diag(X_cor))                            # sum of diagonal elements: 3
Michael M
  • 11,815
  • 5
  • 33
  • 50
  • 1
    There's an especially simple example with perfect correlation: let $Z \sim N(0,1)$ and consider the covariance of $(Z,Z,Z)$. – Silverfish May 08 '23 at 20:25
  • @Silverfish, your example will result in an invalid covariance matrix. – krenova May 15 '23 at 04:08
  • @krenova I might be missing something, but why is it invalid to take the covariance of two (or more) perfectly correlated variables? – Silverfish May 15 '23 at 18:23
  • @Silverfish, not that it is invalid, but I felt that it might mislead some with respect to my question. Therefore, I thought to point this out. Although I did not specify, my question required an invertible covariance matrix. Taking the covariance of (Z,Z,Z) leads to a covariance matrix which is not invertible. I might have mistaken your intentions though :) – krenova May 16 '23 at 10:00
3

You have been given two good answers. I thought it might be instructive to come at this from a different angle and suggest how one might discover for oneself that the statement is false, by finding a counterexample.

It can often be useful to run simulations (in, say, R or Python) to test our understanding of things, or, as in this case, to look for counterexamples. The Python code below took only minutes to write and produced a counterexample almost immediately.

import numpy as np

# Draw random integer data sets until one yields a covariance matrix whose
# off-diagonal sum exceeds its diagonal sum.
for _ in range(1000):
    data = np.random.randint(-100, 100, (3, 3))  # 3 variables, 3 observations each
    cov = np.cov(data)                           # rows of `data` are treated as variables
    sum_diag = np.diag(cov).sum()
    sum_all_elements = cov.sum()
    sum_off_diag = sum_all_elements - sum_diag
    if sum_off_diag > sum_diag:
        print('Data:', data, "\nCovariance matrix: ", cov,
              "\nSum diag:", sum_diag, "\nSum off diag:", sum_off_diag)
        break

Having counterexamples means you can focus your attention in the right place. With a few of them in hand, you might then have observed that highly correlated variables tend to violate the inequality, as pointed out in Michael M's answer.