
Suppose $X$ and $Y$ are two $a \times b$ matrices whose entries are independently sampled from the same normal distribution. I noticed an interesting phenomenon:

  • If we accumulate $X X^T$ over many draws, resampling $X$ each time, the result $S$ will look like a (scaled) identity matrix:

Code (NumPy)

import numpy as np

S = np.zeros((10, 10))  # accumulator; X @ X.T is 10x10
for _ in range(1000):
    X = np.random.normal(loc=0.0, scale=1.0, size=(10, 5))
    A = X @ X.T
    S += A

[image: the accumulated $S$ is close to a scaled identity matrix]

  • If we accumulate $X Y^T$ over many draws, resampling both $X$ and $Y$ each time, the result $S$ will not look like an identity matrix:

Code (NumPy)

import numpy as np

S = np.zeros((10, 10))  # accumulator; X @ Y.T is 10x10
for _ in range(1000):
    X = np.random.normal(loc=0.0, scale=1.0, size=(10, 5))
    Y = np.random.normal(loc=0.0, scale=1.0, size=(10, 5))
    A = X @ Y.T
    S += A

[image: the accumulated $S$ shows no identity-like structure; entries stay near zero]

1 Answer


I am assuming you are asking for an explanation of this phenomenon. It follows from the Law of Large Numbers. Note that for the $\ell$-th draw, $$ \left(XX^T\right)_{ij}^{(\ell)} = \sum_{k = 1}^b x_{ik}^{(\ell)}x_{jk}^{(\ell)}, $$ and thus $E\left[\left(XX^T\right)_{ij}^{(\ell)}\right] = \sum_{k = 1}^b E\left[x_{ik}^{(\ell)}x_{jk}^{(\ell)}\right]$. By the (Weak) Law of Large Numbers, $$ \frac{1}{n}\sum_{\ell = 1}^n \left(XX^T\right)_{ij}^{(\ell)} \overset{p}{\to} \sum_{k = 1}^b E\left[x_{ik}^{(\ell)}x_{jk}^{(\ell)}\right]. $$ Given your sampling (i.i.d. $N(0, \sigma^2)$ entries), $E\left[x_{ik}^{(\ell)}x_{jk}^{(\ell)}\right] = \sigma^2$ if $i = j$ and $0$ otherwise. Thus the averaged matrix sum converges to $b\sigma^2 I$. This explains why in your simulation the diagonal elements are close to $5$ per draw (since your $b = 5$ and $\sigma^2 = 1$); the raw sum $S$ correspondingly concentrates around $1000 \cdot 5 \cdot I$.
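
A quick numerical check of the claim above (a minimal sketch assuming the question's shapes $a = 10$, $b = 5$, $\sigma = 1$; the seed and variable names are mine):

import numpy as np

rng = np.random.default_rng(0)     # fixed seed, only for reproducibility
n, a, b, sigma = 1000, 10, 5, 1.0  # draw count and shapes from the question

avg = np.zeros((a, a))
for _ in range(n):
    X = rng.normal(loc=0.0, scale=sigma, size=(a, b))
    avg += (X @ X.T) / n           # running average over the n draws

# LLN prediction: avg is close to b * sigma^2 * I (here 5 * I)
print(np.round(avg, 2))
print(np.abs(avg - b * sigma**2 * np.eye(a)).max())  # small, shrinks as n grows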

From the above, it should be clear why the latter case does not converge to a scaled identity matrix. Since $X$ and $Y$ are independent, $E\left[x_{ik}^{(\ell)}y_{jk}^{(\ell)}\right] = E\left[x_{ik}^{(\ell)}\right]E\left[y_{jk}^{(\ell)}\right] = 0$ for every $i, j$, so the average converges to the zero matrix, for the same reason the off-diagonal elements of the averaged $\left(XX^T\right)^{(\ell)}$ converge to zero. I hope this helps!
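
The same kind of sketch, under the same assumptions, for the independent-$Y$ case:

import numpy as np

rng = np.random.default_rng(0)
n, a, b = 1000, 10, 5

avg = np.zeros((a, a))
for _ in range(n):
    X = rng.normal(size=(a, b))    # standard normal entries
    Y = rng.normal(size=(a, b))    # drawn independently of X
    avg += (X @ Y.T) / n           # every entry has expectation 0

print(np.abs(avg).max())  # all entries near 0 (typical size ~ sqrt(b/n))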