Please let me know whether the statement below is valid:
Suppose that $X$ is an $n\times p$ data matrix with $p$ features and $n$ data samples. Suppose further that each feature (column) is zero-centered, so that the average of each column is zero. Then, the covariance matrix for $X$ is $\frac1nX^TX$.
It looks like a simple question, but I had difficulty finding a reference for this exact statement. My proof, including the relevant definitions, is as follows.
Let $$ X= \begin{bmatrix} x_{11} &\cdots&x_{1p}\\ \vdots &\ddots&\vdots\\ x_{n1} &\cdots&x_{np}\\ \end{bmatrix} $$
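Let the random vector $(X_1,\dots,X_p)$ be uniformly distributed over the $n$ rows of $X$, so that $(X_1,\dots,X_p)=(x_{i1},\dots,x_{ip})$ with probability $\frac1n$ for each $i=1,2,\cdots,n$. In particular, each $X_j$ is uniformly distributed among the entries of the $j$th column for $j=1,2,\cdots,p$. Then, $\mathbb E[X_j]=\frac1n\sum_{i=1}^nx_{ij}=0$ and the covariance $s_{jk}$ of $X_j$ and $X_k$ is \begin{align*} s_{jk} &=\mathbb E[(X_j-0)(X_k-0)]\\ &=\mathbb E[X_jX_k]\\ &=\frac1n\sum_{i=1}^nx_{ij}x_{ik}. \end{align*}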
Now, denote the covariance matrix of $X$ by $S=[s_{jk}]_{p\times p}$. Then
\begin{align*} S &=[s_{jk}]_{p\times p}\\ &=\left[\frac1n\sum_{i=1}^nx_{ij}x_{ik}\right]_{p\times p}\\ &=\frac1n \begin{bmatrix} \sum_{i=1}^nx_{i1}x_{i1}&\cdots &\sum_{i=1}^nx_{i1}x_{ip}\\ \vdots &\ddots &\vdots\\ \sum_{i=1}^nx_{ip}x_{i1}&\cdots &\sum_{i=1}^nx_{ip}x_{ip}\\ \end{bmatrix}\\ &=\frac1nX^TX \end{align*}
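As a numerical sanity check of the derivation above, here is a minimal sketch in Python with NumPy. The random data, the sizes `n = 6` and `p = 3`, and all variable names are assumptions made only for illustration. It compares $\frac1nX^TX$ with the entrywise formula for $s_{jk}$ and with `np.cov` using the $\frac1n$ normalisation (`bias=True`). Note that `np.cov` divides by $n-1$ by default, which gives the usual unbiased sample covariance rather than the $\frac1n$ version used here.

```python
import numpy as np

# Illustrative data only: n samples (rows), p features (columns).
rng = np.random.default_rng(0)
n, p = 6, 3
raw = rng.normal(size=(n, p))

# Zero-center each column, as the statement assumes.
X = raw - raw.mean(axis=0)

# Matrix form: S = (1/n) X^T X.
S_matrix = X.T @ X / n

# Entrywise form: s_jk = (1/n) * sum_i x_ij * x_ik.
S_entry = np.empty((p, p))
for j in range(p):
    for k in range(p):
        S_entry[j, k] = np.mean(X[:, j] * X[:, k])

# NumPy's covariance with columns as variables and 1/n normalisation.
S_numpy = np.cov(X, rowvar=False, bias=True)

print(np.allclose(S_matrix, S_entry))
print(np.allclose(S_matrix, S_numpy))
```

If the $\frac1{n-1}$ convention is wanted instead, the same argument goes through with `X.T @ X / (n - 1)`, which corresponds to `np.cov(X, rowvar=False)`.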