4

From what I understand, when doing PCA we can work with either raw or standardised data, depending on the situation. Is it true that the average of the eigenvalues equals 1 when we work with standardised data? If so, why?

utobi
  • 11,726
  • 2
    The trace of a matrix is invariant under rotation (that is, conjugation by an orthogonal matrix), QED. – whuber Mar 24 '23 at 19:35

2 Answers

6

Definitely yes. To see this, let's consider the population case. Suppose $X$ is a $p\times 1$ random vector with mean $\mu$ and covariance matrix $\Sigma$, and consider $Y = \text{diag}(\Sigma)^{-1/2}X$. If $\Sigma_Y$ is the covariance matrix of $Y$, then all entries of its main diagonal equal 1.

Now consider the PCA applied to $Y$, i.e. consider the eigendecomposition $\Sigma_Y = \Gamma\Lambda \Gamma^\top$, where $\Lambda = \text{diag}(\lambda_1,\ldots,\lambda_p)$ and $\Gamma$ is orthogonal. The principal components of $Y$ are the components of the vector

$$ Z = \Gamma^\top Y. $$

You can easily check that $\text{cov}(Z) = \Gamma^\top\Sigma_Y\Gamma = \Lambda$, and since the trace is unchanged by this conjugation,

$$p^{-1}\,\text{trace}(\Sigma_Y) = p^{-1}\,\text{trace}(\Lambda) = p^{-1} \sum_{i=1}^p \lambda_i = 1,$$

the first term being 1 because the diagonal of $\Sigma_Y$ consists of $p$ ones, so $\text{trace}(\Sigma_Y) = p$.

Of course, the same property also holds for the sample PCA.
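For instance, here is a minimal R sketch of the sample version (the data matrix below is made up for illustration): standardising the columns makes the sample covariance of the standardised data equal to the correlation matrix, whose eigenvalues average to 1.

    set.seed(1)
    n <- 100; p <- 4
    X <- matrix(rnorm(n * p), n, p) %*% diag(c(1, 2, 5, 10))  # columns on different scales
    Y <- scale(X)                   # standardise: each column has mean 0, variance 1
    lambda <- eigen(cov(Y))$values  # cov(Y) equals cor(X), with ones on its diagonal
    mean(lambda)                    # 1, up to floating-point error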

utobi
  • 11,726
2

Not quite.

The sum of the eigenvalues of any PCA performed on a $P \times P$ correlation matrix ($\textbf{R}$), where $P$ is the number of variables, exactly equals $P$. This is why such eigenvalues are typically interpreted as apportioning the total variance in the data among $P$ different components. In this case the average of the eigenvalues (their sum divided by $P$) exactly equals 1.

Sometimes, however, a PCA is performed on the variance/covariance matrix ($\mathbf{\Sigma}$) instead. This is typically done when $\mathbf{\Sigma} \approx \textbf{R}$: if the approximation is close, you get roughly similar interpretations, and the average of the eigenvalues is close to 1. $\mathbf{\Sigma} = \textbf{R}$ exactly when the data are standardized. Otherwise, PCA on $\mathbf{\Sigma}$ can produce eigenvalues that do not sum to $P$, and thus their average will not equal 1.
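A quick R illustration of this contrast (made-up data with deliberately unequal column variances, so that $\mathbf{\Sigma} \neq \textbf{R}$): the eigenvalues of $\textbf{R}$ sum to $P$, while those of $\mathbf{\Sigma}$ generally do not.

    set.seed(1)
    X <- matrix(rnorm(300), 100, 3) %*% diag(c(1, 5, 10))  # unequal scales

    sum(eigen(cor(X))$values)  # exactly P = 3, so the average is 1
    sum(eigen(cov(X))$values)  # roughly 1 + 25 + 100, nowhere near P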

Properly speaking, PCA is not performed on the data but on either $\mathbf{\Sigma}$ or $\textbf{R}$.

Alexis
  • 29,850
  • 3
Alexis, if, as you say, PCA is performed on a $P \times P$ correlation matrix, where $P$ is the number of variables and also the sum of the eigenvalues, isn't that the same as saying that the average of the eigenvalues is 1? – Paolo Totaro Mar 24 '23 at 18:54
  • 2
I don't fully agree with your answer. PCA is performed on the data; as utobi wrote, PCA is just a rotation of the data, i.e. $X\Gamma$. – yahiro Mar 24 '23 at 19:41
  • @yahiro We disagree then: Can you show me results from a PCA which disagree with the eigendecomposition of $\textbf{R}$ or $\mathbf{\Sigma}$? – Alexis Mar 24 '23 at 19:44
@PaoloTotaro I am pretty sure I said that in my answer: "In this case the average of the eigenvalues (their sum divided by $P$) exactly equals 1." The "Not quite" portion of my answer was specific to the points I was making about PCA with the variance/covariance matrix. – Alexis Mar 24 '23 at 19:47
  • 2
Well, maybe it's just a matter of point of view. The way I see it, PCA is just a (singular value) matrix decomposition of the data matrix $X$, which does not require $\Sigma$ or $R$. But clearly, the two approaches are related: https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca – yahiro Mar 24 '23 at 19:54
  • @yahiro Genuinely interested if you can provide a counterexample in my last comment. SVD does not appear to me to be the way PCA is implemented in statistical packages (e.g., R's princomp, Stata's pca, etc.)… these do not even give an SVD option. (And thank you for that link! The accepted answer has juicy details.) – Alexis Mar 24 '23 at 20:14
  • 3
@Alexis, the princomp help page says: "A preferred method of calculation is to use svd on x, as is done in prcomp." – yahiro Mar 24 '23 at 20:28
  • 1
At first sight, I thought the "Not quite" was referring to the OP's claim. But my first sight was wrong. +1 – utobi Mar 24 '23 at 20:42
  • @yahiro Ah, good catch! And thanks again. :) Would you agree with something like "eigendecomposition of $\textbf{R}$ is isomorphic to singular value decomposition of the data matrix $\textbf{X}$"? – Alexis Mar 24 '23 at 23:01
  • 2
    @Alexis No, for several reasons. First, that's not what "isomorphic" means -- but we get your point. Second, the eigendecomposition is numerically fraught and breaks down with much smaller problems than SVD. Third, the SVD yields more detailed information. – whuber Mar 24 '23 at 23:08
  • @whuber Can you amplify "more detailed information" in the context of PCA? Like, do you mean there is some SVD version of PCA that yields quantities besides eigenvalues and eigenvectors? – Alexis Mar 25 '23 at 04:36
  • 2
@Alexis Yes, because SVD yields two orthogonal matrices $U$ and $V$ in the decomposition $X = USV^\prime$, whereas the eigendecomposition of $X^\prime X$ produces only $S^2$ and $V^\prime$: nothing about $U$ is retained in computing $X^\prime X$. – whuber Mar 25 '23 at 13:03
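A quick numerical check of the correspondence discussed in these comments, in R with a made-up data matrix: after centring and standardising, the squared singular values divided by $n-1$ reproduce the eigenvalues of $\textbf{R}$, while the SVD additionally returns $U$.

    set.seed(1)
    n <- 50; p <- 4
    X  <- matrix(rnorm(n * p), n, p)
    Xs <- scale(X)         # centre and standardise, so t(Xs) %*% Xs / (n - 1) = R
    s  <- svd(Xs)          # Xs = U S V'; s$u is the extra piece eigen() cannot give
    s$d^2 / (n - 1)        # matches ...
    eigen(cor(X))$values   # ... the eigenvalues of R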