I have read the responses to this question here, here, and here, but I am still confused about the application of loadings versus eigenvectors.
In particular, I am stuck on this statement (from the first link):

> It is loadings which "restore" the original covariance/correlation matrix (see also this thread discussing nuances of PCA and FA in that respect).
I am performing PCA on a historical time series of interest rates. The steps I followed are listed below (a minimal numpy sketch of them follows the list):
- Given a data series $\bf{X}$, compute a standardized data series $\bf{X'}$ (centered and scaled to unit variance; the scaling is what makes the correlation-matrix decomposition below and the $\sigma(\bf{X})$ term in the reconstruction consistent).
- Compute the correlation matrix $\bf{C}$ of the standardized data.
- Perform an eigendecomposition of the correlation matrix to obtain the eigenvectors $\bf{v}$ and eigenvalues $\bf{\lambda}$.
- The principal components (also known as PC scores?) are then calculated as $\bf{P=X' \cdot v}$. The first three components have high explanatory power (explaining > 95% of the variance).
- The first three eigenvectors $\bf{v'}$ and components $\bf{P'}$ are retained and the reconstructed data $\bf{R}$ is computed as $$ \bf{R = [P' \cdot v'^{\top}] \cdot \sigma(X) + \mu(X)}, $$ where the scaling by $\sigma(\bf{X})$ and the shift by $\mu(\bf{X})$ are applied column-wise.
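For concreteness, here is a minimal numpy sketch of the steps above; the panel `X` is a hypothetical stand-in (a toy random walk), not my actual rates data:

```python
import numpy as np

# Hypothetical stand-in for the rates panel: T observations of n rates.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8)).cumsum(axis=0)  # toy random-walk "rates", shape (T, n)

# Steps 1-2: standardize, so the covariance of X_std is the correlation matrix C of X.
mu, sigma = X.mean(axis=0), X.std(axis=0)
X_std = (X - mu) / sigma
C = np.corrcoef(X, rowvar=False)

# Step 3: eigendecomposition. eigh returns eigenvalues ascending; flip to descending.
lam, v = np.linalg.eigh(C)
lam, v = lam[::-1], v[:, ::-1]

# Step 4: PC scores.
P = X_std @ v

# Step 5: retain k components and map back to the original units.
k = 3
R = (P[:, :k] @ v[:, :k].T) * sigma + mu
print(np.abs(R - X).max())  # small when k components capture most of the variance
```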
The reconstructed data $\bf{R}$ from the final step is exactly what I would have expected.
I have not used the loadings $\bf{L}$ (defined as $\bf{L} = \bf{v} \cdot \sqrt{\bf{\lambda}}$) anywhere in this procedure, so I'm still struggling to understand their significance, given that I am able to reduce and reproduce the original data without them.
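As far as I can tell, the loadings and the quoted "restore the correlation matrix" property amount to the following (continuing the sketch above; the truncated check in the last line is my own assumption about what a rank-3 approximation should satisfy):

```python
# Loadings: each eigenvector scaled by the square root of its eigenvalue.
L = v * np.sqrt(lam)                # column i is v[:, i] * sqrt(lam[i])

# Full loading matrix reproduces C exactly, since C = v diag(lam) v^T = L L^T.
print(np.allclose(L @ L.T, C))      # True

# Keeping only the first three columns gives a rank-3 approximation of C.
print(np.abs(L[:, :3] @ L[:, :3].T - C).max())
```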
Some observations on the calculations (numerical checks follow the list):
- The eigenvectors $\bf{v}$ are unit scaled: $\sum_i v_i^2 = 1$ for each eigenvector.
- The mean of each principal component is zero: $\mu(\bf{P}) = 0$.
- The standard deviation of each principal component is the square root of its eigenvalue: $\sigma(\bf{P}) = \sqrt{\lambda}$.
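Each of these can be verified numerically (again continuing the sketch above):

```python
print(np.allclose((v ** 2).sum(axis=0), 1.0))    # each eigenvector has unit norm
print(np.allclose(P.mean(axis=0), 0.0))          # each score column is centered
print(np.allclose(P.std(axis=0), np.sqrt(lam)))  # sigma of score i equals sqrt(lam_i)
```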
So perhaps I am getting confused because of the last observation above, namely that the standard deviation of the principal components is not equal to $1.0$, but is rather the square root of the eigenvalue.
But it is not clear to me how one would obtain a standardized (unit-variance) principal component (presumably by dividing each column of $\bf{P}$ by $\sqrt{\lambda}$?). I'd appreciate it if someone could explain where one would use loadings instead of eigenvectors.