Can anyone explain the proportion of variance explained in PCA and why it is important in a PCA analysis?
https://stats.stackexchange.com/questions/22569/pca-and-proportion-of-variance-explained?rq=1 – David Kozak Jul 21 '17 at 06:06
1 Answer
For multivariate observations of potentially correlated data in, say, $n$ dimensions, the principal components provide up to $n$ mutually orthogonal (uncorrelated) variables.
The first principal component lies in the direction of the largest spread, or variance, and each subsequent component has the largest variance among the directions orthogonal to the earlier ones. The sum of the variances of the $n$ components is the total variance. The first $r$ principal components capture the most variance that any $r$ orthogonal unit-length directions can capture. The proportion of variance explained by the first $r$ principal components is just the sum of the variances of those $r$ components divided by the total variance of all $n$ principal components. This is important because a small number of principal components may explain a large share of the total variance (say 80%), which allows dimensionality reduction to these $r$ principal components, which are linear combinations of the original variables.
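As a rough illustration of the ratio described above: if $\lambda_1 \ge \dots \ge \lambda_n$ are the eigenvalues of the sample covariance matrix (these are the variances of the principal components), the proportion explained by the first $r$ components is $\sum_{i=1}^{r} \lambda_i / \sum_{i=1}^{n} \lambda_i$. The sketch below computes this with NumPy. The function name `explained_variance_ratio`, the toy data, and the 80% threshold are assumptions made for the example, not part of the original answer.

```python
import numpy as np

def explained_variance_ratio(X):
    """Proportion of total variance carried by each principal component.

    X is assumed to have one observation per row and one variable per column.
    """
    # Sample covariance matrix of the variables (np.cov centers the data itself).
    cov = np.cov(X, rowvar=False)
    # Its eigenvalues are the variances of the principal components.
    # eigvalsh returns them in ascending order, so reverse to get descending.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    return eigvals / eigvals.sum()

# Hypothetical toy data: 3 correlated variables driven mostly by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X = latent @ np.array([[2.0, 1.0, 0.5]]) + 0.1 * rng.normal(size=(500, 3))

ratios = explained_variance_ratio(X)
cumulative = np.cumsum(ratios)

# Smallest r such that the first r components explain at least 80% of the variance.
r = int(np.searchsorted(cumulative, 0.80) + 1)
print(ratios, cumulative, r)
```

Because the toy data is built from a single latent factor plus small noise, nearly all of the variance falls on the first component, which is the situation where reducing to a few components loses little information.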
This is also explained in a number of questions posed on this site including the one linked by David Kozak.