I have been using PCA for exploratory data analysis for a long time, and I was sure that it is fine if the first principal components explain a high percentage (90% or even more) of the variance in the data. Recently I found information suggesting that it is not good when the first few principal components explain such a high percentage of variance, and that this may be an analysis artefact caused by the dominance of several variables.
Could you please clarify what percentage of variance explained by the first principal components (e.g. the first 2-3) is appropriate in PCA, and what percentage may indicate the presence of dominant variables in the data?
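To make the concern concrete, here is a minimal sketch of what I mean by "dominant variables". It assumes synthetic data that I made up for illustration (three independent variables, one measured on a much larger scale), and `explained_variance_ratio` is a hypothetical helper written with plain NumPy rather than any particular PCA library. It compares the explained variance ratios with and without standardizing the variables first.

```python
import numpy as np


def explained_variance_ratio(X, standardize=False):
    """Share of total variance captured by each principal component.

    Hypothetical helper: PCA via eigenvalues of the sample covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                 # center each variable
    if standardize:
        X = X / X.std(axis=0, ddof=1)      # rescale each variable to unit variance
    cov = np.cov(X, rowvar=False)          # columns are variables
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigenvalues, largest first
    return eigvals / eigvals.sum()


# Assumed toy data: three independent variables, the third on a 100x larger scale.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.normal(scale=1.0, size=n),
    rng.normal(scale=1.0, size=n),
    rng.normal(scale=100.0, size=n),
])

raw = explained_variance_ratio(X)
scaled = explained_variance_ratio(X, standardize=True)

print("unscaled:    ", np.round(raw, 4))
print("standardized:", np.round(scaled, 4))
```

In this toy setup the first component of the unscaled data captures almost all of the variance only because one variable has a much larger scale, while after standardization the variance is spread roughly evenly across components. So a very high percentage on its own does not seem to tell me whether it reflects real structure or just scale.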
... Practical Statistics for Data Scientists, written by Peter Bruce and Andrew Bruce. And I was wondering whether my initial understanding of the percentage of explained variance was wrong? – Denis Feb 25 '21 at 21:48
... loadings, which in turn could introduce bias into the PCA analysis. As a potential way to avoid such situations, the book mentioned above suggests constructing a scree plot and carefully examining the percentage of explained variance. I hope it is now clear what I am trying to clarify. – Denis Feb 26 '21 at 10:24