
I am trying to understand how linear discriminant analysis (LDA) is related to principal component analysis (PCA) and the k-means clustering method. As an example, here is a comparison between PCA and k-means:

[Image: a screenshot showing two formulas, one for k-means and one for PCA]

My question is: how is LDA related to PCA and k-means?

rdorlearn
  • Where is the snapshot taken from? By the way, it does not look like much of a comparison between k-means and PCA to me; it shows two different formulas, yes, but how do they compare? – amoeba Feb 06 '15 at 17:55
  • See also: http://stats.stackexchange.com/questions/23353/pca-lda-cca-and-pls, http://stats.stackexchange.com/a/87509/4598 – cbeleites unhappy with SX Feb 07 '15 at 14:41

2 Answers


I'm by no means an expert on the topic, but it seems that k-means clustering can be viewed as a dimensionality reduction technique, of which LDA and PCA are direct examples. Clustering via k-means uncovers latent structure in the data, which essentially amounts to a form of dimensionality reduction: each point is summarized by its cluster assignment. I'm sure other people will provide more advanced answers to this question.
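As a rough illustration of that idea, here is a minimal sketch (assuming scikit-learn and NumPy are available; the two-cluster toy data are made up for illustration). On well-separated data, the sign of the first principal component recovers the k-means assignment almost exactly, which is the kind of connection Ding and He (2004) make precise:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Toy data: two well-separated Gaussian blobs in 2-D
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-5, 1, size=(50, 2)),
               rng.normal(+5, 1, size=(50, 2))])

# Cluster with k-means (k = 2)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Project onto the first principal component and split by its sign
pc1 = PCA(n_components=1).fit_transform(X).ravel()
side = (pc1 > 0).astype(int)

# Agreement between the PC1 split and the k-means labels,
# up to an arbitrary label permutation
agreement = max(np.mean(side == labels), np.mean(side != labels))
print(agreement)  # close to 1.0 for well-separated clusters
```

This is only a toy demonstration, not the formal result: the paper relates the continuous relaxation of the k-means cluster-indicator vector to the principal components.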

Additionally, I would like to share two references that are relevant to the question and, IMHO, rather comprehensive. The first is a highly cited paper by Ding and He (2004) on the relationship between k-means and PCA. The second is a paper by Martinez and Kak (2001) presenting a comparison between PCA and LDA.

References

Ding, C., & He, X. (2004, July). K-means clustering via principal component analysis. In Proceedings of the Twenty-First International Conference on Machine Learning (p. 29). ACM.

Martínez, A. M., & Kak, A. C. (2001). PCA versus LDA. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2), 228-233.

  • See stackoverflow.com/a/29731291/2056067 for an attempt to summarize Ding & He. :-) – A. Donda May 30 '15 at 16:25
  • @A.Donda: Thank you for the link. :-) Both answers are very nice (+1) and I will re-read them when I have a bit more time. However, I think that particular question belongs on either Cross Validated or the Data Science SE site and should therefore be migrated. – Aleksandr Blekh May 30 '15 at 18:29
  • Maybe so, the line is not very clear for programming-related statistics questions (or vice versa). You can flag the question for moderator attention. DS SE is still in beta though. :) – A. Donda May 30 '15 at 23:44

LDA and PCA are simply used in different circumstances.

Clustering and PCA are unsupervised methods; LDA is supervised. There are good explanations elsewhere of the difference between unsupervised and supervised methods.
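That supervised/unsupervised distinction shows up directly in how the methods are fit. A minimal sketch, assuming scikit-learn and NumPy are installed (the two-class toy data are made up): PCA is fit on X alone, while LDA also requires the class labels y.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy data: two classes of 40 samples each, 3 features
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, size=(40, 3)),
               rng.normal(3, 1, size=(40, 3))])
y = np.array([0] * 40 + [1] * 40)

# Unsupervised: PCA never sees the labels
X_pca = PCA(n_components=1).fit_transform(X)

# Supervised: LDA needs y, and gives at most (n_classes - 1) components
X_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)

print(X_pca.shape, X_lda.shape)  # both (80, 1)
```

Note the structural consequence of supervision: with two classes, LDA can produce at most one discriminant direction, whereas PCA can return up to min(n_samples, n_features) components regardless of any labels.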

The main aim of clustering, as part of EDA, is grouping; or, if the outliers form a Gaussian mixture, it can reveal novelty in a dataset (e.g. compared with another dataset). You can then compute responses for these generalized groups. Whether such grouping leads to dimensionality reduction, however, depends on the data themselves.

PCA (also part of EDA) aims directly at dimensionality reduction when you have too many features: it extracts the principal components that account for most of the variance in the dataset. It is also used for outlier removal in the pre-modelling stage (i.e. before modelling dependencies). N.B. PCA can perform poorly when the data are strongly non-Gaussian.
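The dimensionality-reduction use of PCA can be sketched as follows (a toy example assuming scikit-learn and NumPy are installed; the data are synthetic, built so that 10 observed features really carry only 2 latent directions of variation):

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 200 samples, 10 features, but the real variation
# lives in 2 latent directions plus a little noise
rng = np.random.default_rng(2)
latent = rng.normal(size=(200, 2))
W = rng.normal(size=(2, 10))
X = latent @ W + 0.05 * rng.normal(size=(200, 10))

# Fit PCA with all components and inspect the cumulative variance
pca = PCA().fit(X)
cum = np.cumsum(pca.explained_variance_ratio_)
print(cum[:3])  # the first two components capture nearly all variance
```

In practice one would keep only the components up to the desired cumulative variance (e.g. `PCA(n_components=0.95)` in scikit-learn keeps enough components to explain 95% of it).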

JeeyCi