
Several methods have been proposed for initializing k-means, but is there any literature (some sort of survey) that lists the merits and demerits of each one?

The most popular one, I guess, is k-means++, but why is it better than the farthest-point method?

  • The author himself has said "I'm not discussing presently which method is "better" and in what circumstance", which is exactly what I am looking for, specifically in the case of farthest point vs. k-means++. – Siddharth Shakya Jun 06 '18 at 17:20

1 Answer


I believe I have seen such surveys.

K-means++ is better than farthest points if you want to do more than a single run. Farthest points tends to produce almost the same initial conditions every time, because only the first center is picked at random and every later center is then fully determined. K-means++ is well randomized, so if you run it 10 times, you have a better chance of getting a good result at least once.
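To make the difference concrete, here is a minimal sketch of both seeding rules in plain Python. The function names and the toy points are my own, not from any library, and it assumes the data has at least k distinct points (otherwise the k-means++ weights would all be zero).

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_sq_dist(p, centers):
    # Squared distance from p to the closest center chosen so far.
    return min(sq_dist(p, c) for c in centers)

def farthest_point_init(points, k, rng):
    # Only the first center is random; each later center is the point
    # farthest from the centers chosen so far, so it is deterministic.
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: nearest_sq_dist(p, centers)))
    return centers

def kmeans_pp_init(points, k, rng):
    # Every center is random: later centers are sampled with probability
    # proportional to the squared distance to the nearest chosen center.
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [nearest_sq_dist(p, centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Hypothetical toy data: two pairs of nearby points on a line.
pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
for seed in range(5):
    print(farthest_point_init(pts, 2, random.Random(seed)),
          kmeans_pp_init(pts, 2, random.Random(seed)))
```

On this toy data the second farthest-point center can only ever be one of the two extreme points, whatever the seed, while k-means++ can pick any point that is not already a center.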

Among the best initializations are those that sample a subset of the data and cluster that subset first, such as the method of Bradley and Fayyad.
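A rough sketch of the sample-then-cluster idea, again in plain Python. This is not the Bradley and Fayyad procedure itself, only the general idea in its simplest form; `lloyd`, `subsample_init`, the sample size, and the toy data are all my own assumptions.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def lloyd(points, centers, iters=10):
    # Plain Lloyd iterations: assign each point to its nearest center,
    # then move each center to the mean of its assigned points.
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # An empty group keeps its old center.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def subsample_init(points, k, sample_size, rng):
    # Cluster a small random subset; its centroids become the
    # initial centers for k-means on the full data.
    sample = rng.sample(points, sample_size)
    seeds = rng.sample(sample, k)
    return lloyd(sample, seeds)

# Hypothetical toy data: three Gaussian blobs in the plane.
rng = random.Random(0)
data = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in [(0, 0), (8, 0), (4, 7)] for _ in range(100)]

init = subsample_init(data, k=3, sample_size=60, rng=rng)
final = lloyd(data, init)  # full run started from the subset centroids
```

The subset is cheap to cluster, and its centroids are usually already close to dense regions of the full data, so the full run starts from a better place than raw random points.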