I was asked to prove why having a finite number of site-to-cluster assignments eventually leads to convergence.
In Lloyd's version of K-means, we reduce the distortion measure at every iteration until convergence. Graphically, I picture this as the centroids and their sites becoming more compact until no site changes cluster membership. I understand that K-means reaches a local minimum via an EM-style alternation between cluster assignments and cluster centroids. What I fail to see is how the finite number of site-to-cluster assignments ensures convergence.
Notation: $r_{nk} = 1$ if the $n^{th}$ site is assigned to the $k^{th}$ cluster, and $r_{nk} = 0$ otherwise.
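To make the setup concrete, here is a minimal sketch of the iteration I have in mind. It assumes the distortion measure is the usual sum of squared distances, $J = \sum_n \sum_k r_{nk} \lVert x_n - \mu_k \rVert^2$, which I did not write out above. The names `lloyd`, `sq_dist`, and `history` are my own and purely illustrative. The list `assign` plays the role of $r_{nk}$: `assign[n] == k` corresponds to $r_{nk} = 1$.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def lloyd(sites, k, seed=0, max_iter=100):
    """Illustrative Lloyd iteration; returns assignments, centroids, and J per iteration."""
    rng = random.Random(seed)
    # Assumption: centroids are initialised to k distinct sites chosen at random.
    centroids = [list(p) for p in rng.sample(sites, k)]
    assign = None
    history = []  # distortion J recorded after each update step
    for _ in range(max_iter):
        # Assignment step: each site goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda j: sq_dist(x, centroids[j])) for x in sites
        ]
        # Stop once no site changes cluster membership.
        if new_assign == assign:
            break
        assign = new_assign
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [x for x, a in zip(sites, assign) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
        history.append(sum(sq_dist(x, centroids[a]) for x, a in zip(sites, assign)))
    return assign, centroids, history


if __name__ == "__main__":
    sites = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    assign, centroids, history = lloyd(sites, k=2)
    print(assign)
    print(history)
```

With $N$ sites and $K$ clusters, `assign` can take at most $K^N$ distinct values, which is the finiteness the question refers to. The loop stops exactly when the assignment step reproduces the previous `assign`.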