How can the "silhouette coefficient" find the optimal number of clusters when the dataset is not labelled? Does it need a labelled dataset, or is it purely statistical?
I mean, in unsupervised learning, a labelled dataset is sometimes used to verify that the model is doing well. So I would like to know whether the "silhouette coefficient" relies on a labelled dataset (even though the labels are not given to the clustering algorithm).
Thank you