
(Edited the question after the initial comments.)

Suppose,

Ground_truth_data = [1, 1, 1, 1, 1, 1, 1];

Clustering_result = [1, 1, 1, 1, 1, 1, 2];

Here there are 7 data instances. The ground truth puts all of them in class "1", while the clustering result uses two labels, "1" and "2".

Q1) Looking at this result, what can be said about the performance of my clustering algorithm that produces the "Clustering_result"?

Here is what two widely used external clustering measures (I have seen them in many research papers) say (https://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation):

from sklearn import metrics

ARI = metrics.adjusted_rand_score(Ground_truth_data, Clustering_result)
# ARI = 0.0 (Adjusted Rand Index)

AMI = metrics.adjusted_mutual_info_score(Ground_truth_data, Clustering_result)
# AMI = -1.4018874593092454e-15 (Adjusted Mutual Information)

Q2) Do these results mean my clustering algorithm is performing badly? In my view, no, since 6 out of 7 instances are clustered correctly. (Please correct me if I am wrong.)

Q3) Why do ARI and AMI give values close to zero in the above case?

Q4) In my dataset of 300 different types of data, the maximum possible number of clusters is 8. This dataset also contains many instances where a situation similar to the above example has been observed. In such a case, which performance measures are suitable (and consistent) for evaluating my clustering algorithm?

1 Answer

Please see a textbook on this subject.

The behavior of these measures (ARI, NMI, etc.) is so widely known that it is even discussed on Wikipedia...

Also use the search function, e.g., "Evaluation measures of goodness or validity of clustering (without having truth labels)".

Although that question appears at first sight to be about evaluation without labels, reading its answers will guide you to extrinsic evaluation, too.
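
As a rough illustration of why the ARI in the question comes out as zero, here is a minimal sketch that computes the Adjusted Rand Index by pair counting with only the standard library. The function name `adjusted_rand_index` and the handling of the degenerate case are my own choices for this example, not scikit-learn's implementation, and it assumes at least two data points.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI (illustrative sketch; assumes len(labels_true) >= 2)."""
    n = len(labels_true)

    # Contingency counts: how many points share each (true, predicted) label pair.
    joint = Counter(zip(labels_true, labels_pred))
    true_sizes = Counter(labels_true)
    pred_sizes = Counter(labels_pred)

    # Number of point pairs that are together in both partitions.
    index = sum(comb(c, 2) for c in joint.values())
    # Number of point pairs that are together within each partition.
    sum_true = sum(comb(c, 2) for c in true_sizes.values())
    sum_pred = sum(comb(c, 2) for c in pred_sizes.values())

    # Expected value of the index under random labelling, and its upper bound.
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2

    if max_index == expected:
        # Degenerate case (assumption for this sketch): treat as perfect agreement.
        return 1.0
    return (index - expected) / (max_index - expected)


Ground_truth_data = [1, 1, 1, 1, 1, 1, 1]
Clustering_result = [1, 1, 1, 1, 1, 1, 2]

ari = adjusted_rand_index(Ground_truth_data, Clustering_result)
print(ari)
```

With these inputs the index is 15 pairs, and the chance-expected index is 21 * 15 / 21 = 15 as well, so the numerator is zero. When the ground truth contains a single class, the observed index always equals its expected value, so the adjusted score cannot reward a result that is "almost" one cluster.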