I created dummy variables (binary data) from categorical variables because I want to partition N subjects into multiple classes with some clustering method. I then computed the Jaccard similarity index for every pair of subjects, which gives an N × N similarity matrix.

My question is whether it is OK to apply hierarchical clustering using the Euclidean distance measure on the Jaccard similarity index matrix.

The result looks very good and valid, in fact much better than when I use the Jaccard dissimilarity (1 − Jaccard index) matrix. I want to make sure that I am not creating mathematical nonsense.
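To make the two options concrete, here is a minimal sketch in plain Python. The toy data `X` and the names `jaccard`, `S`, `D_jaccard` and `D_rows` are hypothetical and chosen only for illustration. It builds the Jaccard similarity matrix and then derives the two candidate distance matrices: the Jaccard dissimilarity, and the Euclidean distance between rows of the similarity matrix, which is one reading of "Euclidean distance on the Jaccard matrix".

```python
from math import sqrt

def jaccard(a, b):
    """Jaccard similarity of two equal-length 0/1 vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention assumed here: two all-zero vectors count as identical.
    return both / either if either else 1.0

# Hypothetical toy data: 4 subjects, dummy-coded from two categorical
# variables (columns 0-2 encode a 3-level variable, columns 3-4 a 2-level one).
X = [
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
]
n = len(X)

# N x N Jaccard similarity matrix.
S = [[jaccard(X[i], X[j]) for j in range(n)] for i in range(n)]

# Option A: use the Jaccard dissimilarity directly as the distance matrix.
D_jaccard = [[1.0 - S[i][j] for j in range(n)] for i in range(n)]

# Option B: treat each row of S as a feature vector for that subject and
# compute Euclidean distances between rows.
D_rows = [
    [sqrt(sum((S[i][k] - S[j][k]) ** 2 for k in range(n))) for j in range(n)]
    for i in range(n)
]
```

Either matrix can be handed to a hierarchical clustering routine as a precomputed distance matrix, but they are different quantities: option B compares subjects by their whole similarity profile against all other subjects, not by their direct pairwise overlap.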

ttnphns
dmeu
  • Well, yes. The actual clustering algorithm is agnostic of the metric used to construct the distance/similarity matrix. – Digio Sep 05 '17 at 14:31
  • "Euclidean distance measure on the Jaccard similarity index matrix": this is misty. Jaccard similarity is a proximity measure. Euclidean distance is another proximity measure. Maybe you meant "or", not "on", in that sentence? – ttnphns Sep 05 '17 at 15:33
  • Note that your data are initially nominal. I.e. they are dummy binary, not simply binary. An overview of measures to use with nominal attributes is here https://stats.stackexchange.com/q/55798/3277. – ttnphns Sep 05 '17 at 15:36
  • Hi, thanks @ttnphns. I will adapt the wording, you are right. And no, my question is exactly whether it is valid to do this on the Jaccard index. I will try the Dice algorithm and check the performance. My reasoning is that I create a continuous data set from nominal data (Jaccard, Dice), which can then be used with e.g. Euclidean distance to perform a hierarchical clustering. – dmeu Sep 06 '17 at 07:43
  • Still, "using Euclidean distance measure on the Jaccard similarity index matrix" is not clear. It sounds as if you are going to treat the Jaccard matrix as a dataset and compute Euclidean distances between its rows? – ttnphns Sep 06 '17 at 08:41
  • OK, should I rather just use the Jaccard similarity index as the (dis)similarity metric for the hierarchical clustering? – dmeu Sep 06 '17 at 11:12

0 Answers