I am reading up on distance and dissimilarity measures for my natural language processing class and could not understand this slide. Why does the dissimilarity measure not satisfy item 3? What would an example be?

[Image: slide listing three numbered properties of distance and dissimilarity measures]

Kong
  • To me, it seems point 1 implies point 3. Not sure how you could have a metric that satisfies 1 and not 3. – colorlace May 29 '18 at 18:51
  • These are the three axioms of metricity (https://en.wikipedia.org/wiki/Metric_(mathematics)). #3 is the "triangular inequality" axiom. – ttnphns May 29 '18 at 19:01
  • A note on terminology. Some authors define "distances" as metric dissimilarities. Thence, there are metric dissimilarities (=distances) and nonmetric dissimilarities. Other authors equate "distances" and "dissimilarities" to be synonyms. Thence, for them there are metric and nonmetric dissimilarities (= distances) – ttnphns May 29 '18 at 19:06
  • Thanks guys. Do you know if there is a page online where I can see which distances satisfy or do not satisfy the 3 inequalities? I'm trying to select one for my project. So far I have seen that KL and chi-square do not satisfy the triangle inequality. – Kong May 29 '18 at 20:52

1 Answer

First Question

Why does the dissimilarity measure not satisfy item 3?

The answer is that this is simply how a dissimilarity is defined: it is not required to satisfy item 3.

If a matrix fulfills the first two properties, it defines a dissimilarity measure. If it also fulfills property 3, it defines a distance measure.

Thus, a distance measure is also always a dissimilarity, but not vice versa.

Second Question

What would an example be (of a dissimilarity that is not a distance)?

A counterexample is

$$\begin{bmatrix}0&0.5&0\\0.5&0&0.4\\0&0.4&0\end{bmatrix}$$

Here the entry in row i and column j is the dissimilarity between observation i and observation j (it is not a distance in the metric sense). Between observation 1 and observation 2 it is 0.5, between observation 2 and observation 3 it is 0.4, and between observation 1 and observation 3 it is 0.

The first two axioms hold, so the matrix defines a dissimilarity measure.

The third axiom (the triangle inequality), on the other hand, does not hold:

d(observation 1, observation 2) = 0.5 > 0 + 0.4 = d(observation 1, observation 3) + d(observation 3, observation 2)
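The check above can also be done mechanically. The following is a minimal sketch in plain Python, not part of the original answer. Since the slide itself is not reproduced here, it assumes that axioms 1 and 2 mean non-negative entries with a zero diagonal, and symmetry. The helper names `is_dissimilarity` and `triangle_violations` are made up for illustration.

```python
from itertools import permutations

# Counterexample matrix from the answer; index 0 is observation 1, etc.
D = [
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.4],
    [0.0, 0.4, 0.0],
]

def is_dissimilarity(d):
    """Assumed axioms 1 and 2: non-negative entries, zero diagonal, symmetry."""
    n = len(d)
    return all(
        d[i][i] == 0 and d[i][j] >= 0 and d[i][j] == d[j][i]
        for i in range(n)
        for j in range(n)
    )

def triangle_violations(d, tol=1e-12):
    """Return every (i, j, k) with d[i][j] > d[i][k] + d[k][j]."""
    n = len(d)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if d[i][j] > d[i][k] + d[k][j] + tol
    ]

print(is_dissimilarity(D))     # True
print(triangle_violations(D))  # [(0, 1, 2), (1, 0, 2)]
```

Both reported triples describe the same violation read in the two directions: going from observation 1 to observation 2 via observation 3 is "shorter" (0 + 0.4) than the direct dissimilarity (0.5).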

Note

A dissimilarity matrix of this kind can arise from a numerical dataset, for example when using the correlation dissimilarity measure. This measure is discussed in more detail in this post.
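To make the note concrete, here is a small sketch that assumes the correlation dissimilarity is defined as 1 − r, with r the Pearson correlation (other definitions, such as 1 − |r|, are also in use). The three toy vectors are invented for illustration and are not meant to reproduce the exact matrix above; they only show that this measure can violate the triangle inequality.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

def corr_dissimilarity(a, b):
    """Assumed definition: 1 - Pearson correlation."""
    return 1.0 - pearson(a, b)

# Toy data: x and y are uncorrelated, z = x + y correlates with both.
x = [1.0, -1.0, 0.0, 0.0]
y = [0.0, 0.0, 1.0, -1.0]
z = [1.0, -1.0, 1.0, -1.0]

d_xy = corr_dissimilarity(x, y)  # 1.0
d_xz = corr_dissimilarity(x, z)  # about 0.293
d_zy = corr_dissimilarity(z, y)  # about 0.293

# Triangle inequality would require d_xy <= d_xz + d_zy, which fails here.
print(d_xy > d_xz + d_zy)  # True
```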

Seve97