
I am interested in knowing what really happens in the Hellinger distance (in simple terms). Furthermore, I am also interested in knowing what types of problems the Hellinger distance can be used for, and what the benefits of using it are.

Smith Volka
  • The Hellinger distance is a probabilistic analog of the Euclidean distance. A salient property is that, being a metric, it is symmetric. Such mathematical properties are useful if you are writing a paper and need a distance function with certain properties to make your proof possible. In application, someone might discover that one metric produces nicer or better results than another for a certain task; e.g., the Wasserstein distance is all the rage in generative adversarial networks – Emre Aug 31 '17 at 05:43
  • Thank you for the comment. I came across this question, which is quite similar to the question I have now: https://datascience.stackexchange.com/questions/22324/combine-two-sets-of-clusters Please let me know why the answer says the Hellinger distance is suitable. – Smith Volka Aug 31 '17 at 05:59
  • Probably to visualize the topics in a metric space. Another nice property is that the Hellinger distance is finite for distributions with different support. It is good that you are asking these questions. I suggest trying different metrics for yourself and observing the results. – Emre Aug 31 '17 at 06:06
  • Thanks, it's a good link and helps a lot. But is the Hellinger distance limited to topics derived from Latent Dirichlet Allocation (LDA), as mentioned in the link? – Smith Volka Aug 31 '17 at 06:11
  • No, it has no inherent connection to LDA. – Emre Aug 31 '17 at 06:12
  • Thanks. The points you mentioned helped a lot to understand. – Smith Volka Aug 31 '17 at 06:17
  • @Emre Please let me know if you know an answer for this https://datascience.stackexchange.com/questions/22828/clustering-with-cosine-similarity – Smith Volka Sep 05 '17 at 06:22
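Emre's point about differing support can be made concrete: the KL divergence blows up to infinity when one distribution puts mass where the other has none, while the Hellinger distance stays finite (and bounded by 1). A minimal NumPy sketch illustrating this, using the standard definitions:

```python
import numpy as np

# Two distributions with different support: p puts mass where q has none.
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])

# KL divergence: the p * log(p / q) term diverges wherever q = 0 but p > 0.
with np.errstate(divide="ignore", invalid="ignore"):
    kl = np.sum(np.where(p > 0, p * np.log(p / q), 0.0))

# Hellinger distance stays finite and bounded by 1.
hd = np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2)

print(kl)  # inf
print(hd)  # finite (≈ 0.707)
```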

1 Answer


The Hellinger distance is a metric that measures the difference between two probability distributions. It is the probabilistic analog of the Euclidean distance.

Given two probability distributions, $P$ and $Q$, Hellinger distance is defined as:

$$h(P,Q) = \frac1{\sqrt2}\cdot \|\sqrt{P}-\sqrt{Q}\|_2$$

It is useful for quantifying the difference between two probability distributions. For example, suppose you estimate a feature's distribution separately for users and non-users of a service. If the Hellinger distance between the two groups' distributions is small for some feature, then that feature is not statistically useful for segmenting the groups.
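For discrete distributions, the definition above translates directly to code. A minimal NumPy sketch (the function name `hellinger` is my own):

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions,
    h(P, Q) = (1 / sqrt(2)) * || sqrt(P) - sqrt(Q) ||_2."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2)

# Identical distributions give 0; disjoint supports give the maximum, 1.
print(hellinger([0.5, 0.3, 0.2], [0.5, 0.3, 0.2]))  # 0.0
print(hellinger([1.0, 0.0], [0.0, 1.0]))            # 1.0
```

The 1/√2 factor normalizes the distance into [0, 1], which makes values comparable across feature pairs in the segmentation example above.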

timleathart
Brian Spiering
  • (also for @Emre) Where does the claim that the Hellinger distance is the probabilistic analog of the Euclidean distance come from? Why is that? – carlo May 21 '20 at 16:06