
I am trying to understand the trade-offs between different metrics for evaluating the performance of classification methods on multi-label data.

One option commonly found in the literature is the Hamming loss, defined as the fraction of incorrectly predicted labels out of the total number of labels. Another option is to assess the quality of the probabilistic predictions for each label using, for example, a log-likelihood (log loss) function.
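For concreteness, here is a minimal sketch of how I compute the two metrics (assuming scikit-learn's `hamming_loss` and `log_loss`; the label matrix and predicted probabilities below are made up for illustration):

```python
import numpy as np
from sklearn.metrics import hamming_loss, log_loss

# Toy multi-label data: 4 instances, 3 labels (binary indicator matrix).
Y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])

# Predicted probabilities per label (e.g. from independent per-label classifiers).
P_pred = np.array([[0.9, 0.2, 0.1],
                   [0.3, 0.8, 0.1],
                   [0.7, 0.6, 0.2],
                   [0.1, 0.1, 0.9]])

# Hamming loss: threshold the probabilities, then count the fraction of
# label assignments that are wrong across all instances and labels.
Y_hat = (P_pred >= 0.5).astype(int)
print("Hamming loss:", hamming_loss(Y_true, Y_hat))

# Log loss: evaluate the probabilities themselves, averaged over labels.
per_label_ll = [log_loss(Y_true[:, j], P_pred[:, j], labels=[0, 1])
                for j in range(Y_true.shape[1])]
print("Mean per-label log loss:", np.mean(per_label_ll))
```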

One trade-off is likely to occur when the data are sparse (few positive labels, many zeros): the Hamming loss becomes relatively insensitive to differences between models, since even a trivial classifier that predicts no labels at all already achieves a low Hamming loss (see the sketch below).
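To illustrate what I mean, here is a made-up sketch (1000 instances, 20 labels, roughly 2% positive rate; the "informative" model is an idealized one built from the true labels purely for illustration):

```python
import numpy as np
from sklearn.metrics import hamming_loss, log_loss

rng = np.random.default_rng(0)
n, L, density = 1000, 20, 0.02
Y = (rng.random((n, L)) < density).astype(int)   # sparse label matrix

# Model A: trivially predicts "no label" everywhere, with probability 0.001.
P_a = np.full((n, L), 0.001)
# Model B: idealized informative probabilities (0.7 on true positives).
P_b = np.where(Y == 1, 0.7, 0.05)

for name, P in [("all-negative", P_a), ("informative", P_b)]:
    H = hamming_loss(Y, (P >= 0.5).astype(int))
    ll = np.mean([log_loss(Y[:, j], P[:, j], labels=[0, 1]) for j in range(L)])
    print(f"{name}: Hamming = {H:.4f}, mean log loss = {ll:.4f}")
```

The Hamming loss of the trivial model is already close to zero (roughly the label density), so the gap between the two models looks small on that scale, whereas the log loss penalizes the trivial model's confident misses on the rare positive labels much more strongly.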

Are there other conditions under which one should choose one or the other?

  • I would go with the log-likelihood loss, since it can easily be penalized by the number of parameters of a model to give an Akaike-type criterion. But I wasn't aware of the issue with the Hamming loss and sparse data. – Pohoua Aug 19 '20 at 21:34
  • The Hamming loss is just one minus accuracy, which has major issues per the duplicate. The log loss is a proper scoring rule. – Stephan Kolassa Dec 08 '23 at 23:19

0 Answers