When evaluating how well a neural network performs on a classification task with more than 2 classes (for example, assigning each observation to one of 4 classes), which is the better measure to use: (i) an error-based measure such as the cross-entropy loss, or (ii) the strict accuracy rate?
And what would be the advantage(s) of one measure over the other?
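To make the comparison concrete, here is a minimal sketch of what I mean by the two measures, using NumPy with made-up softmax outputs for a 4-class problem (the numbers are hypothetical, not from a real model):

```python
import numpy as np

# Hypothetical softmax outputs of a network for 5 observations, 4 classes
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.25, 0.30, 0.25, 0.20],
    [0.05, 0.05, 0.80, 0.10],
    [0.30, 0.30, 0.20, 0.20],
])
y_true = np.array([0, 0, 1, 2, 3])  # true class indices

# (i) error-based measure: average cross-entropy loss,
#     i.e. -log of the probability assigned to the true class
cross_entropy = -np.mean(np.log(probs[np.arange(len(y_true)), y_true]))

# (ii) strict accuracy: fraction of observations where the
#      argmax class matches the true class
y_pred = probs.argmax(axis=1)
accuracy = np.mean(y_pred == y_true)

print(f"cross-entropy: {cross_entropy:.3f}")
print(f"accuracy:      {accuracy:.3f}")
```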
The previous posts on this topic only discuss the case of binary classification.
Thank you.
Why is accuracy not the best measure for assessing classification models? argues that error-based measures are better than accuracy, based on the role of the decision threshold. In a multiclass classification task, I don't think there is a strict threshold for choosing the answer, since the predicted class is simply the one with the maximum probability among all $m$ classes. So I think that answer does not strictly apply to my question here. Please correct me if I am wrong. – HDB Oct 13 '20 at 16:33
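To spell out the threshold point from the comment above, here is a minimal sketch (again with hypothetical probabilities) of the difference between the binary and the multiclass decision rules:

```python
import numpy as np

# Binary case: the predicted label depends on an explicit threshold,
# which is what the linked answer's argument revolves around.
p_positive = np.array([0.35, 0.55, 0.80])
threshold = 0.5                          # tunable; changing it changes the labels
binary_pred = (p_positive > threshold).astype(int)

# Multiclass case: the label is just the argmax over the m classes,
# so there is no single threshold to tune.
probs = np.array([
    [0.25, 0.30, 0.25, 0.20],
    [0.10, 0.10, 0.70, 0.10],
])
multiclass_pred = probs.argmax(axis=1)

print(binary_pred)      # [0 1 1]
print(multiclass_pred)  # [1 2]
```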