
Given the following table of predictions vs. actual states:

    A     B     C     D
A  11     9     4     3
B   2   120     3     6
C   1     1     9     3
D   0     0     0     0

You can see that none of the actual states are D. I wish to calculate some confusion table metrics (I can handle the multi-level issue), but the 0 cells for D mean that I'll have some undefined outcomes. Is it valid to add 0.5 to every cell in the table then calculate the metrics? If so, does this have a formal name?

Glen_b
Bryan
  • The person's name is Yates, so the apostrophe conventionally follows the "s", rather than precedes it. – Glen_b Jul 06 '23 at 16:19
  • So, you have nothing substantive to add. – Bryan Jul 06 '23 at 16:23
  • Okay, I'll fix your question for you. Since people use the search facility to find answers, names - especially in titles - matter. But keep shooting barbs instead of improving your own question if that helps you get through the day. – Glen_b Jul 06 '23 at 23:12

1 Answer


I find the idea to be problematic, particularly in your situation.

You have some rather small numbers in your table, such as $1$. By adding $0.5$ to every value, you increase the size of that value by a considerable amount: $1$ to $1.5$ represents a $50\%$ increase. This will distort your downstream calculations, perhaps quite dramatically.
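As a rough illustration of that distortion, here is a minimal sketch that computes per-class recall and precision on the table from the question, with and without the added $0.5$. It assumes rows are actual states and columns are predictions (inferred from the all-zero D row); the helper names `recall`, `precision`, and `fmt` are made up for this example.

```python
# Confusion table from the question.
# Assumption: rows = actual states, columns = predicted states,
# which matches the statement that no actual state is D.
labels = ["A", "B", "C", "D"]
table = [
    [11, 9, 4, 3],
    [2, 120, 3, 6],
    [1, 1, 9, 3],
    [0, 0, 0, 0],
]

def recall(tab, k):
    """Diagonal cell over its row total; None when the row total is 0."""
    row_total = sum(tab[k])
    return tab[k][k] / row_total if row_total else None

def precision(tab, k):
    """Diagonal cell over its column total; None when the column total is 0."""
    col_total = sum(row[k] for row in tab)
    return tab[k][k] / col_total if col_total else None

def fmt(x):
    """Format a metric, marking undefined (0/0) values explicitly."""
    return "undefined" if x is None else f"{x:.3f}"

# The adjustment asked about: add 0.5 to every cell.
smoothed = [[cell + 0.5 for cell in row] for row in table]

for k, name in enumerate(labels):
    print(
        name,
        "recall:", fmt(recall(table, k)), "->", fmt(recall(smoothed, k)),
        "| precision:", fmt(precision(table, k)), "->", fmt(precision(smoothed, k)),
    )
```

Under that row/column assumption, recall for C moves from $9/14 \approx 0.643$ to $9.5/16 \approx 0.594$, a shift produced by the added constant alone.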

Further, if you do the calculation with the zero-cells as $0.5$, you are giving incorrect values. The fact is that category $D$ never occurred among the actual states. If you start reporting that some percent of the items that actually were category $D$ were predicted as category $D$, I do not see how such reporting reflects the reality of your results.
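To make the artifact concrete, assume rows are actual states and columns are predictions, as the all-zero D row suggests. After adding $0.5$ to each cell, the diagonal share of the D row becomes

$$\frac{0 + 0.5}{4 \times 0.5} = 0.25,$$

which is simply $1/K$ for a $K$-class table. That number is fixed by the added constant and the number of classes, not by anything in the data.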

Dave
  • It could also be worth considering whether you should be evaluating metrics derived from the confusion matrix at all; they are more problematic than you might think. – Dave Jul 06 '23 at 23:29