I have several confusion matrices, all from binary classification (negative, positive). I would like to compute overall scores for all the matrices combined. The problem is that the data is not balanced at all. For example, let's look at the following 2 sets of values:
1. Number of positive examples: 2, number of negative examples: 298
----------+---------
| TN:290 | FP:8 |
--------------------
| FN:1 | TP:1 |
----------+---------
2. Number of positive examples: 46, number of negative examples: 254
----------+----------
| TN:233 | FP:21 |
---------------------
| FN:20 | TP:26 |
----------+----------
So simply averaging per-matrix metrics such as precision and recall will not be a good representation, since the matrix with only 2 positive examples would carry the same weight as the one with 46. You can think of it as a one-vs-all problem, where the "one" class changes each time and the results are aggregated to get an overall performance figure that takes the sample distribution into account.
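To make the concern concrete, here is a minimal Python sketch that compares a plain (macro) average of per-matrix precision and recall with a pooled (micro) average, where the TN/FP/FN/TP counts are summed across matrices before the metrics are computed. Pooling is only one possible way to account for the sample distribution, not necessarily the right one for this problem. The function names and the `(TN, FP, FN, TP)` tuple layout are my own assumptions for illustration.

```python
# Each matrix is (TN, FP, FN, TP); values are the two examples above.
matrices = [(290, 8, 1, 1), (233, 21, 20, 26)]


def precision_recall(tn, fp, fn, tp):
    """Precision and recall for one confusion matrix (0.0 if undefined)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def macro_average(mats):
    """Average the per-matrix metrics; every matrix gets equal weight."""
    scores = [precision_recall(*m) for m in mats]
    n = len(scores)
    return sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n


def micro_average(mats):
    """Sum the counts across matrices first, then compute the metrics once."""
    tn, fp, fn, tp = (sum(column) for column in zip(*mats))
    return precision_recall(tn, fp, fn, tp)


macro_p, macro_r = macro_average(matrices)
micro_p, micro_r = micro_average(matrices)
print(f"macro: precision={macro_p:.3f} recall={macro_r:.3f}")
print(f"micro: precision={micro_p:.3f} recall={micro_r:.3f}")
```

With the two matrices above, the macro precision is pulled down by the first matrix (precision 1/9 from only 2 positive examples), while the pooled precision is 27/56, dominated by the matrix that has more positives.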
By frequency you mean the rate of positive samples? – M.F Jan 11 '22 at 13:20