I have a highly imbalanced dataset and want to evaluate the performance of a score. Is it better to balance the data and calculate the area under the receiver operating characteristic (ROC) curve, or to use the imbalanced data and calculate the area under the precision-recall curve?
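To make the two options concrete, here is a minimal sketch of the comparison. Everything in it is an assumption for illustration: the scores are synthetic Gaussian draws, "balancing" means oversampling the positives by replicating each one, and `roc_auc` and `average_precision` are hypothetical plain-Python helpers (not a library API) that follow the usual rank-based and step-wise definitions.

```python
import random

def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Tied scores are treated as a single threshold.
    """
    n_pos = sum(labels)
    by_score = {}  # score -> [negatives at this score, positives at this score]
    for y, s in zip(labels, scores):
        by_score.setdefault(s, [0, 0])[y] += 1
    tp = fp = 0
    ap = 0.0
    for s in sorted(by_score, reverse=True):
        neg_here, pos_here = by_score[s]
        tp += pos_here
        fp += neg_here
        # recall increment at this threshold times precision at this threshold
        ap += (pos_here / n_pos) * (tp / (tp + fp))
    return ap

# Synthetic, assumed data: 500 negatives and 10 positives (2% prevalence).
random.seed(0)
neg_scores = [random.gauss(0.0, 1.0) for _ in range(500)]
pos_scores = [random.gauss(1.0, 1.0) for _ in range(10)]

# Option 1: keep the data imbalanced.
labels_imb = [0] * len(neg_scores) + [1] * len(pos_scores)
scores_imb = neg_scores + pos_scores

# Option 2: balance by replicating every positive 50 times (oversampling).
pos_balanced = pos_scores * 50
labels_bal = [0] * len(neg_scores) + [1] * len(pos_balanced)
scores_bal = neg_scores + pos_balanced

auc_imb = roc_auc(labels_imb, scores_imb)
auc_bal = roc_auc(labels_bal, scores_bal)
ap_imb = average_precision(labels_imb, scores_imb)
ap_bal = average_precision(labels_bal, scores_bal)

print(f"ROC AUC  imbalanced={auc_imb:.3f}  balanced={auc_bal:.3f}")
print(f"PR  AUC  imbalanced={ap_imb:.3f}  balanced={ap_bal:.3f}")
```

In this setup the ROC area comes out the same either way by construction, because replicating positives leaves the score distribution within each class unchanged, whereas the precision-recall area moves with the class ratio. If the balancing were done by randomly undersampling negatives instead, the ROC area would be unchanged only in expectation, not exactly.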
See https://stats.stackexchange.com/questions/90779/area-under-the-roc-curve-or-area-under-the-pr-curve-for-imbalanced-data – Henry May 21 '22 at 11:29