I am testing the accuracy of discrete variable prediction (>= 2 possible outcomes). I've seen things like using a confusion matrix or ROC curve for binary outcomes, but not much for > 2 outcome variables.

What are good measures of accuracy for discrete variables other than classification accuracy?

sma
2 Answers

Two of the accuracy measures I always use are the kappa statistic and the no-information rate.

If you are using R, both are available in the caret package via the confusionMatrix() function.

Other per-class measures include the true positive rate and the false negative rate.
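If you happen to be working in Python rather than R, here is a minimal sketch of the same two measures (the toy labels are made up; kappa comes from scikit-learn, and the no-information rate is just the largest class frequency):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# Toy 3-class example: true labels and discrete predictions
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 2, 2, 2, 0])

# Kappa: observed agreement corrected for chance agreement
kappa = cohen_kappa_score(y_true, y_pred)

# No-information rate: accuracy of always guessing the majority class
nir = np.bincount(y_true).max() / len(y_true)

accuracy = (y_true == y_pred).mean()
print(confusion_matrix(y_true, y_pred))
print(f"accuracy = {accuracy:.2f}, NIR = {nir:.2f}, kappa = {kappa:.3f}")
```

An accuracy that barely beats the NIR, or a kappa near zero, tells you the classifier is doing little better than guessing the majority class.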

Hope this helps

Not_Dave

One of the most important considerations is whether you truly have discrete classification predictions. That you are analyzing an ROC curve for a binary outcome suggests that you do not have discrete predictions but something on a continuum, such as a predicted probability or log-odds (as would come from a logistic regression). Especially if the predictions have a reasonable interpretation as probabilities (often but not always the case), this opens up a world of proper scoring rules that evaluate the predicted probabilities directly. Among their advantages is that they allow for more nuanced decision-making. Frank Harrell discusses these extensively in two great blog posts and throughout his posts on Cross Validated.

Damage Caused by Classification Accuracy and Other Discontinuous Improper Accuracy Scoring Rules

Classification vs. Prediction

Among the common evaluation metrics for these kinds of predictions are the Brier score and log-loss. These can be normalized to Efron's pseudo $R^2$ (for the Brier score) and McFadden's pseudo $R^2$ (for log-loss) to give possibly easier interpretations that are $1$ for perfect predictions and less than $1$ for imperfect predictions, analogous to $R^2$ in regression.

The equations in the multi-class setting are messier, but they generalize from the binary case about the way you would expect.
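As an illustration (not Harrell's code, and with made-up toy probabilities), here is one way to compute the multi-class Brier score, log-loss, and McFadden's pseudo $R^2$, where the null model always predicts the observed class frequencies:

```python
import numpy as np

def multiclass_scores(y_true, probs):
    """Brier score, log-loss, and McFadden pseudo-R^2 for predicted probabilities."""
    n, k = probs.shape
    onehot = np.eye(k)[y_true]  # one-hot encode the true labels

    # Multi-class Brier score: mean squared distance to the one-hot truth
    brier = np.mean(np.sum((probs - onehot) ** 2, axis=1))

    # Log-loss: mean negative log probability assigned to the true class
    logloss = -np.mean(np.log(probs[np.arange(n), y_true]))

    # Null model: always predict the observed class frequencies
    freqs = np.bincount(y_true, minlength=k) / n
    logloss_null = -np.mean(np.log(freqs[y_true]))

    # McFadden's pseudo-R^2: 1 for perfect predictions, 0 for the null model
    mcfadden = 1 - logloss / logloss_null
    return brier, logloss, mcfadden

# Made-up 3-class predicted probabilities for six cases
y_true = np.array([0, 0, 1, 1, 2, 2])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.5, 0.3, 0.2],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.3, 0.3, 0.4]])
brier, logloss, mcfadden = multiclass_scores(y_true, probs)
print(f"Brier = {brier:.3f}, log-loss = {logloss:.3f}, McFadden R^2 = {mcfadden:.3f}")
```

Note that both scores are computed from the full probability vectors, never from hardened class labels, which is exactly what makes them proper scoring rules.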

Dave