This is a clearer version of this question.

I have two different cases: one where I have just the final accuracy, and another with the four error types split out:

Case 1 (no error types):

Dataset size: 78,000
Classifier 1: 51% accuracy
Classifier 2: 64% accuracy

Case 2 (with error types):

Dataset size: 78,000
Classifier 1 True Positive:  21%
Classifier 1 True Negative:  30%
Classifier 1 False Positive: 25%
Classifier 1 False Negative: 24%

Classifier 2 True Positive:  15%
Classifier 2 True Negative:  49%
Classifier 2 False Positive: 31%
Classifier 2 False Negative: 5%
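As a sanity check, the Case 2 percentages can be turned into counts and compared with the Case 1 accuracies, since accuracy is just (TP + TN) / N. This is a minimal sketch; the names `rates` and `summarize` are made up for illustration, and it assumes the percentages are fractions of the full 78,000 examples.

```python
# Dataset size and confusion-matrix rates as given in Case 2.
N = 78_000
rates = {
    "Classifier 1": {"TP": 0.21, "TN": 0.30, "FP": 0.25, "FN": 0.24},
    "Classifier 2": {"TP": 0.15, "TN": 0.49, "FP": 0.31, "FN": 0.05},
}


def summarize(r, n):
    """Convert rates to integer counts and compute accuracy = (TP + TN) / n."""
    counts = {k: round(v * n) for k, v in r.items()}
    accuracy = (counts["TP"] + counts["TN"]) / n
    return counts, accuracy


for name, r in rates.items():
    counts, accuracy = summarize(r, N)
    print(name, counts, f"accuracy={accuracy:.2f}")
```

The resulting accuracies (0.51 and 0.64) match Case 1, so the two cases describe the same results at different levels of detail.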

In each of these cases, what statistical tests are available to determine whether the differences between the models are significant?

For each of these tests:

  1. What are the assumptions of the test?
  2. What extra information do I need to perform the test?
  3. What is the formula for performing the test?
  4. How should I interpret the results of the test?
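To make the question concrete, here is a minimal sketch of one candidate test for Case 1: a pooled two-proportion z-test on the two accuracies. It assumes the two classifiers were scored on independent samples of 78,000 examples each, which may not hold if both were run on the same dataset; in that paired setting a test such as McNemar's would need per-example agreement counts that neither case provides. The function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(p1, p2, n1, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumes the two samples are independent (an assumption, not given
    in the question).
    """
    # Pooled estimate of the common proportion under the null hypothesis.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference in proportions under the null.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p_value = two_proportion_z_test(0.51, 0.64, 78_000, 78_000)
print(f"z = {z:.2f}, p = {p_value:.3g}")
```

Under these assumptions, a small p-value would indicate that a gap this large between the accuracies is unlikely if the two classifiers had the same true accuracy.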
Pro Q