I constructed a classification tree and want to validate its out-of-sample performance. I read that the accuracy (or the balanced accuracy) must be at least higher than the no-information rate. By the no-information rate I mean the accuracy of a model that always predicts the most frequent class in the data set. But which values of the accuracy are acceptable? I mean, values over 0.7 or 0.8? Is there a rule of thumb or something like that? I couldn't find a paper with such a scale.
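For concreteness, here is a minimal sketch of the comparison described above, assuming a scikit-learn workflow; the data set and the tree settings are made up purely for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Hypothetical imbalanced data set, only for illustration.
X, y = make_classification(n_samples=1000, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
y_pred = tree.predict(X_test)

# No-information rate: accuracy of always predicting the majority class
# seen in the training data.
majority_class = np.bincount(y_train).argmax()
nir = np.mean(y_test == majority_class)

print(f"Accuracy:            {accuracy_score(y_test, y_pred):.3f}")
print(f"Balanced accuracy:   {balanced_accuracy_score(y_test, y_pred):.3f}")
print(f"No-information rate: {nir:.3f}")
```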
- Possible duplicate of How to know that your machine learning problem is hopeless? – Stephan Kolassa Apr 25 '19 at 11:29
- Do not use accuracy to evaluate a classifier: Why is accuracy not the best measure for assessing classification models? and Is accuracy an improper scoring rule in a binary classification setting? – Stephan Kolassa Apr 25 '19 at 11:29
- @StephanKolassa: I do not think the issue mentioned would be resolved even if the OP used a proper scoring rule like CRPS. I think the question is really "how long does my piece of string need to be?" – usεr11852 Apr 25 '19 at 11:40
- @usεr11852: I think the proposed duplicate answers the question about "how good is good enough" (in exactly the way your answer does, which I upvoted). In addition, I point out that accuracy and similar metrics are not good KPIs in the first place. – Stephan Kolassa Apr 25 '19 at 11:45
- @StephanKolassa: I misinterpreted your comment then! Thank you for clarifying! :) – usεr11852 Apr 25 '19 at 12:07
1 Answer
There is no definite threshold for what counts as a "good" number, because such a threshold would be application specific. If the state of the art is 55% accuracy and we get 56%, we are doing great. If the state of the art is 99% accuracy and we get 98.9%, we are not doing that great (but maybe we are faster, less memory-hungry, etc.).
usεr11852