When building or evaluating a predictive model, we know that a ROC curve can be useful for identifying the optimal cutoff (threshold, or decision point) in a classification problem with a dichotomous response variable and imbalanced outcome classes. We simply find the point on the ROC curve that is closest in distance to the coordinate (0,1), which represents a perfect model with 100 percent sensitivity and 100 percent specificity. This approach has worked quite well for me when the true positive rate and the true negative rate are equally important.
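To make the selection rule concrete, here is a minimal sketch of the closest-to-(0,1) criterion. The toy labels and scores are made up for illustration and are not my actual data; the function names `roc_points` and `closest_to_top_left` are hypothetical helpers, not part of any library.

```python
import math

def roc_points(y_true, scores):
    """Return (cutoff, fpr, tpr) for each distinct score used as a cutoff.

    An observation is classified as an event when score >= cutoff.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= cutoff and y == 0)
        points.append((cutoff, fp / n_neg, tp / n_pos))
    return points

def closest_to_top_left(points):
    """Pick the cutoff whose ROC point (fpr, tpr) is nearest to (0, 1)."""
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))

# Toy data, made up for illustration only.
y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]

cutoff, fpr, tpr = closest_to_top_left(roc_points(y_true, scores))
print(cutoff, fpr, tpr)
```

Euclidean distance weights a lost unit of sensitivity and a lost unit of specificity equally, which is why this rule suits the case where both rates matter equally.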
My question is a theoretical one. A gain curve takes a different approach to portraying or describing a model's performance. The gain curve for my model is shown in blue in the following plot:
The blue point on the gain curve corresponds to the optimal cutoff of 0.254, as determined by distance to (0,1) on the ROC curve. The gain curve tells us that when we use 0.254 as the cutoff and test observations in order of estimated event probability (highest to lowest, represented by the red curve), the model finds about 83 percent of events after testing only about 34 percent of observations.
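For reference, this is roughly how I understand a single gain-curve point to be computed for a given cutoff. It is a sketch on made-up toy data, and `gain_point` is a hypothetical helper name: the horizontal coordinate is the fraction of observations with a score at or above the cutoff, and the vertical coordinate is the fraction of all events captured among them.

```python
def gain_point(y_true, scores, cutoff):
    """Return (fraction of observations tested, fraction of events found).

    Observations are 'tested' when their score is at or above the cutoff,
    which is equivalent to testing in descending order of score and
    stopping at the cutoff.
    """
    flagged = [y for y, s in zip(y_true, scores) if s >= cutoff]
    frac_tested = len(flagged) / len(y_true)
    frac_events_found = sum(flagged) / sum(y_true)
    return frac_tested, frac_events_found

# Toy data, made up for illustration only.
y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]

x, y = gain_point(y_true, scores, cutoff=0.6)
print(x, y)
```

Note that the vertical coordinate is the same quantity as the true positive rate on the ROC curve, while the horizontal coordinate counts true and false positives together.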
Had my model been perfect, its gain curve would have extended in a straight line from (0,0) to the gray point at the upper-left vertex of the light gray triangle, located at (0.247, 1). The event prevalence is 0.247. That is to say, a perfect model would have found all 25,655 events after testing the first 25,655 observations, resulting in 100 percent sensitivity and specificity. The model's estimated event probabilities would have matched actual event probabilities.
Finally, my question: If what I have stated above is correct, then shouldn't the blue point (corresponding to the optimal cutoff according to the ROC curve) represent the point on the gain curve closest in distance to the gray point (which represents the termination point of a gain curve for a perfect model)? It does not, if my calculations are correct. And it's not even close. I'm left with the feeling that I must be mixing apples and oranges, but can't quite get my head around it.
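In case it helps anyone check my calculations, here is a sketch of how the two distances can be computed from the same rates. It relies on the identity fraction tested = (TP + FP) / N = prevalence × TPR + (1 − prevalence) × FPR. The prevalence, the 83 percent, and the 34 percent are the approximate figures quoted above; the false positive rate is not a number I reported, it is only inferred here from that identity, and the function names are hypothetical.

```python
import math

def gain_x(tpr, fpr, prevalence):
    """Fraction of observations tested: (TP + FP) / N."""
    return prevalence * tpr + (1.0 - prevalence) * fpr

def roc_distance(tpr, fpr):
    """Distance from the ROC point (fpr, tpr) to (0, 1)."""
    return math.hypot(fpr, 1.0 - tpr)

def gain_distance(tpr, fpr, prevalence):
    """Distance from the gain point to the perfect-model point (prevalence, 1)."""
    return math.hypot(gain_x(tpr, fpr, prevalence) - prevalence, 1.0 - tpr)

prevalence = 0.247   # event prevalence, from the question
tpr = 0.83           # approximate fraction of events found at cutoff 0.254
frac_tested = 0.34   # approximate fraction of observations tested

# FPR implied by the identity above (inferred, not reported in the question).
fpr = (frac_tested - prevalence * tpr) / (1.0 - prevalence)

print(roc_distance(tpr, fpr), gain_distance(tpr, fpr, prevalence))
```

Evaluating both functions over the full set of cutoffs would show directly which cutoff minimizes each distance.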
