ROC-based metrics are widely used for binary classification problems, and AUC is usually the one chosen to evaluate a model.
However, some tasks require high specificity at a fixed sensitivity. That is, you move the cutoff to the strictest threshold whose sensitivity is still at least the pre-specified value (the smallest sensitivity that clears the target), and then compare algorithms on the resulting specificity. The problem with specificity as an evaluation metric is its high variability: because many samples can share exactly the same predicted score near the fixed-sensitivity cutoff, the achieved sensitivity can overshoot the target and the calculated specificity may be underestimated.
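To make the procedure concrete, here is a minimal sketch of the metric I mean, assuming scikit-learn is available; the names `specificity_at_sensitivity`, `y_true`, `y_score`, and the 0.90 target are illustrative, not from any particular library.

```python
import numpy as np
from sklearn.metrics import roc_curve

def specificity_at_sensitivity(y_true, y_score, target_sensitivity=0.90):
    """Specificity at the strictest cutoff whose sensitivity >= target."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    # tpr is nondecreasing as the threshold is lowered, so the first point
    # reaching the target corresponds to the strictest qualifying cutoff.
    idx = np.argmax(tpr >= target_sensitivity)
    return 1.0 - fpr[idx]

# Coarse, heavily tied scores illustrate how the estimate can jump around.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_score = np.round(rng.random(200), 1)   # many ties near any cutoff
print(specificity_at_sensitivity(y_true, y_score, 0.90))
```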
Is there any way to make specificity a more stable evaluation metric?
Is it possible to design a loss function that achieves high specificity at a targeted sensitivity?