Hi, I'm quite new to statistics and have been tasked with evaluating whether there is a difference in accuracy between two subpopulations in a logistic model.

The credit scoring company's model calculates a credit risk score (0-1, or 0-100%) for the risk of an individual defaulting, and the company then sends this score to the banks, which decide whether or not to hand out a loan. A year later we get back information on whether each individual who had a loan approved has received a payment remark (binary, 1 or 0). (We have no way of seeing at which risk scores the banks decide to give out loans.)

Since we have a continuous predicted score and a binary outcome variable, I've found calculating the actual accuracy impossible and am now turning to proper scoring rules to determine which model has the most "loss" in prediction compared to the outcome.

I've found the Brier score quite interesting for my case, but there are so many scoring rules, and few places document when it's best to use which one.
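To make the comparison concrete, here is a minimal sketch of how the Brier score and log loss could be computed per subpopulation, without choosing any cutoff. The toy data, the group names, and the helper functions `brier_score` and `log_loss` are made up for illustration and are not taken from our actual model or data.

```python
import math

def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; eps keeps log() away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)

# Hypothetical toy data: (risk score, payment remark) for each subpopulation.
group_a = [(0.02, 0), (0.10, 0), (0.30, 1), (0.05, 0)]
group_b = [(0.02, 1), (0.10, 0), (0.30, 0), (0.05, 0)]

for name, rows in [("A", group_a), ("B", group_b)]:
    probs = [p for p, _ in rows]
    outcomes = [y for _, y in rows]
    print(name,
          round(brier_score(probs, outcomes), 4),
          round(log_loss(probs, outcomes), 4))
```

Both scores are lower-is-better. In this made-up data the single default at a 2% score in group B is penalized much more heavily by log loss than by the Brier score, which is the kind of difference between rules I am trying to understand.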

I would love an explanation, a suggestion of an alternative approach, or a link to a place where this is discussed.

Thank you for your time and consideration.

  • Does this answer your question? "Logarithmic loss vs Brier score vs AUC score". The trick is to recognize that the choice of probability cutoff (after modeling) depends on relative costs of false-positive and false-negative class assignments. If you start with an estimate of those relative costs, you can in principle choose a scoring rule that emphasizes probabilities near the corresponding cutoff. – EdM Mar 15 '22 at 17:26
  • Thanks for the comment. Sadly I cannot choose a cutoff point, given that the credit score is supplied to thousands of different banks. One bank might give out loans to risk scores below 2%, another below 10%. So no matter where I decide to put a cutoff point for my model, it will not reflect reality, since each bank has its own cutoff point... – user18417954 Mar 16 '22 at 08:25
  • Hence why I got interested in the Brier score, since it does not require a cutoff point / threshold. But to my knowledge, other scoring rules do not require a cutoff point either. Thus I have a hard time deciding which one to use in which situation. – user18417954 Mar 16 '22 at 08:32
  • As the linked duplicate thread indicates, the differences have to do with how much weight you put on different parts of the probability scale. As the answer from Dikran Marsupial there says, log loss puts a lot of weight on errors near the extremes. The Brier score doesn't, and has the advantage of being easy to explain as the binary equivalent of mean-square error. – EdM Mar 16 '22 at 14:45
  • I'm more concerned about whether your assigned task is feasible regardless of your scoring choice. Say you had a perfectly calibrated default-prediction model. Then a bank that only gave out loans to those with less than a 1% default risk would seem necessarily to have a better score (of any type) than a bank that gave out loans to those with up to a 50% default risk. I'd recommend that you ask a new question focusing on the "subpopulations" you've been asked to compare, with enough detail to help evaluate whether such a comparison makes any sense without information about probability cutoffs by bank. – EdM Mar 16 '22 at 14:51
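
The concern in the last comment can be illustrated with a rough simulation sketch. Everything here is an assumption made for illustration: the uniform distribution of risk scores, the two cutoffs (1% and 50%), and the helper `simulate_bank`. The model is perfectly calibrated by construction, because each outcome is drawn with exactly the predicted probability.

```python
import random

def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def simulate_bank(cutoff, n_applicants=200_000, seed=0):
    """Brier score on the loans approved by a bank with the given cutoff."""
    rng = random.Random(seed)
    probs, outcomes = [], []
    for _ in range(n_applicants):
        p = rng.uniform(0.0, 0.6)   # assumed score distribution (hypothetical)
        if p < cutoff:              # only approved loans ever get an outcome
            probs.append(p)
            # Perfect calibration: default occurs with probability exactly p.
            outcomes.append(1 if rng.random() < p else 0)
    return brier_score(probs, outcomes)

strict = simulate_bank(cutoff=0.01)    # approves only scores below 1%
lenient = simulate_bank(cutoff=0.50)   # approves scores below 50%
print(f"strict bank Brier score:  {strict:.4f}")
print(f"lenient bank Brier score: {lenient:.4f}")
```

Under these assumptions the strict bank gets the lower (better) Brier score even though the same calibrated model scored both sets of applicants, so a difference in score between groups can reflect who was approved rather than how accurate the model is.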