
My logit model produces predicted probabilities that range from 0.003 to 0.49. My dependent variable is default, which is 1 if the loan has defaulted and 0 otherwise. Defaults are the minority outcome in my data, which is probably why the predicted probabilities are so low.

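However, this means it doesn't make sense for me to create a classification table with a cutoff point of 0.5: since the largest predicted probability is 0.49, no loan would ever be classified as a default. What would be my next best option for determining the percentage of concordant observations?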
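One cutoff-free way to get at this is the percentage of concordant pairs: over all (defaulted, repaid) pairs of loans, the share in which the defaulted loan received the higher predicted probability. The sketch below is only an illustration of that definition. The function name `percent_concordant` and the toy arrays are made up for the example and are not taken from my data.

```python
import numpy as np

def percent_concordant(y_true, p_hat):
    """Return (% concordant, % discordant, % tied) over all event/non-event pairs."""
    y_true = np.asarray(y_true)
    p_hat = np.asarray(p_hat, dtype=float)
    p_event = p_hat[y_true == 1]      # predicted probabilities of defaulted loans
    p_nonevent = p_hat[y_true == 0]   # predicted probabilities of repaid loans
    # Pairwise differences: one row per defaulted loan, one column per repaid loan.
    diff = p_event[:, None] - p_nonevent[None, :]
    n_pairs = diff.size
    concordant = 100.0 * np.sum(diff > 0) / n_pairs
    discordant = 100.0 * np.sum(diff < 0) / n_pairs
    tied = 100.0 * np.sum(diff == 0) / n_pairs
    return concordant, discordant, tied

# Hypothetical toy data, for illustration only.
y_toy = [1, 0, 1, 0]
p_toy = [0.40, 0.10, 0.05, 0.20]
print(percent_concordant(y_toy, p_toy))
```

The pairwise matrix needs memory proportional to (number of defaults) × (number of repaid loans), so for a large loan book a rank-based computation would be the more practical route.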

  • What is your loss? Depending on this, it might still make sense to set your cutoff at .5 – Taylor Apr 16 '17 at 22:11
  • Sorry, could you expand on what you mean by loss? – Robert Haffe Apr 16 '17 at 22:22
  • https://en.wikipedia.org/wiki/Loss_function For instance if the penalty of a false positive is nothing, go ahead and lower the cut-off. But if the penalty is high, then maybe keep it where it is – Taylor Apr 16 '17 at 22:32
  • also this https://en.wikipedia.org/wiki/Loss_functions_for_classification – Taylor Apr 16 '17 at 22:34
  • So in my classification table 15.12% were predicted to be repaid, however they defaulted. 0% of the repaid loans were predicted to default (based on the classification table with cutoff .5) – Robert Haffe Apr 16 '17 at 22:40
  • 84.88% of those that repaid were predicted to repay. 0% of those that defaulted were predicted to default – Robert Haffe Apr 16 '17 at 22:42
  • possible duplicate https://stats.stackexchange.com/questions/25389/obtaining-predicted-values-y-1-or-0-from-a-logistic-regression-model-fit/25398#25398 – Taylor Apr 16 '17 at 22:51
  • So it's "good" for specificity (%) to equal sensitivity (%)? At a cutoff point of 0.175, they are almost equal, yielding 65% correctly classified. Would this be the best way of going about it? (Just confirming I've interpreted the post correctly) – Robert Haffe Apr 16 '17 at 23:10
  • Or actually, would I want sensitivity to be higher? Since defaulted loans are more damaging than the benefit from repaid loans. – Robert Haffe Apr 16 '17 at 23:12
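
The two ideas discussed in the comments above (balancing sensitivity against specificity, and weighting a missed default more heavily than a false alarm) can be sketched as a simple sweep over candidate cutoffs. This is a minimal illustration, not a recommendation: the function names, the toy data, and the 5:1 cost ratio are all assumptions made up for the example.

```python
import numpy as np

def sens_spec(y_true, p_hat, cutoff):
    """Sensitivity and specificity when 'default' is predicted for p_hat >= cutoff."""
    y_true = np.asarray(y_true)
    pred = np.asarray(p_hat) >= cutoff
    tp = np.sum(pred & (y_true == 1))
    fn = np.sum(~pred & (y_true == 1))
    tn = np.sum(~pred & (y_true == 0))
    fp = np.sum(pred & (y_true == 0))
    return tp / (tp + fn), tn / (tn + fp)

def balanced_cutoff(y_true, p_hat, cutoffs):
    """Cutoff at which sensitivity and specificity are closest to each other."""
    gaps = []
    for c in cutoffs:
        sens, spec = sens_spec(y_true, p_hat, c)
        gaps.append(abs(sens - spec))
    return cutoffs[int(np.argmin(gaps))]

def min_cost_cutoff(y_true, p_hat, cutoffs, cost_fn, cost_fp):
    """Cutoff minimising cost_fn * (missed defaults) + cost_fp * (false alarms)."""
    y_true = np.asarray(y_true)
    p_hat = np.asarray(p_hat)
    costs = []
    for c in cutoffs:
        pred = p_hat >= c
        fn = np.sum(~pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        costs.append(cost_fn * fn + cost_fp * fp)
    return cutoffs[int(np.argmin(costs))]

# Hypothetical toy data (3 defaults out of 10 loans), for illustration only.
y_toy = np.array([0, 0, 0, 0, 0, 0, 1, 0, 1, 1])
p_toy = np.array([0.02, 0.04, 0.07, 0.09, 0.12, 0.22, 0.16, 0.31, 0.27, 0.46])
grid = np.linspace(0.05, 0.50, 10)

print(sens_spec(y_toy, p_toy, 0.5))
print(balanced_cutoff(y_toy, p_toy, grid))
# Assumed cost ratio: a missed default is treated as 5x as costly as a false alarm.
print(min_cost_cutoff(y_toy, p_toy, grid, cost_fn=5, cost_fp=1))
```

With a cutoff of 0.5 and predictions that never exceed 0.49, every loan is predicted to repay, so sensitivity is 0 and specificity is 1, which is the pattern in the classification table described above. Raising `cost_fn` relative to `cost_fp` can only move the cost-minimising cutoff down or leave it where it is. The cutoff should ideally be chosen on data not used to fit the model.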

0 Answers