
I have already read this post, this post, and this post. Please don't mark this as a duplicate.

I am working on a binary classification problem (loan default or not) using logistic regression.

I am required to predict the outcome class for our unseen records and also to report the confidence of the predictions.

I am a novice data scientist, so I am not sure what the difference between likelihood and confidence is.

If I run logistic regression, I get a likelihood measure, such as a 70% probability of belonging to class 1 and a 30% probability of belonging to class 0.
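For reference, here is a minimal sketch of where those numbers come from in scikit-learn. The data is a synthetic stand-in generated with `make_classification` (an assumption for illustration, not the real loan data), and the column order of `predict_proba` follows `model.classes_`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the loan data (hypothetical; replace with real features).
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X, y)

# One row per record; columns follow model.classes_, which is [0, 1] here.
proba = model.predict_proba(X[:3])
p_class1 = proba[:, 1]           # estimated probability of class 1 per record
labels = model.predict(X[:3])    # hard labels: the class with the larger probability

print(model.classes_)
print(proba)
print(labels)
```

Each row of `proba` sums to 1, so a 70% probability for class 1 always comes with 30% for class 0. This is a single point estimate per record; by itself it says nothing about how uncertain that estimate is.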

My questions are as follows:

a) Do likelihood and confidence mean the same thing? Is there a simple explanation that a layman like me can understand?

b) Can you suggest a tutorial on how to report the confidence of predictions? When I use scikit-learn for logistic regression, I don't know how to report confidence for the predictions.

c) How can we generate the interval? In the scikit-learn logistic regression tutorials that I find online, I only see the probability/likelihood of an instance having label 1 or label 0. Can you guide me on how to generate an interval?
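One possible way to get an interval around each predicted probability, sketched here under assumptions, is a percentile bootstrap: refit the model on resampled training data many times and take percentiles of the resulting predictions. This is a swapped-in general technique, not something scikit-learn's `LogisticRegression` provides directly. The data is again synthetic, and `n_boot`, `X_new`, and the 95% level are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the loan data (hypothetical).
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, y_train = X[:400], y[:400]
X_new = X[400:405]                       # stand-in for unseen records

rng = np.random.default_rng(0)
n_boot = 200                             # illustrative; more replicates give smoother intervals
n_train = X_train.shape[0]
boot_probs = np.empty((n_boot, X_new.shape[0]))

for b in range(n_boot):
    # Resample training rows with replacement and refit the model.
    idx = rng.integers(0, n_train, size=n_train)
    fitted = LogisticRegression(max_iter=1000).fit(X_train[idx], y_train[idx])
    boot_probs[b] = fitted.predict_proba(X_new)[:, 1]

# Point estimate from the full training set, plus a 95% percentile interval.
point = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_new)[:, 1]
lower, upper = np.percentile(boot_probs, [2.5, 97.5], axis=0)

for p, lo, hi in zip(point, lower, upper):
    print(f"P(class 1) = {p:.2f}, 95% bootstrap interval [{lo:.2f}, {hi:.2f}]")
```

Two caveats: the interval describes uncertainty in the estimated probability due to sampling of the training data, not a guarantee about any individual outcome, and scikit-learn applies an L2 penalty by default (`C=1.0`), so these are intervals for the regularized fit.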

The Great
  • 3,272
  • 1) You’re asking about the “likelihood” of an event, but that does not appear to be part of your requirements. Did your boss ask for a “likelihood” of an event? // 2) Consider asking your boss for clarification on what exactly is needed. // 3) Do you have an explicit requirement to give a confidence interval? I can think of a way to report the results without giving a confidence interval, yet I would answer the questions about predicted class membership and confidence about loan default.
  • – Dave Jan 18 '22 at 01:30
  • This seems like a perfect situation to predict tendency, not membership. Making a hard classification requires knowledge of what’s at stake. What do you lose when someone you predicted as safe winds up defaulting? What do you lose when someone safe is falsely regarded as risky and not approved for a loan? Those will influence how you use the probability output of your logistic regression. You don’t have to use the software default of $0.5$ as your decision threshold.
  • – Dave Jan 18 '22 at 01:35
  • Hi @Dave - Yes, I want to predict the likelihood and also report the confidence of my predictions. For example, I can say that person A has a 90% likelihood of his loan being approved. But how confident am I in the conclusion that we draw from the model results? That's what they wish to know: likelihood and confidence.
  • – The Great Jan 18 '22 at 02:28
  • Yes, I do have an explicit requirement to report results with a confidence interval. Can you direct me to any resources that cover likelihood determination as well as the confidence of the results?
  • – The Great Jan 18 '22 at 02:30
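
The point raised in the comments about not being tied to a $0.5$ threshold can be sketched as follows. If wrongly flagging a safe applicant costs `cost_false_positive` and missing a defaulter costs `cost_false_negative`, flagging is the cheaper action in expectation whenever $p > c_{FP} / (c_{FP} + c_{FN})$, where $p$ is the predicted probability of default. The function name, the cost values, and the probabilities below are all hypothetical, and the rule assumes the predicted probabilities are reasonably well calibrated.

```python
import numpy as np

def cost_based_threshold(cost_false_positive, cost_false_negative):
    """Probability above which flagging a record has lower expected cost.

    Flagging costs (1 - p) * cost_false_positive in expectation, while not
    flagging costs p * cost_false_negative; the two are equal at this value.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)

# Hypothetical predicted default probabilities for three applicants.
p_default = np.array([0.10, 0.25, 0.60])

# Hypothetical costs: a missed default is four times as costly as a wrongly rejected loan.
threshold = cost_based_threshold(cost_false_positive=1.0, cost_false_negative=4.0)
flag_as_risky = p_default > threshold

print(threshold)        # 0.2
print(flag_as_risky)    # [False  True  True]
```

With these costs the second applicant is flagged at a 25% default probability, even though a $0.5$ cutoff would have labelled that record as safe.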