Logistic regression predictions don't work

Question

I have a problem with logit models: when I create a confusion matrix, it simply displays the actual values in the first row, and the second row never contains any numbers. I have fitted many models (one for each country, where I analyse the occurrence of an event: 1 = it happens, 0 = it does not happen), and every logit model has this problem.

I guess I am doing something wrong. So far I have checked all the necessary assumptions; the only things I did not do were k-fold cross-validation and splitting the data into training and test sets. Could that be the reason? Could someone explain why?

Related CV posts to help illustrate that logistic regression makes probabilistic predictions (probabilities between 0 and 1) and how to convert them to 0/1 labels: here, here and here. — dipetkov, Apr 24 '22 at 10:41
Logistic regression is not meant to be used as a classifier, and the choice of accuracy measures should reflect that. See https://www.fharrell.com/post/mlconfusion/ and use proper continuous accuracy scores. Analysis of binary outcomes is all about estimating tendencies (probabilities) not about forced choice classification. — Frank Harrell, Apr 24 '22 at 12:00

score 0 · Answer 1 · answered Apr 24 '22 at 09:17


The inverse logit function produces continuous values strictly between 0 and 1, while a confusion matrix is based on predictions in $\{0, 1\}$.

So either you dichotomize the continuous predictions (which involves choosing a threshold), or you use scoring measures such as log loss or AIC that work directly with continuous predictions.
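To make the two options concrete, here is a minimal Python sketch, not the asker's actual code. The linear predictors and outcomes are made-up illustration data, the helper names (`inverse_logit`, `dichotomize`, `confusion_matrix`, `log_loss`) are hypothetical, and the 0.5 threshold is an arbitrary assumption rather than a recommendation.

```python
import math


def inverse_logit(z):
    """Map a linear predictor z to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def dichotomize(probs, threshold=0.5):
    """Turn probabilities into 0/1 labels; the threshold is a user choice."""
    return [1 if p >= threshold else 0 for p in probs]


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels."""
    counts = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        counts[actual][predicted] += 1
    return counts


def log_loss(y_true, probs):
    """Mean negative log-likelihood; works directly on the probabilities."""
    total = sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, probs)
    )
    return -total / len(y_true)


# Made-up data: linear predictors from some fitted logit, and observed outcomes.
linear_predictors = [-2.0, -0.5, 0.3, 1.5, -1.0, 2.2]
y_true = [0, 0, 1, 1, 1, 1]

probs = [inverse_logit(z) for z in linear_predictors]  # continuous values in (0, 1)

# Option 1: dichotomize first, then tabulate.
y_pred = dichotomize(probs, threshold=0.5)
cm = confusion_matrix(y_true, y_pred)
print(cm)  # [[2, 0], [1, 3]]

# Option 2: score the probabilities themselves, no threshold needed.
loss = log_loss(y_true, probs)
print(loss)
```

Tabulating `probs` against `y_true` without the `dichotomize` step is what produces a table that does not look like a confusion matrix, because almost every probability is a distinct value. Changing `threshold` changes `cm` but leaves `loss` untouched.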

Michael M


So is the accuracy figure that I get expected to be wrong? – Maria R Apr 24 '22 at 10:57
