Let’s restrict our consideration to binary outcomes.
I have a friend who is terrible at predicting the future, always predicting the opposite of what winds up happening. For instance, my friend predicted that Cincinnati would win the Super Bowl, but they lost. My friend predicted that Croatia would beat France in the 2018 World Cup final, but France won. Et cetera...
That is, my friend's predictions give me excellent predictive accuracy if I go with the opposite of what he predicts. “My friend always gets it wrong, so bet on Los Angeles to beat Cincinnati,” I should have thought a few weeks ago. “Bet on France to beat Croatia,” I should have thought four years ago.
What is wrong with applying this logic to a classification or probability machine learning model that consistently predicts the opposite of the observed outcome, as indicated by $AUC \approx 0$? I realize that the coefficient estimates should not be trusted, but if we care about prediction above all else, the danger is not so apparent to me.
That is, if I get $AUC = 0.2$ from the probability predictions $p_i$ of a model, why not take $1-p_i$ as my final predicted probability?
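To make this concrete, here is a minimal sketch (assuming scikit-learn; the “backwards” model is synthetic and purely illustrative) showing that flipping the scores to $1-p_i$ flips the AUC to $1-AUC$:

```python
# Minimal sketch: a synthetic "backwards" scorer whose scores rank the
# positive class *below* the negative class, giving AUC < 0.5.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)           # binary outcomes
logits = -2.0 * y + rng.normal(size=1000)   # scores pushed the wrong way
p = 1.0 / (1.0 + np.exp(-logits))           # "predicted probabilities"

auc = roc_auc_score(y, p)
auc_flipped = roc_auc_score(y, 1 - p)
print(auc, auc_flipped, auc + auc_flipped)  # the two AUCs always sum to 1
```

Since the AUC is a pure ranking metric, it is symmetric under this flip, which is exactly what makes the “just use $1-p_i$” move tempting.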
EDIT
$AUC<0.5$, McFadden’s $R^2<0$, or whatever metric convinces you that you are consistently predicting the opposite of what you observe. (I have only ever seen this flipping suggested when the $AUC$ is less than $0.5$, and I’m pretty sure that flipping is how some popular software packages approach ROC and AUC calculations.)
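For the McFadden's $R^2$ case, here is a hypothetical check (the helper `mcfadden_r2` is my own, computing the pseudo-$R^2$ directly from predicted probabilities against an intercept-only benchmark): a confidently backwards model gives a negative value, and flipping the probabilities makes it positive.

```python
# Hypothetical check of McFadden's R^2 computed from predicted probabilities:
# a "backwards" model scores worse than the base-rate benchmark (R^2 < 0).
import numpy as np

def mcfadden_r2(y, p):
    eps = 1e-12
    p = np.clip(p, eps, 1 - eps)
    ll_model = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    p_null = np.full_like(p, y.mean())      # intercept-only (base rate) benchmark
    ll_null = np.sum(y * np.log(p_null) + (1 - y) * np.log(1 - p_null))
    return 1 - ll_model / ll_null

y = np.array([0, 0, 0, 1, 1, 1])
p_backwards = np.array([0.9, 0.8, 0.7, 0.2, 0.2, 0.1])  # confidently wrong
print(mcfadden_r2(y, p_backwards))      # negative: worse than the base rate
print(mcfadden_r2(y, 1 - p_backwards))  # positive after flipping
```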