
I have two sets of predictions. One set has $R^2 = 0.57$, the other $R^2 = 0.51$.

Plotting histograms of the predictions shows that the set with $R^2 = 0.51$ looks more accurate than the set with $R^2 = 0.57$.

Also, the plot with the lower $R^2$ value captures the tails of the distribution.

Is there a connection between $R^2$ and the histogram of the predictions?

Or what other metrics could I use to estimate the error, or to assess how good my predictions are (apart from RMSE or MSE)?

[Figure: histograms of predicted vs. actual values for the two prediction sets, labeled $R^2 = 0.51$ (R = 0.51) and $R^2 = 0.57$ (R = 0.57)]

  • I would recommend that you first decide which functional of the unknown true distribution of the actuals you wish to predict (the conditional expectation, the conditional median, or another conditional quantile, etc.), and then pick an error measure that will elicit that functional. This may be helpful. – Stephan Kolassa Nov 08 '23 at 15:38

1 Answer


The histogram completely misses the pairing between predicted and observed values, so the histogram has nothing to do with the all-important residuals used to calculate $R^2$. For instance, if you permute the predicted values, you do not change the histogram, yet the prediction quality certainly changes.
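The permutation point can be checked directly. Here is a sketch with simulated data (the sample size and noise level are arbitrary); $R^2$ is computed with sklearn's `r2_score`:

```python
import numpy as np
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
y_true = rng.normal(size=200)
y_pred = y_true + rng.normal(scale=0.5, size=200)  # reasonably good predictions

print(r2_score(y_true, y_pred))

# Permuting the predictions leaves their histogram exactly unchanged...
y_perm = rng.permutation(y_pred)
assert np.array_equal(np.sort(y_perm), np.sort(y_pred))

# ...but destroys the pairing with the observations, so R^2 collapses.
print(r2_score(y_true, y_perm))
```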

In fact, you could predict the correct set of values and still have terrible predictive ability, because the predictions are not paired with the observations to which they should correspond. For instance, consider the true values to be $(1, 2, 3, 4, 5)$ and the predictions to be $(5, 1, 4, 2, 3)$. These distributions are identical, yet the predictions are far from perfect. Contrast that with predictions of $(2, 2.5, 3, 3.5, 4)$, which have a different distribution but are much better predictions.
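These two toy cases can be verified with `r2_score`:

```python
from sklearn.metrics import r2_score

y_true = [1, 2, 3, 4, 5]

# Same values as y_true, but mispaired: identical histogram, poor fit.
shuffled = [5, 1, 4, 2, 3]
print(r2_score(y_true, shuffled))  # about -1.6

# Different distribution, but each prediction is close to its observation.
shrunken = [2, 2.5, 3, 3.5, 4]
print(r2_score(y_true, shrunken))  # 0.75
```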

While there is some value to having predictions that vary the way the true observations do, you will find it much more informative to graph a scatterplot of the predicted and true values.
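A minimal sketch of such a scatterplot (matplotlib; the simulated `y_true` and `y_pred` are placeholders for your own observations and predictions):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this if plotting to screen
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
y_true = rng.normal(size=200)                         # placeholder observations
y_pred = y_true + rng.normal(scale=0.7, size=200)     # placeholder predictions

fig, ax = plt.subplots()
ax.scatter(y_true, y_pred, alpha=0.5)

# 45-degree reference line: perfect predictions would fall exactly on it.
lims = [min(y_true.min(), y_pred.min()), max(y_true.max(), y_pred.max())]
ax.plot(lims, lims, linestyle="--")
ax.set_xlabel("Observed")
ax.set_ylabel("Predicted")
fig.savefig("pred_vs_obs.png")
```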

Dave
  • +1. In addition, your predictions should vary less than the actuals, because the actuals are also affected by noise, which you cannot and should not be predicting. – Stephan Kolassa Nov 08 '23 at 15:36
  • Interesting. But in my case I am more interested in the tails, because those are the failed specimens. So I was thinking of using the model which predicts the tails better (the first plot, with the lower $R^2$). Do you think that makes sense? – ggoogle userr Nov 08 '23 at 15:42
  • @ggoogleuserr The histogram gives you no sense of whether or not tail points are predicted well. Sure, you get more extreme points in one set of predictions than the other, but that is not a guarantee of those extreme predictions corresponding to the extreme observations. // If you are interested in predicting the extremes at the expense of predicting the middle values, that warrants its own separate question. – Dave Nov 08 '23 at 15:44
  • @Dave Thanks. I should check whether or not the predicted extremes correspond to the actual extremes.

    But the plots I have shown are not an extreme case. I am adding a variable to my model, and with its addition $R^2$ drops, but the (predicted and actual) histogram tails start matching. So I am guessing the overall performance of the model drops but the extreme values are better predicted (provided I first confirm that the predicted extreme values correspond to the actual extreme values). – ggoogle userr Nov 08 '23 at 15:53
  • @ggoogleuserr How are you calculating $R^2$ to get it dropping upon adding a variable? Are you considering an out-of-sample or adjusted $R^2$, for instance? // Yes, your predicted extreme values probably correspond with the observed extreme values, but until you use an analysis technique that considers this pairing, you don't know that. // Some of the trouble with predicting the extreme values is that you have to consider if extreme values are predicted to be extreme but also if extreme predictions only correspond with extreme observations. – Dave Nov 08 '23 at 15:54
  • I am adding the variable at the beginning of the analysis, so there are no adjustments or out-of-sample considerations. $R^2$ is calculated using sklearn's r2_score. I'll keep this information in mind. Thanks, Dave. – ggoogle userr Nov 08 '23 at 16:15
  • @ggoogleuserr What kind of model are you using? The sklearn $R^2$ should not decrease in-sample when you add a variable to a linear model. – Dave Nov 08 '23 at 16:17
  • Your predictions are probably conditional expectations; or at least your model attempts to output these. Almost by definition, "extreme" observations are observations that do not come from extreme conditional expectations, but are simply cases of extreme noise. Thus you should not expect extreme observations to be predicted by extreme predictions. This is actually just a different way of wording my earlier comment above: do not conflate predictable signal with unpredictable noise. If you need to get extremes under control, use either quantile predictions or extreme value theory. – Stephan Kolassa Nov 08 '23 at 23:12
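One crude way to run the check discussed in the comments, namely whether the predicted extremes correspond to the observed extremes, is to compare the indices of the largest observations and the largest predictions. A sketch with simulated data (the top-10% cutoff is an arbitrary assumption; substitute your own arrays):

```python
import numpy as np

rng = np.random.default_rng(2)
y_true = rng.normal(size=500)                         # placeholder observations
y_pred = y_true + rng.normal(scale=0.8, size=500)     # placeholder predictions

k = int(0.10 * len(y_true))  # size of the "tail" to inspect

top_true = set(np.argsort(y_true)[-k:])  # indices of the largest observations
top_pred = set(np.argsort(y_pred)[-k:])  # indices of the largest predictions

# Fraction of the observed tail that the model also flags as extreme.
overlap = len(top_true & top_pred) / k
print(overlap)
```

A value near 1 means the extreme predictions really do land on the extreme observations; a value near the base rate of 0.10 means the tails match only by chance, even if the histograms look alike.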