
In leave-one-out cross-validation, at each iteration my test set consists of a single data point, precisely the one "left out", whose observed value is compared with the value predicted using coefficients estimated from the training set. Normally, for the training set, one would compute the $R^2$ over several observations and fitted values. For the test set, how should one compute the $R^2$ from a single pair of numbers, one observed and one predicted? Is there a common convention for tackling this?

gunes
ouranos

1 Answer


Gather your predictions for the entire data set (since each prediction is generated via CV, by a model that never saw that point, it is unbiased), and calculate a single $R^2$ over all true-vs-predicted pairs.
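A minimal sketch of this pooling with scikit-learn, on hypothetical synthetic data: `cross_val_predict` with `LeaveOneOut` produces exactly one held-out prediction per point, and `r2_score` is then computed once over all of them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Hypothetical data: a linear signal plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = X @ np.array([1.5, -2.0]) + rng.normal(scale=0.5, size=30)

# One pooled prediction per point, each made by a model fit
# on the other n-1 points (leave-one-out)
y_pred = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())

# A single R^2 over all observed-vs-predicted pairs
r2 = r2_score(y, y_pred)
print(r2)
```

Note that this yields one $R^2$ for the whole procedure rather than a (meaningless) per-fold value.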

gunes
  • I see, that solves it. I guess I could bootstrap on this to estimate the uncertainty of this $R^2$, right? – ouranos Apr 28 '20 at 09:00
  • Yes, you can do it. – gunes Apr 28 '20 at 09:08
  • Do you have a reference showing that one can compare different models like that? (i.e. for each model: perform leave-one-out cross-validation, gather all predictions on left-out data with the corresponding true labels, and compute $R^2$ for each model based on all y_pred and y_true) – Ggjj11 Oct 17 '22 at 12:01
  • @Ggjj11 one Google search ends up here: https://stats.stackexchange.com/questions/12412/how-do-you-generate-roc-curves-for-leave-one-out-cross-validation And there is no other way for ROC curves, for example. – gunes Nov 05 '22 at 15:42
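The bootstrap idea from the comments can be sketched as resampling the pooled (observed, predicted) pairs with replacement and recomputing $R^2$ on each resample; the function name, sample data, and CI settings below are illustrative assumptions, not from the thread.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical pooled LOOCV output: observed values and the
# corresponding held-out predictions, one per data point
rng = np.random.default_rng(1)
y_true = rng.normal(size=40)
y_pred = y_true + rng.normal(scale=0.3, size=40)

def bootstrap_r2_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for R^2 over (true, predicted) pairs."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample index pairs with replacement, keeping pairs aligned
        idx = rng.integers(0, n, size=n)
        stats.append(r2_score(y_true[idx], y_pred[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi

lo, hi = bootstrap_r2_ci(y_true, y_pred)
print(lo, hi)
```

Resampling pairs (rather than residuals) keeps the dependence between each observation and its prediction intact.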