
I have a few questions concerning the selection of hyperparameters for making predictions after cross-validation.

If I understand correctly, during CV you just create folds (inner and outer for a nested CV), and then, for each fold, you train your algorithm and make predictions on the test set, right? Then, to estimate the unbiased performance of your model, you calculate the average and SD over all folds. But what do you do with this "average" performance? I mean, I can't extract globally "optimized" hyperparameters from it, I guess... and I can't make predictions on a whole new dataset with it either, can I?

Could someone explain the mechanism behind this to me?

For example, I manually computed a nested cross-validation (5 inner and 5 outer folds) on my data, and I obtained results on my validation sets to select the best model. These are average ± SD performances [results table omitted]. What can I do with this?
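Schematically, what I computed was roughly along these lines (a minimal scikit-learn sketch; the classifier, the parameter grid and the data are placeholders, not my actual pipeline):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)  # stand-in data

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # 5 inner folds
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)  # 5 outer folds

# Inner loop: hyperparameter search scored by AUC on the validation folds
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"max_depth": [3, 5, None], "n_estimators": [100, 300]},
    scoring="roc_auc",
    cv=inner_cv,
)

# Outer loop: AUC of the whole tuning procedure on each outer test fold
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)
print(f"nested CV AUC: {outer_auc.mean():.3f} +/- {outer_auc.std():.3f}")
```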

Nicolas
  • Not really, but it helps... I still don't really understand how to select the model during cross-validation. Should I look at the results on the validation folds (average AUC) for each tuned model, or just at the results on each test fold (average AUC)? Then, each fold k has results for each model with a specific best combination of hyperparameters; do I need to check that the results have p < 0.05? (Michael said no and I don't really understand why; is it because we are in an exploratory setting?) When I want to save my model for predictions on new data, which parameters should I choose? – Nicolas Aug 11 '23 at 12:47

1 Answer


You almost got it right. The key is that you usually don't do anything with the CV models except use them to make decisions. Afterwards, you train a final model on all folds, i.e., on the full data set.
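As a minimal sketch of this two-step logic (scikit-learn, with a placeholder classifier, grid and data; nothing here is specific to your setup): the CV inside the grid search only picks the hyperparameters, and the winning configuration is then refit on the full data set and used for new predictions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=500, random_state=0)     # training data (stand-in)
X_new, _ = make_classification(n_samples=20, random_state=1)  # "new" data to predict on

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    refit=True,  # CV only picks C; the winner is then refit on ALL the data
)
search.fit(X, y)

final_model = search.best_estimator_            # a single model trained on the full data set
probs = final_model.predict_proba(X_new)[:, 1]  # predictions on new data
```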

Regarding your first comment: I am not sure I got your idea. The typical reasoning goes like this: for each of the m parameter combinations, I calculate the CV AUC (= average AUC over all folds). A good candidate model then uses the parameter combination with the best CV AUC.
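Spelled out as code, the selection step could look like this sketch (again with placeholder data and model; here m = 4 values of the regularization parameter C):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ParameterGrid, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)   # stand-in data

results = []
for params in ParameterGrid({"C": [0.01, 0.1, 1, 10]}):     # the m parameter combinations
    aucs = cross_val_score(LogisticRegression(max_iter=1000, **params),
                           X, y, cv=5, scoring="roc_auc")
    results.append((params, aucs.mean()))                    # CV AUC = average over the folds

best_params, best_cv_auc = max(results, key=lambda r: r[1])  # candidate with the best CV AUC
print(best_params, round(best_cv_auc, 3))
```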

Regarding your second comment: no, for several reasons. First: what is the null hypothesis? (It is not clear.) Second: the AUC values of different folds are highly dependent, which invalidates the basic assumptions of most statistical tests.

Michael M
  • I was thinking about something. Bear with me, I need time to understand, so let me know whether I got this right: if I compute, for each fold, the AUC of the best combination of hyperparameters, I can base my decision on the combination with the best AUC and the lowest SD, right? And then create my final model – Nicolas Aug 11 '23 at 11:09
  • Second question: do I need to test whether the AUC of each fold has a p-value < 0.05 (with a Wilcoxon test for a binary prediction, for example)? – Nicolas Aug 11 '23 at 11:16
  • Thanks Michael, that is clearer to me now. But at that point, when you compare the models, do you use the AUC obtained on the validation set or on the test set? Because if you select your algorithm on the test set, you introduce a selection bias, since all models should be selected on the validation set – Nicolas Aug 11 '23 at 12:14
  • @Nicolas sometimes the CV-fitted parameters are averaged to obtain the final parameter set, instead of re-fitting on all folds – Aksakal Aug 11 '23 at 13:19