We have generated an elastic net model on a small dataset, where we use gene expression data to calculate a biomarker score to discriminate patients with condition X vs controls.
The dataset is too small to carve out a meaningful hold-out validation set, so we use nested LOO-CV for performance estimation, and the model seems to work reasonably well.
Now, we want to recruit more patients to the study to validate our model further. The question my colleagues are asking is: can we somehow calculate how many patients we should be recruiting? (so the ML equivalent of a statistical power calculation, I guess)
My gut feeling is that the answer is no (the more samples, the better!). What we could do instead is decide what statistical tests we want to run on the biomarker scores and do a power calculation for those, e.g. if we will run a t-test comparing biomarker score distributions between controls and patients, we could calculate the sample size needed to have adequate power in that test.
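To make the idea concrete, here is a sketch of the power calculation I have in mind, using `statsmodels`. The effect size (Cohen's d = 0.8) is a placeholder; in practice we would estimate it from the biomarker score distributions in our pilot data.

```python
# Sample size for a two-sample t-test on biomarker scores (patients vs controls).
# effect_size=0.8 is a hypothetical value, not an estimate from our data.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.8,
                                   alternative="two-sided")
print(f"Samples needed per group: {n_per_group:.1f}")
```

Note this only powers the downstream test; it says nothing about whether the model's coefficients would stabilize with more training data.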
Am I thinking about this right? Does anyone have any pointers, or materials I could read?
Thanks!
EDIT: Further details on LOO-CV:
For each fold, we took one sample out and split the remaining set for hyperparameter tuning (also by LOO-CV), then used the best model to predict the one sample we had taken out. We compared both the hyperparameters and the fitted parameters across folds and saw that they were all quite similar. To get a final model, we re-trained on all our samples to create a final "equation" we could use to give scores to new samples (which is what we would like to validate).
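For clarity, the procedure above looks roughly like this in scikit-learn. The synthetic data, hyperparameter grid, and scoring choice are all placeholders standing in for our actual gene expression setup:

```python
# Nested LOO-CV: outer loop estimates performance, inner loop tunes hyperparameters.
# make_classification data and the small grid are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, LeaveOneOut

X, y = make_classification(n_samples=30, n_features=50, random_state=0)

param_grid = {"alpha": [0.1, 1.0], "l1_ratio": [0.2, 0.5, 0.8]}
preds = np.empty(len(y))
for train_idx, test_idx in LeaveOneOut().split(X):
    # Inner LOO-CV picks hyperparameters using only the n-1 training samples
    inner = GridSearchCV(ElasticNet(max_iter=10000), param_grid,
                         cv=LeaveOneOut(), scoring="neg_mean_squared_error")
    inner.fit(X[train_idx], y[train_idx])
    # Score the single held-out sample with the tuned model
    preds[test_idx] = inner.predict(X[test_idx])

# Final "equation": re-run the full tuning procedure on all samples
final = GridSearchCV(ElasticNet(max_iter=10000), param_grid,
                     cv=LeaveOneOut(), scoring="neg_mean_squared_error")
final.fit(X, y)
```

The `preds` array gives one unbiased-ish score per sample for performance estimation, while `final.best_estimator_` is the model we would apply to newly recruited patients.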