I am working on a binary classification problem with relatively few instances (e.g. ~30 instances out of which ~7 are positives).
I have noticed that with 2-fold CV, the average classification performance of the best performing model is better than that of the best performing model with 5-fold CV.
In fact, the best performing model in 2-fold CV gets the following scores across the two folds:
[0.82, 0.82] (avg. = 0.82).
That model is different from the best one I get with 5-fold CV, which yields the following AUC scores:
[0.4, 1.0, 0.75, 0.75, 0.25] (avg. = 0.64).
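To put the fold-to-fold spread in numbers, here is a minimal Python sketch using only the standard library. The variable names are illustrative, and the standard error is only a rough guide because CV folds share training data and are not independent. Note that the 5-fold values as listed average to 0.63, so the 0.64 quoted above presumably comes from unrounded fold scores.

```python
from math import sqrt
from statistics import mean, stdev

# Fold scores copied from the question (as printed, i.e. possibly rounded).
scores_2fold = [0.82, 0.82]
scores_5fold = [0.4, 1.0, 0.75, 0.75, 0.25]

mean_2 = mean(scores_2fold)
mean_5 = mean(scores_5fold)

# Sample standard deviation of the 5-fold scores.
sd_5 = stdev(scores_5fold)

# Rough standard error of the 5-fold mean. This treats the folds as
# independent, which they are not, so read it as an approximation.
se_5 = sd_5 / sqrt(len(scores_5fold))

print(f"2-fold mean: {mean_2:.2f}")
print(f"5-fold mean: {mean_5:.2f}, sd: {sd_5:.2f}, rough SE: {se_5:.2f}")
```

The 5-fold scores have a standard deviation of about 0.30, so the gap between the two averages is small relative to the noise in the estimate.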
This brings me to the following questions: Which model should I use? And why would I ever get a better model when training with less data?
[0.16, 0.31] (not [0.12, 0.41]). Not sure why we get different numbers. Which implementation did you use? I am using the formula here: http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Wilson_score_interval which was implemented here (translated with ups = # pos and downs = n - pos) – Amelio Vazquez-Reina Jul 10 '13 at 20:24

require(Hmisc); ?binconf. Type binconf to see the code. I think it agrees with the reference you listed. Code is pretty easy to read. – Frank Harrell Jul 10 '13 at 20:57
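For anyone wanting to check the numbers in these comments, the Wilson score interval from the linked Wikipedia section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `wilson_interval` is made up for illustration (it is not the Hmisc `binconf` code), and because the start of the first comment is cut off, the inputs of 7 positives out of 30 instances are a guess taken from the question.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (hypothetical helper)."""
    # Two-sided critical value of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed inputs: 7 positives out of 30 instances, as in the question.
lower, upper = wilson_interval(7, 30)
print(f"[{lower:.2f}, {upper:.2f}]")
```

With those assumed inputs the sketch gives roughly [0.12, 0.41], which matches the second of the two intervals quoted in the first comment.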