I have already referred to this post, so please don't mark this as a duplicate.
I am working on a binary classification problem using algorithms such as random forest, extra trees, and logistic regression. The dataset shape is (977, 6) and the class ratio is 77:23.
In terms of our metric of interest, F1, random forest performed best, followed by extra trees, with logistic regression last.
However, in terms of calibration, I see that logistic regression is the best calibrated (not surprising), followed by extra trees, with random forest last.
But my question is: why does logistic regression have a higher Brier score loss than random forest (which, unlike logistic regression, is not inherently well calibrated)?
Shouldn't logistic regression have the smallest Brier score loss, followed by extra trees, with random forest last?
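For clarity, by Brier score loss I mean the usual mean squared error between the predicted positive-class probability and the observed 0/1 outcome:

$$\mathrm{BS} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{p}_i - y_i\right)^2,$$

where $\hat{p}_i$ is the predicted probability that observation $i$ belongs to the positive class and $y_i \in \{0, 1\}$ is its true label, so lower is better.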
Please find the graphs below.



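In case it helps, here is a minimal, self-contained sketch of the kind of comparison I am doing. The data here is synthetic (generated to roughly match my 977 x 6, 77:23 setup), not my actual dataset, and the models use mostly default hyperparameters:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, brier_score_loss
from sklearn.calibration import calibration_curve

# Synthetic stand-in for my 977 x 6 dataset with a roughly 77:23 class ratio
X, y = make_classification(n_samples=977, n_features=6,
                           weights=[0.77, 0.23], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=42)

models = {
    "random_forest": RandomForestClassifier(random_state=42),
    "extra_trees": ExtraTreesClassifier(random_state=42),
    "log_reg": LogisticRegression(max_iter=1000),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]    # predicted probability of the positive class
    preds = (proba >= 0.5).astype(int)           # default 0.5 threshold for the F1 comparison
    f1 = f1_score(y_test, preds)
    brier = brier_score_loss(y_test, proba)
    # Points of the reliability diagram (what I plot in the calibration graphs above)
    frac_pos, mean_pred = calibration_curve(y_test, proba, n_bins=10)
    print(f"{name}: f1={f1:.3f}, brier={brier:.3f}")
```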