
The two results I got for bagging and random forest are shown below. Calculating the mean MSE from bootstrapping also gives a lower mean MSE for bagging than for random forest. Is bagging the better predictive model in this case?

best_bag_model_all

Call:
 randomForest(formula = Balance ~ ., data = bank, mtry = 60, importance = TRUE)
               Type of random forest: regression
                     Number of trees: 500
No. of variables tried at each split: 60

          Mean of squared residuals: 259811.3
                    % Var explained: 97.8

best_rf_model_all

Call:
 randomForest(formula = Balance ~ ., data = bank, mtry = 8, importance = TRUE)
               Type of random forest: regression
                     Number of trees: 500
No. of variables tried at each split: 8

          Mean of squared residuals: 279642.4
                    % Var explained: 97.63

1 Answer


Because you are not splitting the data into training and test sets. Bagging will always give a better in-sample fit, because it considers every predictor at each split (mtry = 60 here), while a random forest, which restricts each split to a random subset of predictors (mtry = 8 here), tends to give better out-of-sample predictions.
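A sketch of what such a split might look like, staying with the randomForest calls from the question. The data frame bank, the response Balance, and the mtry values are taken from the output above; the 70/30 split fraction is my own arbitrary choice:

```r
library(randomForest)
set.seed(1)

# Hold out 30% of the rows as a test set (split fraction is arbitrary)
train <- sample(nrow(bank), size = floor(0.7 * nrow(bank)))

# Bagging: mtry equals the number of predictors, so every split sees all variables
bag <- randomForest(Balance ~ ., data = bank[train, ], mtry = 60)
# Random forest: only a random subset of predictors is tried at each split
rf  <- randomForest(Balance ~ ., data = bank[train, ], mtry = 8)

# Compare test-set MSE rather than the fit on the data the models were trained on
mean((predict(bag, newdata = bank[-train, ]) - bank$Balance[-train])^2)
mean((predict(rf,  newdata = bank[-train, ]) - bank$Balance[-train])^2)
```

Comparing the two test-set MSE values (rather than the training-set output printed above) is what tells you which model actually predicts better on unseen data.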