Let's assume I have 24 random forest models, each of which produces a class prediction. I am currently using simple majority voting to select the final prediction. Can someone please guide me toward other options? Specifically, weighted voting or using another model on top of these? I see that out of the 24 random forests, some have quite bad recall and F1 scores, and some are also badly calibrated. I therefore want a voting method that takes this into account.
Here are a couple of options that would be straightforward for you to implement:
What you are doing is called hard voting, i.e., each model casts a vote for a class outcome. A likely better option is soft voting, in which each model's vote is weighted by its confidence. I've explained this in more detail here: Hard voting, soft voting in ensemble based methods
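As a rough sketch of what soft voting could look like with your already-fitted forests: average the class probabilities across models instead of counting votes. Here `models`, `X_val`, `y_val`, and `X_test` are placeholders for your fitted estimators and data, and weighting each model by its validation macro F1 is just one possible scheme (it directly addresses your concern about the weaker models):

```python
import numpy as np
from sklearn.metrics import f1_score

def soft_vote(models, X, weights=None):
    """Average predicted class probabilities across models.

    weights: optional per-model weights (e.g. validation F1 scores);
    defaults to equal weighting. All models must share the same
    class ordering in `classes_`.
    """
    # Shape: (n_models, n_samples, n_classes)
    probas = np.stack([m.predict_proba(X) for m in models])
    avg = np.average(probas, axis=0, weights=weights)
    return models[0].classes_[np.argmax(avg, axis=1)], avg

# Hypothetical weighting scheme: weight each forest by its macro F1
# on a held-out validation set, so poorly performing models count less.
weights = [f1_score(y_val, m.predict(X_val), average="macro") for m in models]
y_pred, y_proba = soft_vote(models, X_test, weights=weights)
```

One caveat: soft voting implicitly assumes the probabilities are comparable across models, so since you mention some forests are badly calibrated, it may be worth wrapping those in scikit-learn's `CalibratedClassifierCV` before averaging.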
Another way to integrate the information in the separate models is stacking. This involves training a different model on the predictions of the individual models in the ensemble.
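A minimal sketch of stacking on top of already-fitted models, assuming you have a held-out set (`X_meta`, `y_meta` below, both placeholders) that none of the 24 forests saw during training; otherwise the meta-model will overfit to the base models' training predictions. Logistic regression is a common choice of meta-learner, but any classifier works:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def meta_features(models, X):
    # Concatenate each model's class probabilities into one feature
    # vector per sample: shape (n_samples, n_models * n_classes).
    return np.hstack([m.predict_proba(X) for m in models])

# Train the meta-model on the base models' predictions for held-out data.
meta_model = LogisticRegression(max_iter=1000)
meta_model.fit(meta_features(models, X_meta), y_meta)

y_pred = meta_model.predict(meta_features(models, X_test))
```

If you were training everything from scratch instead, scikit-learn's `StackingClassifier` handles the out-of-fold bookkeeping for you.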
In general, there are only a few strategies commonly used to combine different machine learning models, discussed well in this thread: Bagging, boosting and stacking in machine learning