
Let's assume I have 24 random forest models. Each of the 24 random forest models produces a class prediction. I am currently using simple majority voting to select the final prediction. Can someone please guide me in the direction of other options that I can use? Specifically, weighted voting or using another model on top of this? I see that out of the 24 random forests, some have quite bad recall and F1 scores, and some are also badly calibrated. Thus, I want to develop a voting method that takes this into consideration.

mkt
MSKO

1 Answer


Here are a couple of options that would be straightforward for you to implement:

  1. What you are doing is called hard voting, i.e. each model casts a single vote for a class. A likely better option is soft voting, in which each model contributes its predicted class probabilities and these are averaged, so a model's vote is effectively weighted by its confidence. I've explained this in more detail here: Hard voting, soft voting in ensemble based methods
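A minimal sketch of soft voting with scikit-learn, under illustrative assumptions: 5 forests instead of 24, and a synthetic dataset standing in for your data. Each model's `predict_proba` output is averaged and the argmax of the averaged probabilities is the final prediction.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative data and models; substitute your own 24 fitted forests.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forests = [
    RandomForestClassifier(n_estimators=25, random_state=i).fit(X_tr, y_tr)
    for i in range(5)
]

# Soft voting: average predicted class probabilities across models,
# then pick the class with the highest averaged probability.
avg_proba = np.mean([m.predict_proba(X_te) for m in forests], axis=0)
soft_votes = avg_proba.argmax(axis=1)
```

If some models are badly calibrated, it can help to wrap them in `sklearn.calibration.CalibratedClassifierCV` before averaging, since soft voting trusts each model's probability estimates at face value.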

  2. Another way to integrate the information in the separate models is stacking. This involves training a different model (a meta-learner) on the predictions of the individual models in the ensemble, which lets it learn to down-weight the models with poor recall or calibration.
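A sketch of stacking using scikit-learn's `StackingClassifier`, again with illustrative data and a reduced number of forests. Note that `StackingClassifier` refits the base estimators internally with cross-validation, so it trains from scratch rather than reusing already-fitted models; the logistic regression on top is one common choice of meta-learner, not the only one.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative data; substitute your own.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base forests (5 here for brevity); the meta-learner is trained on
# their out-of-fold predicted probabilities.
stack = StackingClassifier(
    estimators=[
        (f"rf{i}", RandomForestClassifier(n_estimators=25, random_state=i))
        for i in range(5)
    ],
    final_estimator=LogisticRegression(),
    stack_method="predict_proba",
)
stack.fit(X_tr, y_tr)
preds = stack.predict(X_te)
```

The meta-learner's coefficients give you an interpretable picture of how much each base forest is trusted.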

In general, there are only a few strategies commonly used to integrate different machine learning models; they are discussed well in this thread: Bagging, boosting and stacking in machine learning

mkt