
I have been learning XGBoost recently and have read a lot of slides and tutorials. However, most of them focus on how a single tree grows internally, e.g., how a split is chosen.

I couldn't figure out how the 'forest' grows in XGBoost; in the end, it is a forest-based algorithm.

My guess is that XGBoost relies entirely on hyperparameters to control the growth of the forest. When max_depth is reached, or no further split can be made because the gain is too small, a new tree is started.

When the maximum number of trees is reached, the forest stops growing.

Is my understanding correct?

Thanks.

user152503

1 Answer


You can find an explanation in the original paper here: https://arxiv.org/abs/1603.02754

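By default, XGBoost will create a forest with exactly num_boost_round trees (fewer if early stopping is enabled). The algorithm relies on gradient boosting, meaning that the trees are trained one at a time, each one fit to the gradient of the loss at the current ensemble's predictions, so every new tree corrects the errors of the trees before it. Growth of the current tree stops when max_depth has been reached or when no additional split reduces the loss by more than the minimum required gain (the gamma parameter).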
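To make the "one tree per round" idea concrete, here is a minimal sketch in plain Python. It is a toy, not XGBoost's implementation: it assumes squared-error loss, a single feature, and depth-1 trees (stumps), and it leaves out the second-order gradients and regularization that XGBoost uses. The helper names (`fit_stump`, `predict_stump`, `boost`) and the tiny dataset are made up for illustration; only `num_boost_round` mirrors a real XGBoost parameter name.

```python
# Toy gradient boosting with squared error. NOT XGBoost's actual algorithm:
# no second-order gradients, no regularization, only depth-1 trees.

def fit_stump(xs, residuals, min_gain=0.0):
    """Fit a depth-1 regression tree to the residuals.

    Returns (threshold, left_value, right_value). The threshold is None when
    no split reduces the squared error by more than min_gain; in that case
    both values are the overall mean of the residuals.
    """
    n = len(xs)
    mean = sum(residuals) / n
    base_err = sum((r - mean) ** 2 for r in residuals)
    best = (None, mean, mean)
    best_gain = min_gain
    # Every distinct x except the largest is a candidate threshold,
    # so both sides of the split are always non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        gain = base_err - err
        if gain > best_gain:
            best_gain = gain
            best = (t, lm, rm)
    return best


def predict_stump(stump, x):
    """Evaluate one stump at a single point x."""
    t, left_value, right_value = stump
    if t is None:
        return left_value
    return left_value if x <= t else right_value


def boost(xs, ys, num_boost_round=10, learning_rate=0.5):
    """Grow the 'forest': exactly one new tree per boosting round."""
    trees = []
    preds = [0.0] * len(xs)
    for _ in range(num_boost_round):
        # For squared error, the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        # The new tree is added to the ensemble, scaled by the learning rate.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return trees, preds


# Hypothetical toy data: low values for x <= 3, high values for x > 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
trees, preds = boost(xs, ys, num_boost_round=10)
mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)
print(len(trees), mse)
```

The outer loop in `boost` is the forest-level growth: the number of trees is fixed by the number of rounds, and each tree's own stopping rule (here a fixed depth of 1 plus a minimum gain) only decides how large that single tree gets.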

Alex R.