Questions tagged [boosting]

A family of algorithms combining weakly predictive models (weak learners) into a strongly predictive model (super learner). The most common approach is called gradient boosting, and the most commonly used weak learners are classification/regression trees. Another approach is likelihood-based boosting, and simple linear regression is another common choice of weak learner.

1462 questions
36
votes
1 answer

Mathematical differences between GBM, XGBoost, LightGBM, CatBoost?

There exist several implementations of the GBDT family of models, such as GBM, XGBoost, LightGBM, CatBoost. What are the mathematical differences between these different implementations? CatBoost seems to outperform the other implementations even by…
Metariat
  • 2,526
  • 4
  • 24
  • 43
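
For the question above, a minimal side-by-side sketch of fitting the same data with each library's scikit-learn-style regressor (assuming the xgboost, lightgbm and catboost packages are installed; the hyperparameter values are illustrative, not tuned):

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor  # "GBM"
    from xgboost import XGBRegressor
    from lightgbm import LGBMRegressor
    from catboost import CatBoostRegressor

    X, y = make_regression(n_samples=1000, n_features=10, random_state=0)

    models = {
        "sklearn GBM": GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, max_depth=3),
        "XGBoost":     XGBRegressor(n_estimators=200, learning_rate=0.1, max_depth=3),
        "LightGBM":    LGBMRegressor(n_estimators=200, learning_rate=0.1, max_depth=3),
        "CatBoost":    CatBoostRegressor(iterations=200, learning_rate=0.1, depth=3, verbose=0),
    }
    for name, model in models.items():
        model.fit(X, y)  # same interface; split finding, regularization and categorical handling differ internally
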
21
votes
1 answer

max_delta_step in xgboost

I am unable to fully understand how this parameter works from the description in the documentation: [max_delta_step [default=0]] Maximum delta step we allow each tree's weight estimation to be. If the value is set to 0, it means there is no…
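
For context, the parameter is usually recommended for highly imbalanced logistic-regression-style objectives, where it caps how far a single tree's leaf weights can move in one boosting step. A minimal sketch (the values shown are illustrative, not recommendations):

    from xgboost import XGBClassifier

    # max_delta_step=0 (the default) leaves the leaf-weight update uncapped;
    # a small positive value, e.g. 1, bounds each update and can stabilize
    # training when the classes are extremely imbalanced.
    clf = XGBClassifier(
        objective="binary:logistic",
        max_delta_step=1,
        scale_pos_weight=50,  # illustrative imbalance ratio
    )
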
11
votes
2 answers

lightgbm: Understanding why it is fast

In spite of googling to the best of my ability, unfortunately I am unable to find reasons why lightgbm is fast. The lightgbm documentation explains that the strategy followed is 'Leaf-wise (Best-first) Tree Growth' as opposed to 'Level-wise Tree…
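
Two reasons commonly given are histogram-based split finding (feature values are bucketed into a fixed number of bins) and leaf-wise growth bounded by a leaf count rather than a depth. A minimal sketch of the parameters that control them (the values shown are lightgbm's defaults):

    from lightgbm import LGBMClassifier

    clf = LGBMClassifier(
        max_bin=255,    # histogram-based splits: candidate thresholds per feature are capped
        num_leaves=31,  # leaf-wise (best-first) growth: complexity is bounded by leaf count
        max_depth=-1,   # -1 means depth is not limited; num_leaves does the limiting
    )
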
10
votes
1 answer

When do we use gblinear versus gbtree?

When do we choose gblinear boosting versus gbtree boosting in the xgboost library? I have meteorological rain data with lots of missing values.
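
As a rough sketch of the two options: gbtree fits regression trees as the weak learners (non-linear, learns a default direction for missing values at each split), while gblinear fits a regularized linear model, so it cannot capture non-linearities or interactions on its own:

    from xgboost import XGBRegressor

    tree_model   = XGBRegressor(booster="gbtree")    # boosted trees: non-linear, handles missing values natively
    linear_model = XGBRegressor(booster="gblinear")  # boosted linear model: roughly a regularized linear regression
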
6
votes
3 answers

How to compute $g_i$ and $h_i$, i.e. the first and second derivatives of the loss function in XGBoost?

In XGBoost, the objective function is $J(f_t)=\sum_{i=1}^{n}L(y_i,\hat{y}_i^{(t-1)}+f_t(x_i))+\Omega(f_t)+C$. If we take a Taylor expansion of the objective function and let…
Naomi
  • 620
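
For reference, $g_i$ and $h_i$ are the first and second derivatives of the loss with respect to the previous round's prediction, evaluated before the new tree $f_t$ is added:

$$g_i=\frac{\partial L\big(y_i,\hat{y}_i^{(t-1)}\big)}{\partial \hat{y}_i^{(t-1)}},\qquad h_i=\frac{\partial^2 L\big(y_i,\hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^2}.$$

For squared error $L=\tfrac{1}{2}(y_i-\hat{y}_i)^2$ this gives $g_i=\hat{y}_i^{(t-1)}-y_i$ and $h_i=1$; for logistic loss with $p_i=\sigma\big(\hat{y}_i^{(t-1)}\big)$ it gives $g_i=p_i-y_i$ and $h_i=p_i(1-p_i)$.
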
5
votes
1 answer

Difference between GBTree and GBDart

From my understanding, GBDart drops trees in order to combat over-fitting. However, when tuning with the xgboost package, rate_drop is 0 by default. I understand this is a parameter to tune, but what if the optimal model suggested rate_drop = 0?…
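
For reference, DART is selected through the booster parameter and controlled by its own dropout settings; with rate_drop = 0 no trees are dropped and it behaves essentially like gbtree. A minimal sketch (the values shown are illustrative):

    from xgboost import XGBRegressor

    dart_model = XGBRegressor(
        booster="dart",
        rate_drop=0.1,  # fraction of existing trees dropped before fitting each new tree
        skip_drop=0.5,  # probability of skipping the dropout entirely in a given round
    )
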
5
votes
1 answer

Why do my XGboosted trees all look the same?

I am running an XGBRegressor that is supposed to predict a certain reward associated with different actions, which are one-hot encoded. For testing purposes I am using a small depth=2 and only 10 trees: XGBRegressor(base_score=0.5, booster='gbtree',…
Muppet
  • 215
5
votes
1 answer

Does Xgboost use feature subsetting?

Random forests select a subset of the features at each node and only consider those features as candidate splits. Wikipedia says this is sometimes called feature bagging. Does XGBoost also use this technique in its tree learning when growing an…
D.W.
  • 6,668
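
For reference, XGBoost exposes column subsampling at several granularities; colsample_bynode is the closest analogue to the per-split feature bagging used by random forests. A minimal sketch (the 0.8 values are illustrative):

    from xgboost import XGBClassifier

    clf = XGBClassifier(
        colsample_bytree=0.8,   # fraction of features sampled once per tree
        colsample_bylevel=0.8,  # ... resampled at each depth level
        colsample_bynode=0.8,   # ... resampled at each split (closest to random-forest-style feature bagging)
        subsample=0.8,          # row subsampling per tree
    )
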
4
votes
1 answer

In XGBoost, are weights estimated for each sample and then averaged?

The weights in XGBoost are determined by gradient boosting. So each sample gets a weight, and as each leaf has multiple samples, initially each leaf has multiple weights. But as a single weight is needed for each leaf (based on the thread below,…
tjt
  • 817
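
For reference, in the XGBoost formulation the per-sample quantities are the gradients $g_i$ and Hessians $h_i$; the single weight of leaf $j$ is computed from their sums over the samples $I_j$ falling into that leaf, rather than by averaging per-sample weights:

$$w_j^{*}=-\frac{\sum_{i\in I_j} g_i}{\sum_{i\in I_j} h_i+\lambda}.$$
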
4
votes
1 answer

How does the regularization term on $w$ help in XGBoost?

The regularization part of the XGBoost objective function contains $\gamma T$ and also $\lambda\,\lVert w\rVert^2$. I understand $\gamma$ is the minimum node-split criterion and $T$ is the number of leaves, and regularizing them leads to a simpler model (not many…
tjt
  • 817
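
For reference, the penalty from the XGBoost paper for a tree with $T$ leaves and leaf weights $w_1,\dots,w_T$ is

$$\Omega(f)=\gamma T+\frac{1}{2}\lambda\sum_{j=1}^{T} w_j^2,$$

so $\gamma$ penalizes adding leaves (a split is kept only if its gain exceeds $\gamma$) while $\lambda$ shrinks the leaf weights, both pushing toward simpler trees.
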
4
votes
1 answer

When does XGBoost start (and stop) to create a new tree?

I am learning XGBoost these days. I read a lot of slides/tutorials. However, most of them focus on how a tree grows internally, e.g., how the split is made. I couldn't figure out how the 'forest' grows in xgboost --- in the end, it is a forest-based…
user152503
  • 1,489
4
votes
0 answers

Questions about xgb.cv and GridSearchCV

I am using the xgboost package in Python for predictive modelling problems. I have the following two questions. First, I've specified a set of values for the hyperparameters and run cross-validation: params1 =…
KevinKim
  • 6,899
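
For orientation, the two interfaces cross-validate different things: xgb.cv evaluates one fixed parameter set across boosting rounds, while GridSearchCV refits the scikit-learn wrapper for every parameter combination. A minimal sketch on a synthetic dataset (all parameter values are illustrative):

    import xgboost as xgb
    from sklearn.datasets import make_regression
    from sklearn.model_selection import GridSearchCV

    X, y = make_regression(n_samples=500, n_features=10, random_state=0)

    # Native interface: one parameter set, a CV curve over boosting rounds.
    dtrain = xgb.DMatrix(X, label=y)
    cv_results = xgb.cv(
        {"max_depth": 3, "eta": 0.1, "objective": "reg:squarederror"},
        dtrain, num_boost_round=200, nfold=5, early_stopping_rounds=10,
    )

    # scikit-learn interface: a grid over parameter combinations.
    grid = GridSearchCV(
        xgb.XGBRegressor(n_estimators=200),
        param_grid={"max_depth": [3, 5], "learning_rate": [0.05, 0.1]},
        cv=5,
    )
    grid.fit(X, y)
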
4
votes
2 answers

Does a smaller learning rate help performance of a Gradient Boosting Regressor?

This page shows how a learning rate of less than 1.0 can improve the performance of a Gradient Boosting Classifier in sklearn. It shows that, over many trees, a smaller learning rate plateaus at a lower…
Fairly Nerdy
  • 1,167
  • 1
  • 12
  • 16
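
A minimal sketch of the usual experiment, holding the number of trees fixed while lowering the learning rate (the synthetic dataset and values are illustrative; smaller rates typically need more trees to reach their plateau):

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for lr in (1.0, 0.1, 0.01):
        model = GradientBoostingRegressor(n_estimators=500, learning_rate=lr, random_state=0)
        model.fit(X_tr, y_tr)
        print(lr, model.score(X_te, y_te))  # compare held-out R^2 across learning rates
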
3
votes
1 answer

How can I do logistic correction for boosting?

Can anyone tell me if logistic correction is the best method to correct the probabilities of a gradient boosting machine? If so, how can I do it?
user22062
  • 1,419
  • 3
  • 17
  • 22
3
votes
1 answer

XGBoost and how to input feature interactions

I have a dataset with 3 features of interest. Within the boosting (and specifically XGBoost) framework, if I want to account for all possible interactions between the features, do these need to be included in the input matrix X? Or would I simply…
pdhami
  • 335