When you know your features could interact with each other, would you choose XGBoost or an NN-based model? My friend is training an XGBoost model, and he manually adds interaction features (X1 * X2) as new features even though XGBoost can learn interactions by itself. He said that adding interactions we are aware of a priori can still sometimes help the model learn. Is that true? And why not just use a fully connected NN model? Wouldn't that do the same thing automatically, without the need to manually add all the interaction features?
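
For concreteness, here is a minimal sketch of what he does (the column names are made up, and I'm assuming the scikit-learn-style API from the xgboost Python package):

    import numpy as np
    import pandas as pd
    import xgboost as xgb

    # Toy data; x1 and x2 are hypothetical feature names.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({"x1": rng.normal(size=1000),
                       "x2": rng.normal(size=1000)})
    y = (df["x1"] * df["x2"] > 0).astype(int)  # target driven by an interaction

    # Manually add the known interaction as an extra column before training,
    # even though boosted trees could in principle learn it from x1 and x2.
    df["x1_times_x2"] = df["x1"] * df["x2"]

    model = xgb.XGBClassifier(max_depth=2, n_estimators=100)
    model.fit(df, y)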

YY_H
  • Which base learner does the xgboost use in this case? – Sycorax May 19 '22 at 14:42
  • @Sycorax, just CART. – YY_H May 19 '22 at 14:53
  • Related, not a duplicate: https://stats.stackexchange.com/questions/574017/xgboost-and-how-to-input-feature-interactions – Sycorax May 19 '22 at 14:57
  • Is it fair to say that the tree depth limits the feature interactions it can learn? – YY_H May 19 '22 at 15:35
  • Yes. To see why, consider a 2-dimensional problem where the decision boundary is diagonal. It takes a number of axis-aligned splits to approximate a diagonal to a given level of precision, but if we can pre-compute the diagonal, it takes only one split (see the sketch after these comments). Here's another example in a slightly different setting, but the core idea is the same: https://stats.stackexchange.com/questions/164048/can-a-random-forest-be-used-for-feature-selection-in-multiple-linear-regression/164068#164068 – Sycorax May 19 '22 at 15:36
  • Tree depth limits a single estimator; with boosting, that can be overcome by running more rounds and building a more complex model. As @Sycorax states, a precomputed interaction lets the model capture it with a single split. That said, if you know the interaction beforehand, you should try the model both with and without it and go with whichever performs best. I typically see adding interactions to boosted trees do very little for the bottom line. – Tylerr May 19 '22 at 15:41
  • Is manually adding interaction features a model-independent feature-engineering technique/custom? – YY_H May 20 '22 at 03:32
  • Not sure what you mean by 'custom', but it is feature engineering. And to answer the piece about NNs: yes, NNs can learn feature interactions, but I would say it is similar to trees in that if you know an interaction should exist, you can try adding it and see what happens. In general, though, for me the choice between an NN-type method and trees comes down to the type of problem/data, not to the interactions. If you have standard tabular data, go with trees; if you have NLP or computer vision, go with NNs. Just a rule of thumb, though! – Tylerr May 20 '22 at 20:31
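
To make Sycorax's diagonal-boundary point concrete, here is a rough sketch. It uses scikit-learn's CART trees as a stand-in for XGBoost's CART base learners (an assumption for illustration, not the asker's actual setup): a depth-4 tree on the raw features can only approximate the diagonal boundary with axis-aligned splits, while a depth-1 stump on the precomputed feature x1 - x2 recovers it in one split.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(5000, 2))
    y = (X[:, 0] > X[:, 1]).astype(int)  # diagonal decision boundary x1 = x2

    # Axis-aligned splits on the raw features: even a depth-4 tree only
    # approximates the diagonal.
    raw_tree = DecisionTreeClassifier(max_depth=4).fit(X, y)
    print("raw features, depth 4:", raw_tree.score(X, y))

    # A single split on the precomputed feature x1 - x2 separates the
    # classes exactly.
    diag = (X[:, 0] - X[:, 1]).reshape(-1, 1)
    stump = DecisionTreeClassifier(max_depth=1).fit(diag, y)
    print("precomputed x1 - x2, depth 1:", stump.score(X, y))

On the training data the stump typically separates the classes perfectly, while the axis-aligned tree does not; boosting can stack many such splits over rounds, which is why, as Tylerr notes, hand-added interactions often add little for boosted trees.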

0 Answers