According to Wikipedia, CHAID is popular for modeling responses in direct marketing (and I have seen it come up several times in this context). Does anyone know what makes it suitable or preferred for this type of analysis? What benefit does it have compared to other similar approaches?
-
This is just a guess, but couldn't it be used because marketers do not only want to model responses accurately but also want to understand what's going on and derive insights from it? Say you train a neural network: it may model your data very well, but you won't learn much about your users. With a decision tree the performance of your model might be lower, but it is easier to understand what is really going on and to take action based on that (for instance, if a node shows that revenue drops 50% when something happens, you can try to tackle the cause). – Pholochtairze Jan 27 '16 at 13:17
-
My view is that the DM industry relies much more heavily on CART than on CHAID. CHAID is an older method which Breiman et al. pretty much blew away with their 1980s book on CART. Regardless, DM was a hot field around the time Breiman's book came out, with huge employment growth rates not much lower than we're seeing in the digital/data science space today. Things like one-to-one marketing, logistic regression and CART were "fad" solutions, or hammers for which everything was a nail. Once growth rates subsided in the later 90s, DM kind of ossified and closed in on itself, changing little. – user78229 Jan 27 '16 at 13:35
-
I would be interested in knowing too. My guess is that it is because CHAID does multiway (non-binary) splits, which I believe are easier to interpret than standard binary splits. – seanv507 Jan 27 '16 at 14:36
-
Very simple answer: CHAID is highly interpretable compared with logistic regression, random forests, or other black-box machine learning algorithms. No one in marketing likes black-box methods, as they simply do not aid decision making or strategy development. – forecaster Apr 27 '16 at 04:11
2 Answers
I think there is no inherent reason why CHAID is more suitable for direct marketing than any other constant-fit tree (e.g., CART/rpart, QUEST, CTree, etc.). I think it was adopted earlier in that community, is covered in textbooks, and, importantly, is available in SPSS. But I don't think that this should stop you from considering other tree methods or non-tree methods.
An aspect that hasn't been covered is that (at least in the SPSS Modeler implementation) CHAID trains non-binary trees (the data can be split into more than two groups in any step) which leads to broader trees with fewer levels compared to C&RT, C5.0 or something similar, which at least in some contexts is better for explainability. I use CHAID in a different context than DM and have found that the explanation for how the model assigns a classification comes out in a way that is more intuitive to explain to non-technical people; I can imagine that this is a valuable attribute in a marketing context.
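To make the point about non-binary splits concrete, here is a minimal sketch in plain Python. It is not the CHAID algorithm itself and not the SPSS Modeler implementation: it skips category merging, p-values and Bonferroni adjustment, and simply ranks predictors by the raw Pearson chi-square statistic. The data, the column names (`channel`, `region`, `responded`) and the helper functions are all made up for illustration.

```python
from collections import defaultdict


def chi_square_statistic(rows, predictor, target):
    """Pearson chi-square statistic of the predictor x target contingency table."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        counts[row[predictor]][row[target]] += 1

    categories = list(counts)
    outcomes = sorted({row[target] for row in rows})
    n = len(rows)
    row_totals = {c: sum(counts[c].values()) for c in categories}
    col_totals = {o: sum(counts[c][o] for c in categories) for o in outcomes}

    stat = 0.0
    for c in categories:
        for o in outcomes:
            expected = row_totals[c] * col_totals[o] / n
            if expected > 0:
                stat += (counts[c][o] - expected) ** 2 / expected
    return stat


def multiway_split(rows, predictor):
    """Create one child node per category of the predictor (a non-binary split)."""
    children = defaultdict(list)
    for row in rows:
        children[row[predictor]].append(row)
    return dict(children)


# Hypothetical toy data: 10 contacts per channel with different response counts.
rows = []
for channel, responders in [("email", 8), ("mail", 2), ("phone", 5), ("sms", 1)]:
    for i in range(10):
        rows.append({
            "channel": channel,
            "region": "north" if i % 2 == 0 else "south",
            "responded": 1 if i < responders else 0,
        })

# Simplification: rank predictors by the raw statistic instead of adjusted p-values.
scores = {p: chi_square_statistic(rows, p, "responded") for p in ("channel", "region")}
best = max(scores, key=scores.get)
children = multiway_split(rows, best)

for category, members in sorted(children.items()):
    rate = sum(r["responded"] for r in members) / len(members)
    print(f"{best} = {category}: n = {len(members)}, response rate = {rate:.2f}")
```

In this toy data a single split on `channel` yields four leaves, one per channel, each of which can be read off directly as a segment with its own response rate. A strictly binary tree would need at least two levels to separate the same four groups, which is the "broader trees with fewer levels" effect described above.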