I'm going through Chapter 8 of "Introduction to Statistical Learning", which introduces decision trees. My question is specific to the three measures used when pruning a classification tree (i.e., classification error rate, Gini Index, and cross-entropy).
With regard to building classification trees, the chapter states that "classification error is not sufficiently sensitive for tree-growing, and in practice, the Gini Index and cross-entropy are preferred".
However, it also states that "Any of these three approaches might be used when pruning the tree, but the classification error rate is preferable if prediction accuracy of the final pruned tree is the goal."
I have two questions about this:
- Given that the classification error rate is not sensitive enough for tree-growing, why should it be preferred over the Gini Index and cross-entropy when pruning, if prediction accuracy is the goal? What advantage does it have over the other two measures?
- If the classification error rate is preferred for pruning, in what instances would we use the Gini Index or cross-entropy instead?
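To make the "sensitivity" issue concrete, here is a small sketch of the three measures (the function names and the example split counts are my own, chosen to mirror the standard two-class illustration). It shows two candidate splits of a parent node with 400 observations per class that have identical misclassification error, yet different Gini and cross-entropy values, because one split produces a pure child node:

```python
import math

def class_error(counts):
    """Classification error rate: 1 - max proportion of any class."""
    n = sum(counts)
    return 1 - max(counts) / n

def gini(counts):
    """Gini Index: sum of p * (1 - p) over classes."""
    n = sum(counts)
    return sum((c / n) * (1 - c / n) for c in counts)

def entropy(counts):
    """Cross-entropy: -sum of p * log2(p) over non-empty classes."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def weighted(measure, children):
    """Average a node measure over child nodes, weighted by node size."""
    n = sum(sum(c) for c in children)
    return sum(sum(c) / n * measure(c) for c in children)

# Two candidate splits of a (400, 400) parent node.
split_a = [(300, 100), (100, 300)]  # both children impure
split_b = [(200, 400), (200, 0)]    # second child is pure

for name, split in [("A", split_a), ("B", split_b)]:
    print(name,
          round(weighted(class_error, split), 4),
          round(weighted(gini, split), 4),
          round(weighted(entropy, split), 4))
```

Both splits have a weighted classification error of 0.25, so error rate cannot distinguish them; the Gini Index and cross-entropy both favor split B for its pure child, which is the sense in which they are "more sensitive" to node purity during growing. My question is why that extra sensitivity stops being an advantage once we move to pruning.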