-1

The project I'm working on uses a lot of different variables to predict sales. The best model, in terms of mean squared error, is an ensemble model that combines a regression model and a probability tree model.

How do I interpret this model? I know it's the average of both models, and I know how to interpret the models separately, but how do I interpret the ensemble model overall?

Thank you.

  • I can't really see from the question what exactly is unclear to you. As you write, you know how the single models can be interpreted and you know that the ensemble's prediction is their average. – deemel Jun 17 '18 at 09:20
  • Hi, yes, but to interpret the whole ensemble model, do I just interpret each individual model and then say that the whole model is the average? Or would one model be more useful than the other? – user9568744 Jun 17 '18 at 09:27
  • 1
    Interpret the separate models, describe how you trained the weights to average the predictions together, then present some model validation statistics. You're done! – AdamO Jun 17 '18 at 13:23

3 Answers

3

The sole purpose of building an ensemble is to increase predictive performance. There is no point in trying to interpret an ensemble model, as I don't see it answering any question pertaining to the data analysis that you have done. Interpretation is synonymous with finding meaning in something.

The best model, in terms of mean average squared error is an ensemble Model

What you have is a combination of the outputs of two different models. This has been done only to improve performance and get better predictions.

You can ask yourself: is trying to interpret an average of two quantities going to answer any necessary question? Is it going to explain any relationship among variables? I believe not.
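To make "an average of two quantities" concrete, here is a minimal pure-Python sketch of what the ensemble does with two sets of predictions. The numbers and the names `pred_regression` and `pred_tree` are made up for illustration and are not from the question. It also shows one possible way to fit the averaging weight on a validation set (a least-squares closed form), which is only an assumption about how such weights might be trained.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def blend(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two prediction lists; weight_a=0.5 is a plain average."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def best_weight(y_true, pred_a, pred_b):
    """Weight on model A that minimises the blend's MSE on the given data.

    Closed form from writing the blend as b + w * (a - b) and solving the
    one-parameter least-squares problem. The result is unconstrained, so it
    can fall outside [0, 1].
    """
    num = sum((t - b) * (a - b) for t, a, b in zip(y_true, pred_a, pred_b))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    return 0.5 if den == 0 else num / den


# Hypothetical validation-set sales and the two models' predictions.
y_val = [10.0, 12.0, 9.0, 15.0]
pred_regression = [11.0, 11.0, 10.0, 13.0]
pred_tree = [9.0, 13.0, 9.0, 16.0]

w = best_weight(y_val, pred_regression, pred_tree)
mse_equal = mse(y_val, blend(pred_regression, pred_tree, 0.5))
mse_fitted = mse(y_val, blend(pred_regression, pred_tree, w))

print("MSE regression:", mse(y_val, pred_regression))
print("MSE tree:      ", mse(y_val, pred_tree))
print("MSE 50/50:     ", mse_equal)
print("fitted weight: ", w, "MSE:", mse_fitted)
```

The weight and the validation errors are the only ensemble-level quantities to report; nothing in the blend itself says how any input variable relates to sales.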

naive
  • Hi, so it wouldn't be necessary to interpret each model used within the ensemble model? – user9568744 Jun 18 '18 at 23:09
  • If you can, then you must interpret the separate models. Necessity depends on what you are trying to accomplish. To quote @AdamO's comment on the question: interpret the separate models, describe how you trained the weights to average the predictions together, then present some model validation statistics. That's all there is. – naive Jun 19 '18 at 05:15
1

If you want something interpretable, then you shouldn't use an ensemble model.

As others have said, and as you know, it's the average of things. But those things may not be compatible in terms that are interpretable.
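As a rough illustration of that incompatibility, here is a sketch assuming scikit-learn is available and using synthetic data in place of the sales data. The linear model is summarised by coefficients (change in prediction per unit of a feature), while the tree is summarised by feature importances (share of impurity reduction). Averaging the predictions is well defined, but there is no obvious way to average those two summaries.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real data (an assumption for illustration).
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

# VotingRegressor averages the predictions of its fitted estimators.
ensemble = VotingRegressor([
    ("linear", LinearRegression()),
    ("tree", DecisionTreeRegressor(max_depth=3, random_state=0)),
])
ensemble.fit(X, y)

linear = ensemble.named_estimators_["linear"]
tree = ensemble.named_estimators_["tree"]

coefficients = linear.coef_              # units: change in y per unit of x
importances = tree.feature_importances_  # unitless shares that sum to 1

ensemble_pred = ensemble.predict(X)
manual_average = (linear.predict(X) + tree.predict(X)) / 2

print("linear coefficients:", np.round(coefficients, 2))
print("tree importances:   ", np.round(importances, 2))
print("ensemble == average:", np.allclose(ensemble_pred, manual_average))
```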

Peter Flom
0

Explainability is a hot research area. Recently, newer tools have been developed to explain tree ensemble models using a handful of human-understandable rules. Here are a few options for explaining tree ensemble models that you can try:

You can use TE2Rules (Tree Ensembles to Rules) to extract human-understandable rules that explain a scikit-learn tree ensemble (like GradientBoostingClassifier). It provides levers to control interpretability, fidelity and run-time budget when extracting explanations. Rules extracted by TE2Rules are guaranteed to closely approximate the tree ensemble, because they consider the joint interactions of multiple trees in the ensemble.

Other alternatives are SkopeRules, which is part of scikit-learn-contrib, and RuleFit. SkopeRules extracts rules from individual trees in the ensemble and keeps the rules that have high precision/recall across the whole ensemble. This is often quick, but may not represent the ensemble well enough.

For developers who work in R, the inTrees package is a good option.
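To give a feel for what "rules from individual trees" look like, here is a small sketch that uses only scikit-learn's own `export_text` on one tree of a `GradientBoostingClassifier`. This is not TE2Rules, SkopeRules or inTrees, and it does no filtering or merging of rules. The data is synthetic and the feature names `x0` to `x3` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import export_text

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=2, n_redundant=0, random_state=0
)
feature_names = [f"x{i}" for i in range(4)]  # hypothetical names

model = GradientBoostingClassifier(n_estimators=5, max_depth=2, random_state=0)
model.fit(X, y)

# For binary problems, estimators_ has shape (n_estimators, 1); each entry
# is a DecisionTreeRegressor fitted at one boosting stage.
first_tree = model.estimators_[0, 0]

# Print that single tree as nested if/else threshold rules.
rules = export_text(first_tree, feature_names=feature_names)
print(rules)
```

A single tree's rules describe only one boosting stage, which is why tools that look at the trees jointly can represent the ensemble more faithfully.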

References:

TE2Rules: code at https://github.com/groshanlal/TE2Rules, documentation at https://te2rules.readthedocs.io/en/latest/

SkopeRules: code at https://github.com/scikit-learn-contrib/skope-rules

inTrees: https://cran.r-project.org/web/packages/inTrees/index.html

Disclosure: I'm one of the core developers of TE2Rules.

  • 1
    Hi, welcome to CV! Please add a reference for your links whenever possible, in case they die in the future. Thanks! – Antoine Jul 25 '22 at 22:03