I have a pricing table for the cost to ship a package. The cost is the output, and the factors that determine the price are the weight, the zone (distance travelled), and the service base (ground, express, overnight, etc.). I want to be able to say, for example, that the service base is responsible for 60% of the price, weight for 30%, and zone for 10%. How can I calculate these percentages?

1 Answer

For regression you could try 'permutation feature importance'. Feature importance shows which variables were more or less important for a model's predictions. The summary of a fitted regression model should also report metrics for each explanatory variable, among others its statistical significance as indicated by the p-value. However, none of these methods is as straightforward as your question suggests you would like.
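As a rough illustration of permutation feature importance, here is a minimal sketch using scikit-learn. The data is synthetic and the price formula is made up purely for demonstration (it is not your pricing table); the column names `weight`, `zone` and `service` are assumptions. Note that rescaling the importances so they sum to 100% is a convention, not something the method guarantees, and the result depends on the model and the data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-in for a pricing table (hypothetical values).
weight = rng.uniform(1, 50, n)     # package weight
zone = rng.integers(1, 9, n)       # zones 1..8
service = rng.integers(0, 3, n)    # 0 = ground, 1 = express, 2 = overnight

# Made-up price formula, used only so the example has a known structure.
price = 5 + 20 * service + 0.5 * weight + 0.8 * zone

X = np.column_stack([weight, zone, service])
names = ["weight", "zone", "service"]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, price)

# Shuffle one column at a time and measure how much the model score drops.
result = permutation_importance(model, X, price, n_repeats=10, random_state=0)

# Clip tiny negative values (noise) and rescale so the shares sum to 1.
raw = np.clip(result.importances_mean, 0, None)
shares = raw / raw.sum()

for name, share in zip(names, shares):
    print(f"{name}: {share:.1%}")
```

For simplicity the importances are computed on the same data the model was fitted on; with a real table you would normally evaluate on held-out rows.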

You may also want to look at: "Why does Random Forest variable importance not sum to 100%?" and https://machinelearningmastery.com/feature-selection-for-regression-data/

If you have many different features, you could also look into 'factor analysis': the amount of variance explained by each factor, and which of the original features are 'covered' by each factor (see 'bi-plot').
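To give an idea of what the factor analysis route looks like, here is a small sketch with scikit-learn's `FactorAnalysis`. The six features and the two latent factors are synthetic and invented for the example. On standardized data, the sum of squared loadings per factor divided by the number of features is a common way to express the share of variance each factor describes; this is one convention among several, and unrotated factors may not line up with the groups you have in mind.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Two hidden factors drive six observed features (hypothetical setup).
latent = rng.normal(size=(n, 2))
true_loadings = np.array([
    [0.9, 0.0], [0.8, 0.0], [0.7, 0.0],   # features 0-2 follow factor 1
    [0.0, 0.9], [0.0, 0.8], [0.0, 0.7],   # features 3-5 follow factor 2
])
X = latent @ true_loadings.T + 0.3 * rng.normal(size=(n, 6))

# Standardize so every feature has unit variance.
Xs = StandardScaler().fit_transform(X)

fa = FactorAnalysis(n_components=2, random_state=0).fit(Xs)

# Rows are features, columns are factors.
loadings = fa.components_.T

# Variance described by each factor, as a share of total variance.
var_per_factor = (loadings ** 2).sum(axis=0)
proportion = var_per_factor / Xs.shape[1]

for i, p in enumerate(proportion, start=1):
    print(f"factor {i}: {p:.1%} of total variance")
```

The `loadings` matrix is what a bi-plot would draw as arrows: it shows which original features each factor covers.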

Hope this helps.