
I've trained a logistic regression over my data. I checked feature importance:

from matplotlib import pyplot as plt

features = X_train.columns
# coef_ has shape (1, n_features) for a binary problem, so take row 0
importance = Model.best_estimator_.coef_[0]

plt.bar(features, importance)
plt.title("Feature Importance according to logistic regression")
plt.ylabel("Importance")
plt.show()

[Image: bar chart titled "Feature Importance according to logistic regression"]

The same values can also be read directly from the fitted model:

Model.best_estimator_.coef_[0]

array([1.09365005e+00, 7.50386093e-01, 4.29639078e-04, 5.99721148e-01])

My question is: what do these results mean? What are those numbers? I initially thought they were percentages, but 1.09365005e+00 would mean more than 100% importance, so they can't be percentages.

CORy
  • 1
    Aren't these the estimated coefficients? In what sense do you conceive of them as being "feature importance"? – whuber Jan 02 '23 at 23:23
  • 1
    These are estimated coefficients. – gunes Jan 02 '23 at 23:24
  • 1
    Because these are coefficients, their interpretation as importance is at best nuanced and at worst dubious, depending on who you ask & what your data represent. One source of caution is that the coefficients are expressed as units of the target per units of the independent variable, so comparing coefficients means comparing incomparable units of measurement. Re-expressing an IV in different units (e.g. length in cm vs km) will likewise change the coefficient value. For more detail, see the "Race of Variables" section of Gary King, "How Not to Lie with Statistics" (1985). – Sycorax Jan 02 '23 at 23:31
  • More information in this search: https://stats.stackexchange.com/search?q=logistic+regression+importance+answers%3A1+score%3A3 and https://stats.stackexchange.com/search?tab=votes&q=interpret%20coefficient%20logistic%20answers%3a1 – Sycorax Jan 02 '23 at 23:33
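
The point in the comments about units can be checked numerically. What follows is only a minimal sketch under stated assumptions: the data are synthetic (not the asker's `X_train`), `fit_logistic_1d` is a hypothetical helper that fits a one-feature logistic regression by plain Newton's method with no regularization, and it is not scikit-learn's solver. It fits the same outcome twice, once with the feature in cm and once in km.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_1d(x, y, iters=25):
    """Hypothetical helper: fit P(y=1|x) = sigmoid(b0 + b1*x) by Newton's method.

    No regularization is applied, so this is plain maximum likelihood.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            # Gradient of the log-likelihood
            g0 += yi - p
            g1 += (yi - p) * xi
            # Hessian of the negative log-likelihood
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic data (an assumption for illustration): a length in cm and a binary outcome
random.seed(0)
length_cm = [random.uniform(100.0, 200.0) for _ in range(200)]
y = [1 if random.random() < sigmoid(0.05 * (v - 150.0)) else 0 for v in length_cm]

# The same feature re-expressed in km (1 km = 100,000 cm)
length_km = [v / 100_000.0 for v in length_cm]

b0_cm, b1_cm = fit_logistic_1d(length_cm, y)
b0_km, b1_km = fit_logistic_1d(length_km, y)

# A coefficient is the change in log-odds per one unit of the feature,
# so exp(coefficient) is the odds ratio per unit.
odds_ratio_per_cm = math.exp(b1_cm)

print("coefficient per cm:", b1_cm)
print("coefficient per km:", b1_km)
print("ratio km/cm:", b1_km / b1_cm)
print("odds ratio per cm:", odds_ratio_per_cm)
```

The two fits describe the same model and give the same predicted probabilities, yet the coefficient is about 100,000 times larger when the feature is in km, so the raw magnitude of a coefficient says nothing by itself about how much a feature matters. Note that scikit-learn's `LogisticRegression` applies an L2 penalty by default, so with it the rescaling would not be exactly proportional.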

0 Answers