p-values for feature selection

Asked Nov 19 '15 at 08:16

Active Feb 12 '18 at 22:18

Viewed 722 times

I am doing a multiple regression analysis in which I want to eliminate some of the insignificant features. Most machine learning books use subset selection, shrinkage methods, or PCA to reduce the number of features. Why are p-values not commonly used for feature selection?

edited Feb 12 '18 at 22:18

kjetil b halvorsen

asked Nov 19 '15 at 08:16

Siddhesh

This is why. (Whether you do it in a stepwise manner or all at once doesn't change the fundamental problem.) – Stephan Kolassa Nov 19 '15 at 08:20
@Stephan: I read the answer. Does it imply that p-values should never be used? – Siddhesh Nov 19 '15 at 12:40
No. You can use and interpret p values if you use them correctly. This is a good place to start understanding them. In your specific case, if you look at multiple models (by selecting features), the p values will not be uniformly distributed under the null hypothesis any more, so you either need to find their new distribution (e.g., through simulation) or interpret them differently. – Stephan Kolassa Nov 19 '15 at 12:45
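
To illustrate the point about the null distribution changing after selection, here is a minimal simulation sketch. It is not taken from the linked answers; it rests on simplifying assumptions: every feature is pure noise, each feature is tested on its own with a marginal correlation test instead of a full multiple regression, and the p-value uses the Fisher z approximation. The names `correlation_p_value` and `simulate` are made up for this example.

```python
import math
import random
from statistics import NormalDist


def correlation_p_value(x, y):
    """Two-sided p-value for H0: zero correlation (Fisher z approximation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # Under H0, atanh(r) * sqrt(n - 3) is approximately standard normal.
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_sims=1000, n_obs=30, n_features=10, alpha=0.05, seed=0):
    """Return rejection rates under the null for a fixed and a selected feature."""
    rng = random.Random(seed)
    fixed_hits = 0      # feature chosen before looking at the data
    selected_hits = 0   # feature chosen because it has the smallest p-value
    for _ in range(n_sims):
        # The response is pure noise, so no feature is truly related to it.
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values = []
        for _ in range(n_features):
            x = [rng.gauss(0, 1) for _ in range(n_obs)]
            p_values.append(correlation_p_value(x, y))
        fixed_hits += p_values[0] < alpha
        selected_hits += min(p_values) < alpha
    return fixed_hits / n_sims, selected_hits / n_sims


fixed_rate, selected_rate = simulate()
print(f"rejection rate, feature fixed in advance: {fixed_rate:.3f}")
print(f"rejection rate, best of 10 features:      {selected_rate:.3f}")
```

If the p-value of the selected feature were still uniform under the null, both rates would sit near 0.05. With 10 independent noise features, the smallest p-value should fall below 0.05 in roughly 1 - 0.95^10 ≈ 40% of runs, which is why the p-value of a feature picked for its small p-value cannot be read at face value.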
