I use the chi-square test for feature selection, but only when every entry in the contingency table is greater than 5.

Is that the correct approach statistically?

What happens, for example, if a feature appears 1000 times but only in positive examples? Its contingency table then contains a zero, so my rule skips it, yet it seems that it should pass the test. Am I using the test wrong?
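
To make the example concrete, here is a minimal sketch in plain Python. The counts are made up for illustration (I am assuming 5000 positive and 5000 negative examples, which is not from my real data), and the helper names `expected_counts` and `chi_square_statistic` are my own. It computes the observed and expected counts along with the Pearson chi-square statistic, since the usual rule of thumb is normally stated in terms of expected counts rather than observed ones.

```python
# Hypothetical 2x2 contingency table (made-up counts for illustration).
# Rows: feature present / feature absent.
# Columns: positive examples / negative examples.
table = [
    [1000, 0],     # feature present: 1000 positives, 0 negatives
    [4000, 5000],  # feature absent
]


def expected_counts(observed):
    """Expected counts under independence: row_total * col_total / grand_total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Pearson chi-square statistic, without continuity correction."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(table)
statistic = chi_square_statistic(table)
min_observed = min(min(row) for row in table)
min_expected = min(min(row) for row in expected)

print("smallest observed count:", min_observed)
print("smallest expected count:", min_expected)
print("chi-square statistic:", round(statistic, 2))
```

With these assumed counts, the smallest observed entry is 0, so my rule rejects the feature, while the smallest expected count is 500 and the statistic is very large.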

Roy