What's the correct way to deal with categorical variables in packages like sklearn's RF and xgboost?
Are there any downsides to treating the variables as continuous, e.g. encoding class A as 1, class B as 2, class C as 3?
- convert the data to a categorical type?
- make it sparse (e.g. one-hot encode it)?
– jxieeducation Dec 03 '15 at 20:47
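To make the options in the comment above concrete, here is a minimal, library-free sketch. It does not show how sklearn or xgboost handle categories internally; the toy column and every name in it (`column`, `codes`, `one_hot_nonzero`, `threshold_splits`) are assumptions made up for illustration.

```python
# Hypothetical toy column; the values are made up for illustration.
column = ["A", "B", "C", "A"]
levels = sorted(set(column))  # ['A', 'B', 'C']

# Integer encoding as in the question: A -> 1, B -> 2, C -> 3.
codes = {level: i + 1 for i, level in enumerate(levels)}
ordinal = [codes[value] for value in column]

# Sparse one-hot encoding: store only the (row, column) positions of the ones.
one_hot_nonzero = [(row, codes[value] - 1) for row, value in enumerate(column)]

# A single threshold split "code <= t" can only cut the ordered codes in two,
# so the groupings it can reach depend on the arbitrary order of the codes.
threshold_splits = [
    (
        {level for level in levels if codes[level] <= t},
        {level for level in levels if codes[level] > t},
    )
    for t in sorted(codes.values())[:-1]
]

print(ordinal)
print(one_hot_nonzero)
print(threshold_splits)
```

With this encoding, no single threshold separates B from both A and C, whereas a dedicated one-hot column for B would. A tree can still get there with two successive splits, so this is a possible cost of the integer encoding rather than a hard limitation.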