Suppose I have a multi-level categorical variable like color (say, with 7 levels). Some software libraries only allow numeric matrices to train models, so we need to encode the color variable.
In this case we would use [levels - 1] = 6 columns to do this. But what happens with models like random forests or gradient boosting, where the feature space is also sampled? In this case, it feels like I'm losing information, because the probability of a given 'variable level' being selected will decrease.
I hope someone can help me to understand this better.