I was reviewing an ML notebook in which part of the EDA looks at the cardinality of the categorical variables. In the notebook as prepared there was nothing unusual, but what if an attribute has a very high cardinality? For example, if a dataset of 10,000 rows has an attribute with a cardinality of 5,000, that is 50% of the size of the data. Searching Stack Overflow, people propose different solutions for vectorising it:
https://stackoverflow.com/questions/33043222/features-with-high-cardinality-how-to-vectorize-them
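To make the check concrete, here is a minimal sketch of how I'm measuring cardinality (the DataFrame and the `user_id` / `country` columns are hypothetical, just to reproduce the 50% case I described):

```python
import pandas as pd

# Hypothetical 10,000-row dataset with one ID-like, high-cardinality column
df = pd.DataFrame({
    "user_id": [f"u{i % 5000}" for i in range(10_000)],  # ~5,000 unique values
    "country": ["ES", "FR", "DE", "IT"] * 2_500,         # 4 unique values
})

# Cardinality ratio per categorical/object column: n_unique / n_rows
cat_cols = df.select_dtypes(include=["object", "category"]).columns
ratios = df[cat_cols].nunique() / len(df)
print(ratios.sort_values(ascending=False))
# user_id    0.5000
# country    0.0004
```

In this example `user_id` is the kind of attribute I am asking about: its ratio of 0.5 means half of its values are unique.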
But in my opinion such an attribute should be discarded, because it is unlikely to be useful for predicting anything. Is this a false assumption? Is there a rule of thumb?