I've been trying to classify multi-label texts with different classification algorithms.
I get fairly good results with a linear-kernel SVM, but poor results with the other kernels. I understand this happens because text classification problems are usually linearly separable.
With Random Forest the results are much worse but still acceptable: some labels are classified correctly, but many are not classified at all.
Finally, I tried Multinomial Naïve Bayes and the results are very bad; in fact, the classifier does not classify anything.
Is this normal? Is there any reason for such poor results with Naïve Bayes?
I binarize the entries using scikit-learn and compute the counts with CountVectorizer and TfidfTransformer. Tokenization and stemming are performed beforehand.
There are about a thousand tags.
As performance measures I use precision, recall and F1.
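
A minimal sketch of a setup like the one described above may help make the question concrete. The toy documents and tags are invented, and the use of MultiLabelBinarizer for the binarization step, OneVsRestClassifier for the multi-label wrapping, and micro-averaged metrics are all assumptions, not necessarily the exact configuration used here.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import precision_recall_fscore_support
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy data, invented for illustration only (the real task has ~1000 tags).
docs = [
    "python pandas dataframe merge",
    "python numpy array broadcasting",
    "java spring boot controller",
    "java maven build error",
    "python flask web server",
    "java servlet web server",
]
tags = [
    ["python", "data"],
    ["python", "data"],
    ["java"],
    ["java"],
    ["python", "web"],
    ["java", "web"],
]

# Assumed binarization step: tag lists -> binary indicator matrix,
# one column per tag.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)

# Counts -> TF-IDF -> one Multinomial Naive Bayes model per tag.
model = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", OneVsRestClassifier(MultinomialNB())),
])
model.fit(docs, Y)

# Evaluated on the training documents only to keep the sketch short;
# a real evaluation would use a held-out split.
Y_pred = model.predict(docs)
precision, recall, f1, _ = precision_recall_fscore_support(
    Y, Y_pred, average="micro", zero_division=0
)
print(mlb.classes_)
print(Y_pred)
print(precision, recall, f1)
```

OneVsRestClassifier is used in this sketch because MultinomialNB does not accept a label indicator matrix on its own; swapping the last pipeline step for another estimator keeps the rest of the setup unchanged.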
– Blunt Jun 26 '15 at 09:53