I'm implementing sentiment analysis on a set of user comments. All comments are about the same object. For now I have decided on three classes: negative, neutral, and positive. I have a test set of 1500 comments with labeled classes. I tried using an SVM for classification on binary feature vectors, in which each element indicates the presence of a particular word in the comment. The best accuracy I got was 60%. Published research reports accuracy of 80% and higher, but it was done on English texts.
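A minimal sketch of the binary word-presence features described above, in plain Python. The tokenizer, the function names, and the three toy comments are assumptions for illustration only, not the actual data or code. The SVM training step is left out, since any SVM implementation can consume the resulting vectors.

```python
import re


def tokenize(comment):
    # Lowercase and extract runs of word characters. In Python 3, \w is
    # Unicode-aware for str patterns, so Cyrillic words stay intact.
    return re.findall(r"\w+", comment.lower())


def build_vocabulary(comments):
    # Assign each distinct word a fixed column index (sorted for reproducibility).
    words = sorted({w for c in comments for w in tokenize(c)})
    return {w: i for i, w in enumerate(words)}


def binary_vector(comment, vocabulary):
    # Element i is 1 if vocabulary word i occurs in the comment, else 0.
    # Words not in the vocabulary are ignored.
    vec = [0] * len(vocabulary)
    for w in tokenize(comment):
        idx = vocabulary.get(w)
        if idx is not None:
            vec[idx] = 1
    return vec


def accuracy(predicted, actual):
    # Fraction of comments whose predicted class equals the labeled class.
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy data standing in for the 1500 labeled comments.
comments = ["Отличный товар", "Ужасный товар", "Обычный товар"]
labels = ["positive", "negative", "neutral"]

vocab = build_vocabulary(comments)
X = [binary_vector(c, vocab) for c in comments]
```

Each row of `X` can then be paired with its entry in `labels` and passed to a classifier. With this representation, a misspelled word becomes a separate feature from its correct form, and so does every inflected form of the same word.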
One of the problems is the large number of spelling and grammar errors in the comments. Also, the Russian language is more complex than English.
I would appreciate advice of any kind. Are there any good tools for analyzing Russian text? Maybe SVM isn't the right choice; are there better algorithms for my case? Or should I choose a more effective feature space?