Looking at the classic statistical approaches to natural language processing (tagging, parsing, etc.), I see that they are mostly generative models: n-gram models, Naive Bayes classifiers, hidden Markov models, probabilistic context-free grammars, IBM machine translation alignment models, and so on.
In contrast, more recent models tend to be discriminative: SVMs, conditional log-linear models, conditional random fields, etc.
Is there any reason behind that trend?
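
To make the distinction I have in mind concrete (using the standard textbook definitions, not anything specific to a particular paper): a generative model fits the joint distribution and classifies through Bayes' rule, while a discriminative model fits the conditional distribution directly, e.g.

$$
\text{generative:}\quad \hat{y} = \arg\max_y \; p(y)\,p(x \mid y)
\qquad\text{vs.}\qquad
\text{discriminative:}\quad p(y \mid x) \propto \exp\!\big(w^\top f(x, y)\big).
$$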