In text mining, suppose we've computed n-gram counts for, say, $n = 1, \ldots, 4$. Is there a principled way to combine them, other than just concatenating the tf-idf matrices for each $n$? (With linear kernels, concatenation is equivalent to an unweighted sum of the kernel matrices constructed for each $n$.) For example, Google's n-gram viewer:
http://books.google.com/ngrams/datasets
shows that they computed counts for everything from unigrams up to 5-grams, but they don't say how they combine them.
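
To make the equivalence I mention concrete, here is a minimal sketch in plain Python. It uses raw n-gram counts rather than tf-idf weights to keep it short (the identity holds for any per-$n$ feature vectors), and the helper names `ngram_counts`, `linear_kernel`, and `concat_features` are hypothetical, not taken from any library. Scaling the block for each $n$ by $\sqrt{w_n}$ before concatenating turns the unweighted sum into a weighted sum of kernels, which is the kind of combination I'm asking how to choose.

```python
from collections import Counter
import math

def ngram_counts(tokens, n):
    """Count contiguous n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def linear_kernel(a, b):
    """Dot product of two sparse feature vectors stored as dicts."""
    return sum(v * b[k] for k, v in a.items() if k in b)

def concat_features(tokens, orders, weights=None):
    """Concatenate the per-n blocks, scaling block n by sqrt(weights[n]).

    With weights=None every block has weight 1 (plain concatenation).
    """
    weights = weights or {n: 1.0 for n in orders}
    feats = {}
    for n in orders:
        scale = math.sqrt(weights[n])
        for gram, count in ngram_counts(tokens, n).items():
            # Tuples of different lengths never collide, so this is a
            # true concatenation of the per-n feature spaces.
            feats[gram] = scale * count
    return feats

# Two toy documents (illustrative data only).
x = "the cat sat on the mat".split()
y = "the cat lay on the mat".split()
orders = range(1, 5)  # n = 1..4

# Kernel on concatenated features vs. sum of per-n kernels.
k_concat = linear_kernel(concat_features(x, orders), concat_features(y, orders))
k_sum = sum(linear_kernel(ngram_counts(x, n), ngram_counts(y, n)) for n in orders)

# Weighted variant: arbitrary example weights, not a recommendation.
w = {1: 1.0, 2: 2.0, 3: 0.5, 4: 0.25}
k_concat_w = linear_kernel(concat_features(x, orders, w), concat_features(y, orders, w))
k_sum_w = sum(w[n] * linear_kernel(ngram_counts(x, n), ngram_counts(y, n)) for n in orders)

print(math.isclose(k_concat, k_sum), math.isclose(k_concat_w, k_sum_w))
```

Both comparisons should agree up to floating-point error, so the open question is really how to pick the weights $w_n$ (or whether some other combination is better justified).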