
I've been trying to wrap my head around this, but I couldn't find a clear answer: is there a difference between N-grams and N-gram models?

From what I understand, N-grams are just the result of splitting a sentence into contiguous sequences of N words (each word together with the N-1 words before it), and they have nothing to do with probabilities. So I would have something like this:

The boxer loves Mary. 

Unigrams = "The", "boxer", "loves", "Mary"

Bigrams = "The boxer", "boxer loves","loves Mary"

Trigrams, etc.
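To make it concrete, here's a minimal Python sketch of what I mean by "just splitting" (the `ngrams` helper is my own name, not from any particular library):

```python
def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens, with no probabilities involved."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "The boxer loves Mary".split()

print(ngrams(tokens, 1))  # [('The',), ('boxer',), ('loves',), ('Mary',)]
print(ngrams(tokens, 2))  # [('The', 'boxer'), ('boxer', 'loves'), ('loves', 'Mary')]
print(ngrams(tokens, 3))  # [('The', 'boxer', 'loves'), ('boxer', 'loves', 'Mary')]
```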

What confuses me is that I often see N-grams used together with probabilities (i.e., N-gram models), and people seem to use the two terms interchangeably.
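For example, this is roughly what I understand an N-gram *model* to add on top of the plain N-grams: counting them over a corpus and turning the counts into conditional probabilities. A rough sketch with a made-up toy corpus and plain maximum-likelihood estimates (no smoothing):

```python
from collections import Counter

# Toy corpus, made up purely for illustration.
corpus = [
    "the boxer loves mary".split(),
    "the boxer loves boxing".split(),
    "mary loves the boxer".split(),
]

bigram_counts = Counter()
history_counts = Counter()
for sentence in corpus:
    for prev, word in zip(sentence, sentence[1:]):
        bigram_counts[(prev, word)] += 1  # count of the pair (prev, word)
        history_counts[prev] += 1         # count of prev as a bigram history

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev)."""
    return bigram_counts[(prev, word)] / history_counts[prev]

print(bigram_prob("boxer", "loves"))  # 1.0: whenever "boxer" has a next word, it is "loves"
print(bigram_prob("loves", "mary"))   # 0.333...: "loves" is followed by "mary" 1 time out of 3
```

Is that the distinction, i.e. the "model" part is just the probabilities layered on top of the raw N-grams?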

Let me know, thanks.

robjob27
    When analyzing large corpora you will observe that some two-unit sequences occur with higher probability than others, like the chunks "as" and "far" in N-grams such as "as far as I know", "as far as I can see", etc. Having said that, we can use this data in NLP and computational linguistics. – Andrew Ravus Mar 16 '16 at 17:00
    You're right that N-grams don't have anything fundamental to do with probability. But they're so often used in statistical work that people speak loosely. – Colin Fine Mar 16 '16 at 18:29

0 Answers