
In the formula for the perplexity of a corpus, why is it normalized by the total number of words?

Why shouldn't it be normalized by the number of sentences instead? If the number of sentences is used for normalization, is the computation still valid?

$$\text{Perplexity}(C) = \sqrt[N]{\frac{1}{P(S_1, S_2, \ldots, S_n)}}$$

where $N$ is the number of words in the corpus and $S_1, \ldots, S_n$ are its sentences.
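To make the two options concrete, here is a minimal sketch that computes the same quantity with both normalizers. The log-probabilities and word counts are made-up numbers, not output from any real model, and the names `perplexity`, `sentence_log_probs`, and `sentence_word_counts` are hypothetical. It also assumes the sentences are scored independently, so the corpus log-probability is the sum of the sentence log-probabilities.

```python
import math

# Hypothetical natural-log probabilities a language model might assign
# to three sentences, together with each sentence's word count.
sentence_log_probs = [-12.0, -30.0, -6.0]
sentence_word_counts = [4, 10, 2]


def perplexity(total_log_prob, normalizer):
    """Return the normalizer-th root of 1/P, computed in log space."""
    return math.exp(-total_log_prob / normalizer)


# log P(S1, ..., Sn), assuming the sentences are independent.
total_log_prob = sum(sentence_log_probs)
num_words = sum(sentence_word_counts)    # N in the formula above
num_sentences = len(sentence_log_probs)  # n, the alternative normalizer

per_word = perplexity(total_log_prob, num_words)
per_sentence = perplexity(total_log_prob, num_sentences)

print(f"normalized by words:     {per_word:.4f}")
print(f"normalized by sentences: {per_sentence:.4f}")
```

Both values can be computed, and in this sketch the per-sentence value equals the per-word value raised to the average sentence length (N / n). The per-sentence number therefore changes with how long the sentences are, which seems to be the core of what I am asking about.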

Reference:

https://stats.stackexchange.com/a/143638/287956
