I am looking at the post "How to find the perplexity of a corpus". I understand the post as a whole, but I have a problem with the sentence probability.
In a unigram model, the probability of a sentence $s$ appearing in a corpus is given by $p(s)=\prod_{i=1}^{n} p(w_i)$, where $p(w_i)$ is the probability that the word $w_i$ occurs.
For a large corpus, even if I only compute the probability of a single sentence by taking the product $\prod$ over the words in that sentence, I still get a probability of 0, which causes an error in the subsequent $\log_2$ calculation. Can someone help me?
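
To make the problem concrete, here is a minimal sketch of the situation, assuming a toy unigram table. The names `unigram_probs`, `sentence_prob`, `sentence_log2_prob` and `perplexity` are hypothetical and not taken from the linked post. It shows that multiplying many small probabilities can underflow to 0.0 in floating point, whereas summing the per-word $\log_2$ probabilities stays finite, since $\log_2 \prod_i p(w_i) = \sum_i \log_2 p(w_i)$.

```python
import math


def sentence_prob(words, unigram_probs):
    """Direct product of unigram probabilities (prone to underflow)."""
    p = 1.0
    for w in words:
        p *= unigram_probs[w]
    return p


def sentence_log2_prob(words, unigram_probs):
    """Sum of log2 probabilities; equals log2 of the product above."""
    return sum(math.log2(unigram_probs[w]) for w in words)


def perplexity(words, unigram_probs):
    """Perplexity as 2 ** (-(1/n) * log2 p(s)), computed in log space."""
    return 2 ** (-sentence_log2_prob(words, unigram_probs) / len(words))


# Toy probabilities, chosen only for illustration (assumption).
unigram_probs = {"the": 1e-2, "cat": 1e-4, "sat": 1e-4}
long_sentence = ["the", "cat", "sat"] * 50  # 150 tokens

print(sentence_prob(long_sentence, unigram_probs))       # 0.0 (underflow)
print(sentence_log2_prob(long_sentence, unigram_probs))  # finite, negative
print(perplexity(long_sentence, unigram_probs))
```

Note that this only addresses underflow. If some word has $p(w_i)=0$ because it never occurs in the training data, `math.log2` still fails, and that case would need some form of smoothing instead.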