I'm trying out some different generative models, and I want to evaluate them by holding out some documents, training the models on the remaining documents, and then calculating the perplexity on the held-out documents.
However, I'm not sure what to do when a word in the held-out documents does not appear in the training documents. The model will give that word a probability of zero. Is there a standard/good procedure for dealing with this?
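To make the problem concrete, here is a minimal sketch (using a toy maximum-likelihood unigram model, just for illustration) showing how a single unseen word drives the held-out perplexity to infinity:

```python
import math
from collections import Counter

# Toy training and held-out data (hypothetical example):
# the word "jumped" never appears in the training documents.
train_docs = [["the", "cat", "sat"], ["the", "dog", "ran"]]
held_out = ["the", "cat", "jumped"]

counts = Counter(w for doc in train_docs for w in doc)
total = sum(counts.values())

def prob(word):
    # maximum-likelihood estimate: zero for unseen words
    return counts[word] / total

log_prob = 0.0
for w in held_out:
    p = prob(w)
    if p == 0.0:
        log_prob = float("-inf")  # log(0) -> perplexity blows up
        break
    log_prob += math.log(p)

perplexity = math.exp(-log_prob / len(held_out))
print(perplexity)  # inf, because "jumped" has zero probability
```

So any single out-of-vocabulary word makes the whole held-out set's perplexity infinite, which is what makes model comparison impossible without some fix.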