I have a dataset that contains the number of devices used to express causal relations in a text. I've fitted a negative binomial regression to estimate the effect of Educational Level (3 levels) on the number of causal devices. However, I need to control for the fact that each text has a different number of words. If I am using a negative binomial model (glm.nb), should total_words be included as an offset in the model? At the moment I have included it as a fixed effect, treated as a covariate.

Can anyone shed some light on how to choose between these two options? Thank you in advance.

1 Answer

Including the logarithm of the number of words as an offset sounds like a good solution. There are similar questions here, have a look at
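
To make the difference between the two options concrete, here is a rough sketch of the two mean structures under the log link. The notation is my own, not from the question: $\mu_i$ is the expected number of devices in text $i$, $w_i$ is its total_words, and $x_i$ holds the dummy variables for Educational Level.

$$
\begin{aligned}
\text{offset:} \quad & \log \mu_i = \beta_0 + x_i^\top \beta + \log w_i
  \;\Longleftrightarrow\; \log\frac{\mu_i}{w_i} = \beta_0 + x_i^\top \beta \\
\text{covariate:} \quad & \log \mu_i = \beta_0 + x_i^\top \beta + \gamma \log w_i
\end{aligned}
$$

With the offset, the coefficient on $\log w_i$ is fixed at 1, so the model describes the rate of devices per word. With the covariate, $\gamma$ is estimated freely, and the offset model is the special case $\gamma = 1$. In `glm.nb` the offset version is written by adding `offset(log(total_words))` to the formula. One informal check, assuming the counts should scale roughly in proportion to text length, is to fit the covariate version with `log(total_words)` and see whether the estimate of $\gamma$ is close to 1. Note that entering total_words unlogged as a covariate implies the expected count grows exponentially with the number of words, which is usually not what is intended.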