Is the OLRE term meaningful in the negative binomial model? + Is overdispersion in the NB model an issue?

Question

I'd like to ask three questions regarding the negative binomial (NB) regression / distribution.

The NB model with NB2 parameterization ($var(Y_{NB2}) = \mu + \frac{\mu^2}{\theta}$) is sometimes referred to as the Poisson-gamma model. It should be because the count data come from a Poisson-gamma mixture distribution – a mixture where each count is from its own Poisson distribution with its own $\lambda$ parameter and these $\lambda$’s follow a gamma distribution (e.g. here). The negative binomial model is basically a generalization of Poisson regression which loosens the restrictive assumption that $var(Y) = E(Y)$.

Now let's shift to an overdispersion, so common to Poisson models. One way to handle overdispersion is to include an observation-level random effect (OLRE). I learned that these models can be called the Poisson-lognormal mixed models. It should be for a similar reason as in the previous case: the data $Y$ are Poisson distributed, but the mean $\mu$ comes from a lognormal distribution. That should be the difference between the NB2 (Poisson-gamma) model and the Poisson-lognormal model, as pointed out here on CV SE (see a comment from Bryan posted at 17:01). (I also registered Björn's comment at 16:52 about the NB2 model also being ’secretly’ a mixed model, and I think I understand it.)

But now: What is the meaning of the OLRE term in negative binomial models? What I'm saying about the distribution – do I create a ’Poisson-gamma-lognormal mixture’? Harrison (2014) used OLRE to model overdispersion in count data. He ran three simulations, one of which was a negative binomial one. And he included the OLRE term.
The second question is related to the first. I thought the NB model is used when the equidispersion assumption of the Poisson model is not met. This suggests to me that overdispersion is no longer a problem with NB models. Also according to Ben Bolker’s GLMM FAQ (bullet number 2), dispersion is a problem only for models with fixed variance like binomial or Poisson ones. So why would Harrison fit a NB model with an OLRE to model overdispersion? Or am I missing something? Is overdispersion in the NB model really an issue? I read this post here on CV SE but I'm still not sure about it.
My last question is about the difference in NB1 and NB2 parameterization. NB2 (described above) assumes a quadratic variance-to-mean relationship while NB1 assumes a linear relationship ($var(Y_{NB1}) = \mu + \mu\theta$). When I simulate data from the NB distribution, the classic NB2 parameterization is used. How could I simulate the data with the NB1 parameterization? I work in R (RStudio) and for NB simulations I used functions stats::rnbinom() and MASS::rnegbin(), both of which use the NB2 parameterization.

Thank you in advance. (Sorry if my last question is dumb.)

Kevin Hawkshaw · Answer 1 · 2023-09-06T15:24:43.077

I see this question has been quiet for some time, so I'll chime in hopefully in a useful way, since I have similar questions to you. In my reading of the Harrison paper, I don't think he uses a negative binomial model with an OLRE term. He had the three simulations (extra Poisson noise, zero inflation, and negative binomial), and fit two models to each - one a naive Poisson model, and the other a Poisson with an OLRE term. He did verify that he could recover the parameter estimates from each simulation with a third model that matched the data generating process. Maybe that is where you are thinking he used a negative binomial with OLRE? I don't believe he did though. That was just a plain negative binomial model.

A negative binomial model with an OLRE term seems like a weird construction. As you correctly say, in a negative binomial each count comes from its own Poisson distribution, and those Poisson means are in turn gamma distributed. Adding an OLRE term would be then additionally stating that each count has its own Poisson mean, and those are normally distributed. It seems redundant. Hopefully someone with more knowledge can chime in. I'd also be curious if OLRE terms can be included in a zero-inflated context. In that case there doesn't seem to be the same redundancy in the Poisson means.

In thinking about this a bit more, there's an obvious potential practical benefit to including OLRE in a negative binomial model. Most R packages I'm aware of don't let you recover the observation level Poisson means with negative binomial, so if those are what you are interested in, then OLRE might be necessary. — Kevin Hawkshaw, Sep 07 '23 at 15:20

Is the OLRE term meaningful in the negative binomial model? + Is overdispersion in the NB model an issue?

1 Answers1