I have a data set that I'd expect to follow a Poisson distribution, but it is overdispersed by about 3-fold. At present, I'm modelling this overdispersion using something like the following R code.
## assuming a median value of 1500
med = 1500
rawDist = rpois(1000000, med)
## add 3x each draw's deviation from med back onto the draw
oDdist = rawDist + ((rawDist - med) * 3)
Visually, this seems to fit my empirical data very well. If I'm happy with the fit, is there any reason that I should be doing something more complex, like using a negative binomial distribution, as described here? (If so, any pointers or links on doing so would be much appreciated).
Oh, and I'm aware that this creates a slightly jagged distribution (due to the multiplication by three), but that shouldn't matter for my application.
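One reason to look beyond a good visual fit is that the dispersion the transform actually produces can be checked numerically. The following is only a sketch, not part of the original post: the seed, the smaller sample size, the helper name `dispersion`, and the `sqrt(d)` variant are all assumptions made for illustration.

```r
set.seed(42)                 # arbitrary seed, for reproducibility only
med <- 1500
raw <- rpois(1e5, med)       # fewer draws than in the question, for speed

# The transform from the question:
# raw + 3 * (raw - med) is the same as med + 4 * (raw - med)
od_shift <- raw + (raw - med) * 3

# Hypothetical alternative: scale deviations by sqrt(d) so that variance = d * mean
d <- 3
od_sqrt <- med + sqrt(d) * (raw - med)

# Variance-to-mean ratio (equals 1 for a true Poisson)
dispersion <- function(x) var(x) / mean(x)

dispersion(raw)        # close to 1
dispersion(od_shift)   # close to 16, because deviations are multiplied by 4
dispersion(od_sqrt)    # close to 3, but the values are no longer integers

# With med = 1500 every shifted value is a multiple of 4,
# so most integers can never occur
all(od_shift %% 4 == 0)
```

If "3-fold" is meant as a variance-to-mean ratio of 3, neither rescaling gives integer counts with that ratio, which is the gap a negative binomial fills.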
Update: For the sake of anyone else who searches and finds this question, here's a simple R function to model an overdispersed Poisson using a negative binomial distribution. Set d to the desired variance/mean ratio:
rpois.od <- function(n, lambda, d = 1) {
  if (d == 1)
    rpois(n, lambda)
  else
    rnbinom(n, size = lambda / (d - 1), mu = lambda)
}
(via the R mailing list: https://stat.ethz.ch/pipermail/r-help/2002-June/022425.html)
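A quick sanity check of the function, as a sketch: in the `mu`/`size` parameterisation of `rnbinom`, the variance is `mu + mu^2/size`, so `size = lambda/(d-1)` gives a variance of `lambda * d`. The seed and sample size below are arbitrary choices, and the function is repeated only so that the block runs on its own.

```r
# Same function as in the update, repeated so this block is self-contained
rpois.od <- function(n, lambda, d = 1) {
  if (d == 1)
    rpois(n, lambda)
  else
    rnbinom(n, size = lambda / (d - 1), mu = lambda)
}

set.seed(1)                               # arbitrary seed
x <- rpois.od(1e5, lambda = 1500, d = 3)  # 3-fold overdispersed counts

mean(x)            # close to 1500
var(x) / mean(x)   # close to 3
```

Note that this only covers d >= 1: for d < 1 the `size` argument becomes negative, and `rnbinom` returns NaN with a warning.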
In addition, what if I have a repeated-measures situation? When my data are continuous, I use a generalized linear mixed model. The Gamma distribution often works well with continuous biological data, and the mixed model handles the repeated-measures element. But what does one do if one has overdispersed repeated-measures count data?
– Bryan Mar 24 '16 at 14:35
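One possible route for the repeated-measures case raised in the comment is a negative binomial mixed model, for example via `glmer.nb` from the lme4 package. The sketch below is an illustration under assumptions, not a recommendation from the thread: the simulated design (40 subjects, 5 time points), the effect sizes, and the variable names are all invented, and the fit only runs if lme4 happens to be installed.

```r
set.seed(1)                                   # arbitrary seed
n_subj <- 40                                  # hypothetical number of subjects
n_rep  <- 5                                   # hypothetical measurements per subject

subject <- factor(rep(seq_len(n_subj), each = n_rep))
time    <- rep(seq_len(n_rep), times = n_subj)

# Subject-level random intercepts on the log scale
b  <- rnorm(n_subj, mean = 0, sd = 0.3)
mu <- exp(log(50) + 0.1 * time + b[as.integer(subject)])

# Overdispersed counts around each subject's mean
y   <- rnbinom(length(mu), size = 5, mu = mu)
dat <- data.frame(y = y, time = time, subject = subject)

# Fit a negative binomial GLMM with a random intercept per subject,
# but only if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer.nb(y ~ time + (1 | subject), data = dat)
  print(summary(fit))
}
```

The random intercept plays the same role as in the Gamma mixed model for continuous data, while the negative binomial family absorbs the extra-Poisson variation.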