I've been using a negative binomial model to relate the abundance of particles in the ocean to their size.
My GAM, in R code, looks like this:
gam(TotalParticles ~ log(lb), offset = log(binsize * vol), data = df, family = "nb")
I'm using gam, rather than glm, because the former can handle negative binomial regressions and Poisson regressions with the same syntax, and I was comparing the two at one point.
Essentially, I'm interested in the slope of the relationship between the log of size lb and the log of particle number TotalParticles. The particles have been binned by size, so I normalize to the width of each bin, binsize. The particles are also collected in different volumes of water, so I also normalize to that volume, vol.
I'm trying to express this in math and words in a corresponding manuscript. Right now, based on this post, I'm writing the equation as:
$$ \ln\left(\frac{E(\text{Total Particles})}{\text{Volume} \times \text{Binsize}}\right) = b_0 + b_1 \ln(\text{Size}) $$
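For reference, this is how I understand the code to map onto that equation. It is only a sketch, and it assumes the default log link of the nb family in mgcv. The symbol $\mu$ is my own shorthand for the conditional mean of TotalParticles given lb, and is not taken from the post I mentioned. With a log link, the offset is added to the linear predictor with its coefficient fixed at 1:

$$ \ln(\mu) = b_0 + b_1 \ln(\text{lb}) + \ln(\text{binsize} \times \text{vol}) $$

Subtracting the offset from both sides gives the normalized form:

$$ \ln\left(\frac{\mu}{\text{binsize} \times \text{vol}}\right) = b_0 + b_1 \ln(\text{lb}) $$

If that reading is right, $b_1$ is the slope I care about, and the left-hand side is the log of the expected count per unit volume and per unit bin width.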
My co-authors keep asking me what E means in this context. My understanding is that we are predicting an expected "negative binomial" distribution of total particle numbers from size.
So far I have written:
The term on the left describes the expected volume and bin-size normalized count data, assuming a negative binomial distribution of residuals, with E referring to the conditional expectation of total particle numbers, assuming a negative binomial distribution.
I have two questions:
Does my equation actually correctly represent the linear model written in code? If not, is there a better way of writing it?
Is there a better way to express in words what is going on on the left-hand side of the equation?