I am trying to construct GLMs to explore the relationship between woodpecker abundance and 9 predictor variables (tree density, canopy cover, and so on). My data were collected as counts of woodpeckers, but I averaged the counts from multiple visits to each site (each site had between 1 and 6 visits). So my data are not integers: they contain decimals, and I have a fair number of zeros (when no woodpeckers were observed).
I am running into problems with the Poisson distribution because of the decimals. I tried adding an offset, but I still get warnings in R because of the decimals:
> In dpois(y, mu, log = TRUE) : non-integer x = 0.917000...
The gamma distribution won't work because I have zeros. Is the negative binomial the best bet? Maybe I am misunderstanding the negative binomial, but I thought it was best for presence-absence data, and not for continuous count data.
Here is the script I tried for the offset. I may be doing this wrong:
model <- glm(woodpeckercount ~ treedensity + canopy + snags +
               offset(log(visits)),
             data = woodpeckers, family = poisson)
visits is the number of visits I made to each survey site (ranges from 1 to 6). It is the number I divided the total woodpecker count by to get the average. When using an offset, should I use the total woodpecker count for each site as my response variable instead of the average? In other words, should I skip dividing the woodpecker count by the number of visits, since the offset already accounts for this? I want to be sure I am accounting for unequal sampling effort between sites.
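For reference, a minimal sketch of the setup being asked about: the undivided total count as the response, with offset(log(visits)) carrying the sampling effort. The data here are simulated stand-ins, and the column name totalcount and all coefficient values are assumptions made for illustration; they are not from the real woodpeckers data.

```r
# Simulated stand-in for the `woodpeckers` data frame (hypothetical values).
set.seed(1)
n <- 200
woodpeckers <- data.frame(
  visits      = sample(1:6, n, replace = TRUE),  # sampling effort per site
  treedensity = runif(n, 0, 10),
  canopy      = runif(n, 0, 1),
  snags       = rpois(n, 3)
)

# Assumed per-visit rate, linked to the predictors on the log scale.
rate <- exp(-1 + 0.10 * woodpeckers$treedensity + 0.50 * woodpeckers$canopy)

# Total count over all visits: integer-valued, so dpois() has nothing to warn about.
woodpeckers$totalcount <- rpois(n, lambda = rate * woodpeckers$visits)

# Response is the total count; the offset accounts for unequal numbers of visits.
model <- glm(totalcount ~ treedensity + canopy + snags + offset(log(visits)),
             data = woodpeckers, family = poisson)

# Exponentiated coefficients are multiplicative effects on the per-visit rate.
rate_ratios <- exp(coef(model))

# Fitted per-visit rate for each site: predict with visits set to 1.
per_visit <- predict(model, newdata = transform(woodpeckers, visits = 1),
                     type = "response")
```

Because the response is now an integer count, the non-integer warning from dpois() should not appear. If the residual deviance is much larger than the residual degrees of freedom, overdispersion may be worth checking before settling on the Poisson family.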
log(E(total)) = log(visits) + x'b. If you bring the offset log(visits) to the left-hand side of the equation, you can pull it into the log and the expectation: log(E(total/visits)) = x'b. Thus, this gives the response you want. For details see: https://stats.stackexchange.com/questions/11182/when-to-use-an-offset-in-a-poisson-regression – Achim Zeileis Mar 08 '18 at 20:27
total/visits. Thus a one-unit change in the regressor leads to a relative change in the rate. – Achim Zeileis Mar 09 '18 at 03:00
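The algebra in the comments can be written out step by step. This is only a restatement in LaTeX, treating visits as a known constant for each site; the symbol b_j for a single coefficient is notation introduced here for illustration.

```latex
\begin{align}
\log \mathrm{E}(\text{total}) &= \log(\text{visits}) + x^\top b \\
\log \mathrm{E}(\text{total}) - \log(\text{visits}) &= x^\top b \\
\log \frac{\mathrm{E}(\text{total})}{\text{visits}} &= x^\top b \\
\log \mathrm{E}\!\left(\frac{\text{total}}{\text{visits}}\right) &= x^\top b
\end{align}
% Consequence: raising regressor x_j by one unit multiplies the expected
% per-visit rate E(total/visits) by exp(b_j), with the other regressors held fixed.
```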