
I am trying to analyze hunting harvest data. The response variable is individuals harvested per 1000 hectares, and I have a series of explanatory variables to describe it. The response is continuous (fractional values), with a mean of 2.86 and a range of 0 to 15.6 (N = 690). However, in some hunting areas the annual harvest is zero, so the variable includes many zeros. How do I go about choosing a proper probability distribution for this? I made a qqPlot (package 'car') in R and a simple histogram.

Further, what sort of model do you think would be most appropriate for this data? I've been thinking of GAMMs or GLMMs.

[Figure: QQ plot and histogram of the response variable]

  • 1. You have count data, so some count data regression model (possibly with zero inflation) could help. But then you must not divide by area to get the rate; instead, use the count itself as the response and the area (well, its log) as an offset. Search this site for these terms if they are unknown to you; they are discussed in many posts. 2. It is not the marginal distribution of the response that is important, so histograms/qqplots of it will not help you. – kjetil b halvorsen Feb 14 '22 at 14:09
  • Looking at these plots, zero-inflated Poisson and zero-inflated negative binomial should be your initial models. – Durden Jun 24 '23 at 17:32
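
To make the suggestions in the comments concrete, here is a minimal sketch in R of a count model with log(area) as an offset, plus zero-inflated Poisson and negative binomial variants. The data are simulated and the column names (`harvest`, `area_ha`, `x`) are hypothetical stand-ins, not the asker's actual variables. The zero-inflated fits assume the `glmmTMB` package is installed and are skipped otherwise.

```r
# Simulated stand-in data; all column names here are hypothetical.
set.seed(1)
n <- 690
d <- data.frame(
  area_ha = runif(n, 500, 5000),  # hunting-area size in hectares
  x       = rnorm(n)              # one illustrative explanatory variable
)

# Counts with extra zeros: about 30% of areas get a structural zero.
mu <- exp(log(d$area_ha / 1000) + 0.8 + 0.3 * d$x)
d$harvest <- ifelse(runif(n) < 0.3, 0L, rpois(n, mu))

# Poisson GLM on the raw count, with log(area in 1000 ha) as an offset,
# so exp(coefficients) are on the per-1000-hectare rate scale.
fit_pois <- glm(harvest ~ x + offset(log(area_ha / 1000)),
                family = poisson, data = d)

# Zero-inflated alternatives (assumes glmmTMB is available).
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  fit_zip <- glmmTMB::glmmTMB(
    harvest ~ x + offset(log(area_ha / 1000)),
    ziformula = ~ 1, family = poisson, data = d
  )
  fit_zinb <- glmmTMB::glmmTMB(
    harvest ~ x + offset(log(area_ha / 1000)),
    ziformula = ~ 1, family = glmmTMB::nbinom2, data = d
  )
  # Compare the candidate models by AIC.
  print(c(pois = AIC(fit_pois), zip = AIC(fit_zip), zinb = AIC(fit_zinb)))
}
```

If the hunting areas are observed over several years, a random intercept per area, for example `(1 | area_id)` with a hypothetical `area_id` column, could be added to the `glmmTMB` formula to turn this into a GLMM.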