This is my first time asking a question here. I have two sets of count data that I want to compare with one another. One set has an excess of zeros, while the other does not. My first idea was to use a zero-inflated Poisson regression for the set with excess zeros, and either a standard Poisson or a quasi-Poisson regression for the second set, depending on whether it is over-dispersed. Is this a good approach, or would it be better to use one type of analysis for both sets of data?
Thanks in advance.
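
For illustration only, here is a minimal sketch of the two checks the question relies on: whether a set of counts is over-dispersed relative to a Poisson distribution, and whether it has more zeros than a Poisson with the same mean would predict. The function name `poisson_diagnostics` and both data sets are made up for this example; they are not from the question. These are crude marginal checks that ignore the environmental covariates, so they can only hint at a model choice, not settle it.

```python
import math
from statistics import mean, variance


def poisson_diagnostics(counts):
    """Crude marginal checks against a Poisson model (covariates ignored)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the ratio is undefined")
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    expected_zero = math.exp(-m)  # P(Y = 0) for a Poisson with mean m
    return {
        "mean": m,
        "dispersion_ratio": v / m,  # about 1 for Poisson, > 1 if over-dispersed
        "observed_zero_frac": observed_zero,
        "poisson_zero_frac": expected_zero,
    }


# Hypothetical counts, invented purely to show the output format.
set_a = [0, 0, 0, 0, 0, 0, 3, 4, 5, 0, 0, 6, 0, 2, 0, 0]
set_b = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2, 4, 2, 3, 1, 2, 3]

for name, data in (("set_a", set_a), ("set_b", set_b)):
    print(name, poisson_diagnostics(data))
```

A dispersion ratio well above 1 together with an observed zero fraction far above the Poisson zero fraction would be consistent with zero inflation, although over-dispersion alone can also produce many zeros. A proper comparison would fit the regression models with the covariates included.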

  • What do you mean by "compare to one another"? Do you want to look at shape of distribution, measures of location/central tendency? Particular quantiles? Or what? – Peter Flom Mar 09 '16 at 20:06
  • I am sorry. I want to determine whether similar associations exist in the two data sets. I am looking at two similar diseases and comparing each one to the same environmental features. – B. Ashraf Mar 10 '16 at 05:35

1 Answer

So, from your comments, you have two diseases and you want to compare how they relate to a bunch of environmental variables.

If this is right, then you can simply use logistic regression on both data sets. There is no need for a count model because the dependent variable is not a count, it's a probability.

Peter Flom
  • Your last statement is mysterious. The OP clearly indicates the dependent data are counts. In what sense, then, are you claiming these are "probabilities"? – whuber Mar 10 '16 at 13:55
  • Reading the OP's reply to my comment on the post, it's clear that the question was confused. OP does not want to compare the distributions but the regressions. – Peter Flom Mar 11 '16 at 02:55