I need help identifying the kind of regression analysis that would be appropriate for this problem. Nothing I've found so far seems quite right. Pointers to articles or examples would be helpful. Thank you.
The data look like this:
The data are observational.
The sampling unit is a geographic location (EDIT: let's assume units are independent; I'm just trying to understand the basic analytical problem here).
At each sampling unit, there are events of two types: the event type of interest (A) and all other types (B). EDIT: In other words, each event is a binary outcome (success, failure). The outcomes are aggregated to the location level (Location 1: Success 3, Failure 2. Location 2: Success 0, Failure 1. Location 3: Success 0, Failure 0. Location 4: Success 4, Failure 9 ... etc.).
Often, A=0 and, somewhat less often but still frequently, A+B=0.
I am interested in testing a hypothesis about A, somehow controlling for the total count of events (A+B), so either a proportion A/(A+B) or a count model that controls for the total count.
If I understand correctly, if the counts were large, proportions could be calculated for all units, and I could do beta regression. But a proportion cannot be calculated at all when the total number of events for a unit is zero.
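To make the zero-total problem concrete, here is a minimal sketch in Python using the four example locations listed above. The dictionary and function names are illustrative assumptions of mine, not part of any library.

```python
# Aggregated (success, failure) counts per location, taken from the example above.
counts = {
    "Location 1": (3, 2),
    "Location 2": (0, 1),
    "Location 3": (0, 0),
    "Location 4": (4, 9),
}


def proportion(successes, failures):
    """Return A / (A + B), or None when the location has no events at all."""
    total = successes + failures
    if total == 0:
        return None  # 0/0: the proportion is undefined for this location
    return successes / total


proportions = {loc: proportion(a, b) for loc, (a, b) in counts.items()}

for loc, p in proportions.items():
    print(loc, p)
```

Location 3 has no proportion at all, and Location 2 has a proportion of exactly 0, which also lies outside the open interval (0, 1) that standard beta regression assumes.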
If all I cared about was the count, I could use a ZIP or other count model (and maybe still can). But the research question regards the frequency of A relative to the total number of events.
But how to control for the total number of events? Does it just go in the predictors of a ZIP or similar model? I suspect it's more complicated than that.
It seems obvious to me that the individual events could be modeled directly using multi-level logistic regression (EDIT: or another model for clustered data), but I'm wondering if there is a simpler way to examine what I'm interested in, and I just somehow haven't seen an example of this.
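As a check on my own understanding, here is a rough sketch in Python comparing the event-level (Bernoulli) log-likelihood with the aggregated (binomial) log-likelihood for a single location. It assumes one common success probability `p` per location, with no predictors or random effects, and the function names are hypothetical. In an actual regression, `p` would depend on the predictors through a link function.

```python
import math


def event_level_loglik(successes, failures, p):
    """Sum the Bernoulli log-likelihood over each individual event."""
    total = 0.0
    for _ in range(successes):
        total += math.log(p)        # one term per success
    for _ in range(failures):
        total += math.log(1.0 - p)  # one term per failure
    return total


def aggregated_loglik(successes, failures, p):
    """Binomial log-likelihood for the aggregated counts at one location."""
    n = successes + failures
    log_comb = math.log(math.comb(n, successes))  # does not depend on p
    return log_comb + successes * math.log(p) + failures * math.log(1.0 - p)


# Location 4 from the example (4 successes, 9 failures): the gap between the
# two log-likelihoods should be the same constant for any value of p.
gap_low = aggregated_loglik(4, 9, 0.2) - event_level_loglik(4, 9, 0.2)
gap_high = aggregated_loglik(4, 9, 0.7) - event_level_loglik(4, 9, 0.7)

# Location 3 from the example (0 successes, 0 failures).
empty_contribution = aggregated_loglik(0, 0, 0.3)

print(gap_low, gap_high, empty_contribution)
```

If this is right, the two forms differ only by a term that does not involve `p`, so they should lead to the same estimates, and a location with no events adds nothing to the likelihood under this simplified setup.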
Can you assume that I'm about two levels less sophisticated? How would you set up an analysis? How would you use the total count?
– Rico Mar 03 '14 at 17:16
Outcome Data: Location 1: Success 0, Fail 0. Location 2: Success 1, Fail 0. Location 3: Success 1, Fail 1 ...
– Rico Mar 03 '14 at 18:23