I have a theoretical question regarding the use of zero-inflated models. There are similar questions here and here, but neither set of answers seems to address the theoretical question I am asking.
My understanding of the theoretical rationale for zero-inflated models is that when count data contain a bunch of zeros, it is sometimes useful to conceive of those zeros as coming from two different generating processes: (a) one where zeros just "happened to be" generated by the normal count process, and (b) one where nothing in the count process could have produced a count, i.e. it is impossible to have a count. Paul Allison gives this example here: "Of course, there are certainly situations where a zero-inflated model makes sense from the point of view of theory or common sense. For example, if the dependent variable is number of children ever born to a sample of $50$-year-old women, it is reasonable to suppose that some women are biologically sterile."
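To make the two-process idea concrete, here is a minimal sketch of the zero probability under a zero-inflated mixture. It assumes the count process is Poisson, which is my assumption for illustration only; the quoted example does not specify a distribution. The names `zip_pmf`, `prob_structural_given_zero`, `lam`, and `pi` are hypothetical.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson.

    lam: mean of the Poisson count process (process a).
    pi:  probability that an observation is a structural zero (process b).
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from process (b) or, by chance, from process (a).
        return pi + (1 - pi) * poisson
    # Positive counts can only come from the count process.
    return (1 - pi) * poisson


def prob_structural_given_zero(lam, pi):
    """Probability that an observed zero came from process (b), by Bayes' rule."""
    return pi / zip_pmf(0, lam, pi)


if __name__ == "__main__":
    lam, pi = 2.0, 0.3  # illustrative values only, not estimates from any data
    print("P(Y = 0):", zip_pmf(0, lam, pi))
    print("P(structural | Y = 0):", prob_structural_given_zero(lam, pi))
```

Under these assumptions, the model never labels an individual zero as "impossible" or "by chance"; it only assigns each observed zero a probability of having come from process (b).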
My trouble with this common explanation is that I can't identify where to draw the line and use a non-zero-inflated model even when there are many zeros. For any given count variable, I can imagine a scenario, likely to have been present in the data, where it would be functionally impossible to get a count. With a big enough dataset, you are likely to have observations where the conditions made a count functionally impossible. In ecological studies where you're predicting the number of individuals of a given species found in a given location, is it not always the case that a zero might be a function of ecological conditions that make it impossible for the species to have been observed? In sociological studies where you're predicting the number of crimes someone has committed, is it not always the case that a zero could be a function of someone not having the socioeconomic or psychological conditions that would make crime functionally possible?
I'm particularly getting mixed up because, theoretically, in these studies we're predicting the presence or abundance of some phenomenon, but deciding on the model based on whether we think the phenomenon is sometimes going to be basically impossible for reasons outside of the predictors we've collected. But if I really think about it, we can probably come up with a not-so-contrived example of how counts would be impossible for a great number of count variables with a lot of zeros in them. That seems to imply that we are always theoretically justified in using zero-inflated models for count data with a lot of zeros, right? Any resources or clarifying responses would be greatly appreciated.
EDIT: After re-reading everyone's responses, I think my cognitive block is that I don't conceptualize the excess-zero generating process as being usefully separate from the count generating process. For the majority of use cases I can think of, the conditions producing an excess zero aren't necessarily deterministic: even if they dramatically reduce the probability of a non-zero count, they don't actually eliminate the possibility of a count. And for zeros that were generated by the count process, I don't know where to draw the line conceptually to say that an observation with value 0, where the probability of it being 0 is really high, is not an "excess" zero. But I take @Adrian's overall point that there is some epistemological fun here, and that it's less a matter of what is "correct" than of how we want to model the phenomenon.
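As a rough illustration of that first point, here is a small sketch comparing the zero probability of a plain Poisson with that of a Poisson whose rate varies continuously across observations. The Gamma-distributed rate (which gives a negative binomial) is my own assumption for the sake of the example, and the function names and parameter values are hypothetical. No observation in the second model is a structural zero, yet zeros are far more common than a plain Poisson with the same mean would predict.

```python
import math


def poisson_p0(mu):
    """P(Y = 0) for a Poisson with mean mu."""
    return math.exp(-mu)


def gamma_poisson_p0(mu, shape):
    """P(Y = 0) when each observation's Poisson rate is Gamma-distributed.

    The rates have mean mu and the given Gamma shape; the resulting counts
    are negative binomial. Smaller shape means more spread in the rates,
    so more observations have a rate near (but never exactly) zero.
    """
    return (shape / (shape + mu)) ** shape


if __name__ == "__main__":
    mu = 2.0  # illustrative mean, not taken from any dataset
    print("Poisson P(0):      ", poisson_p0(mu))
    for shape in (0.5, 2.0, 50.0):
        print(f"Gamma-Poisson P(0), shape={shape}:", gamma_poisson_p0(mu, shape))
```

In this sketch, "a lot of zeros" arises purely from conditions that lower the rate without ever making a count impossible, which is the case I find hard to separate from an "excess zero" process.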
