Negative binomial model vs zero-inflated negative binomial - theoretical justifications

Question

I have a count variable that I would like to predict from a categorical variable with 4 levels. I would like to decide whether I should use Poisson, negative binomial (NB), or zero-inflated negative binomial (ZINB) regression, which seem to be the most common choices for count outcome variables.

I fitted three models (a Poisson, an NB, and a ZINB model) and compared their AIC values. The ZINB model has the lowest AIC, which is surprising because the number of 0s is not particularly high. But I started to wonder: should I even consider using ZINB if I have NO theoretical reason to assume that 0s can come from two different sources? As far as I understand, that is the situation zero-inflated regression models. In my case, however, the outcome is the frequency of attentional lapses in a given amount of time, and I don't think the sample contains people who could never have any lapses, in addition to people who could have lapses but didn't happen to have any. Am I safe to use negative binomial if this assumption of ZINB doesn't seem to hold?
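To make the two-source assumption concrete, here is a minimal sketch of the ZINB probability mass function next to the plain NB one. It is an illustration only, not the code used to fit the models above: the function names, the NB2 parameterisation (mean `mu`, dispersion `alpha`), and the constant zero-inflation probability `pi` are all assumptions chosen for the example.

```python
import math

def nb_pmf(y, mu, alpha):
    """NB2 pmf with mean mu and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)

def zinb_pmf(y, mu, alpha, pi):
    """ZINB pmf: a structural zero with probability pi, otherwise an NB draw.

    Zeros therefore come from two sources: the structural part (pi) and the
    ordinary NB count process ((1 - pi) * nb_pmf(0, ...)).
    """
    structural = pi if y == 0 else 0.0
    return structural + (1.0 - pi) * nb_pmf(y, mu, alpha)

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical parameter values, chosen only for illustration.
print(nb_pmf(0, mu=2.0, alpha=0.5))            # zeros from the count process alone
print(zinb_pmf(0, mu=2.0, alpha=0.5, pi=0.2))  # zeros from both sources
```

With these hypothetical values the NB zero probability is about 0.25 and the ZINB zero probability about 0.40. Setting `pi = 0` recovers the NB model exactly, so NB is nested in ZINB, and a ZINB with a constant `pi` has one more parameter than NB, which is the extra term AIC penalises.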

EDIT: A more general version of this same question (as suggested by Bence in the comments): "if I have an independent, identically distributed sample from an unknown distribution, is it acceptable to fit a distribution to it that I can find no theoretical justification for if this distribution gives a better fit in terms of AIC than another distribution that I can justify?"

I would rephrase your question in a more general form: if I have an independent, identically distributed sample from an unknown distribution, is it acceptable to fit a distribution to it that I can find no theoretical justification for if this distribution gives a better fit in terms of AIC than another distribution that I can justify? (I too would like to know what the accepted answer for this is.) — Bence Mélykúti, Aug 29 '18 at 17:05
I edited my original post to include your question too, thanks. Just to clarify, what exactly do you mean by "identically distributed sample"? — MGy, Aug 29 '18 at 20:59
Is it more familiar as 'i.i.d.'? I've just written it out for easier comprehension: https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables — Bence Mélykúti, Aug 29 '18 at 21:09

0 Answers