Why am I not able to fit a zero inflated poisson distribution?

Question

I am following what is suggested here: https://stackoverflow.com/questions/7157158/fitting-a-zero-inflated-poisson-distribution-in-r

> stat
    x    N
 1: 0  478
 2: 1  901
 3: 2 1101
 4: 3  873
 5: 4  583
 6: 5  250
 7: 6   97
 8: 7   31
 9: 8   10
10: 9    2

# vect <- rep(stat$x, stat$N)
count <- c(478, 901, 1101, 873, 583, 250, 97, 31, 10, 2)
vect <- rep(0:9, count)
library(fitdistrplus)
library(gamlss)

fit <- fitdist(vect, "ZIP", start=list(mu=2.4, sigma=0.1))
# mu = 2.64, sigma = -0.14, log = TRUE): sigma must be between 0 and 1

The plots are from a regular Poisson fit. There appear to be more zeros than the Poisson predicts, and the goodness-of-fit (gof) test gives 0.00087, so I hoped a ZIP would help.

[image: plots from the regular Poisson fit]
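
One quick way to gauge the excess of zeros is to compare the observed share of zeros with the share that a plain Poisson with the same mean would predict. The sketch below uses only base R and the counts listed above; the variable names are illustrative and not part of any package.

```r
# Observed frequencies from the question: counts[k + 1] observations equal k.
x      <- 0:9
counts <- c(478, 901, 1101, 873, 583, 250, 97, 31, 10, 2)

n       <- sum(counts)
lambda  <- sum(x * counts) / n   # sample mean, the Poisson MLE of the rate
p0_obs  <- counts[1] / n         # observed proportion of zeros
p0_pois <- exp(-lambda)          # P(X = 0) under Poisson(lambda)

print(c(lambda = lambda, observed = p0_obs, poisson = p0_pois))
```

For these counts the observed share of zeros is about 0.11 against roughly 0.095 expected, which suggests that any zero inflation is mild.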

However, zeroinfl from pscl fits without error:

library(pscl)
summary(zeroinfl(x ~ 1, dist="poisson", data=data.frame(x=vect)))

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.4945 -0.8607 -0.2269  0.4069  4.2096 

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.88120    0.01134   77.73   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -3.7452     0.2597  -14.42   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 10 
Log-likelihood: -7853 on 2 Df

mu = exp(0.8812) = 2.41
zero = logit(-3.7452, inverse=T)=0.02308537
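
The same back-transformation can be done with base R alone, assuming the two intercepts are copied by hand from the summary above: `exp()` inverts the log link of the count model and `plogis()` inverts the logit link of the zero-inflation model. The variable names are illustrative.

```r
# Intercepts copied from the zeroinfl summary above.
count_intercept <- 0.88120    # count model, log link
zero_intercept  <- -3.7452    # zero-inflation model, logit link

mu    <- exp(count_intercept)    # Poisson mean of the count component
sigma <- plogis(zero_intercept)  # zero-inflation probability, 1 / (1 + exp(-b))

print(c(mu = mu, sigma = sigma))
```

Because `plogis()` always returns a value strictly between 0 and 1, an estimate obtained on the logit scale cannot leave the valid range for sigma.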

As a guess, it looks like zeros are not inflated. Try a regular Poisson. — Peter Flom, Jan 02 '14 at 13:25
There does not appear to be any considerable zero inflation to me. The regular Poisson seems to fit quite well. Just as a note, a gof test with a sample size as large as yours will almost always reject, since you have so much power in the test you will detect even very minute differences in the distributions being compared. — Underminer, Jan 02 '14 at 14:15
@Underminer really? I thought only normality test is not useful for large dataset — colinfang, Jan 02 '14 at 14:30
They represent the same type of test. The p-value still represents the same thing. Minor deviations from the theoretical distribution (whether it be Normal or Poisson) will still be detected. And to extend this further, increasing the sample size will always increase the probability to reject the null hypothesis if the null hypothesis is not true (even if you are very close to the null). — Underminer, Jan 02 '14 at 14:49
Your plots show that there are more zeros in the data than in the theoretical distribution. Does seem surprising that fitdist/ZIP allow the zero-inflation parameter to go negative, but the results seem sensible. (In contrast, pscl::zeroinfl fits the zero-inflation probability on the logit scale, so it can't go negative.) — Ben Bolker, Jan 02 '14 at 15:16
About normality testing and sample size: "Is normality testing 'essentially useless'?" — Nick Stauner, Mar 10 '14 at 02:37

score 0 · Accepted Answer · answered Jan 04 '14 at 12:45

This works:

fit <- fitdist(vect, "ZIP", start=list(mu=2.4, sigma=0.1),
      lower=c(-Inf, 0.001), upper=c(Inf, 1), optim.method="L-BFGS-B")

which gives a log-likelihood of -7853.122, matching the zeroinfl fit.

So @Ben Bolker is correct.

Setting the lower bound to exactly 0 does not work either: the optimizer then tries to evaluate at sigma = 0, which is not supported for ZIP.
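
An alternative to box constraints is to optimize on an unconstrained scale, as zeroinfl does. The sketch below is a hand-written illustration in base R, not the fitdistrplus or gamlss implementation: it assumes the usual ZIP form, P(0) = sigma + (1 - sigma) * exp(-mu) and P(k) = (1 - sigma) * Poisson(k; mu) for k > 0, and `zip_negloglik` is a hypothetical helper name.

```r
# Observed frequencies from the question: counts[k + 1] observations equal k.
x      <- 0:9
counts <- c(478, 901, 1101, 873, 583, 250, 97, 31, 10, 2)

# Negative ZIP log-likelihood with par = c(log(mu), logit(sigma)), so that
# mu > 0 and 0 < sigma < 1 hold for every value the optimizer tries.
zip_negloglik <- function(par) {
  mu    <- exp(par[1])
  sigma <- plogis(par[2])
  logp  <- ifelse(x == 0,
                  log(sigma + (1 - sigma) * exp(-mu)),
                  log(1 - sigma) + dpois(x, mu, log = TRUE))
  -sum(counts * logp)
}

# Same starting point as the fitdist call: mu = 2.4, sigma = 0.1.
opt <- optim(c(log(2.4), qlogis(0.1)), zip_negloglik, method = "BFGS")

mu_hat    <- exp(opt$par[1])
sigma_hat <- plogis(opt$par[2])
loglik    <- -opt$value
print(c(mu = mu_hat, sigma = sigma_hat, logLik = loglik))
```

The estimates should land close to those from zeroinfl and from the bounded fitdist call, since all three maximize the same likelihood.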
