I have the following data:
str(data)
'data.frame': 768 obs. of 5 variables:
$ PIANTA : chr "C-1-R1-1" "C-1-R1-1" "C-1-R1-2" "C-1-R1-2" ...
$ Trattamento: Factor w/ 4 levels "Controllo","Lidar",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Blocco : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ Replica : chr "R1" "R1" "R1" "R1" ...
$ Risposta : num 0 1 0 1 0 3 2 3 2 4 ...
I have a total of 768 observations. I would like to test whether the treatment (Trattamento) has a significant effect on my response variable (Risposta). The response is numeric, takes integer values from 0 to 9, and equals 0 for 478 of the 768 observations (about 62%). This is my frequency table:
data counts
1 0 478
2 1 107
3 2 89
4 3 50
5 4 21
6 5 13
7 6 3
8 7 5
9 8 1
10 9 1
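As a rough check of how many zeros are "too many", the frequency table above can be compared with what a plain Poisson distribution with the same mean would predict. This is only a sketch in base R: it ignores the treatments and the blocking structure, and the variable names (`value`, `counts`, `expected_zeros`, ...) are made up for illustration.

```r
# Frequency table copied from the post: response value and number of observations
value  <- 0:9
counts <- c(478, 107, 89, 50, 21, 13, 3, 5, 1, 1)

n         <- sum(counts)                                    # total observations
mean_resp <- sum(value * counts) / n                        # sample mean of Risposta
var_resp  <- sum(counts * (value - mean_resp)^2) / (n - 1)  # sample variance

# Number of zeros expected if Risposta were plain Poisson with the same mean
expected_zeros <- n * dpois(0, lambda = mean_resp)
observed_zeros <- counts[value == 0]

print(c(n = n, mean = mean_resp, variance = var_resp,
        observed_zeros = observed_zeros, expected_zeros = expected_zeros))
```

If the expected number of zeros comes out clearly below the observed 478, that is consistent with an excess of zeros relative to a single Poisson distribution.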
Therefore I opted for a zero-inflated Poisson model, fitted with the following R code:
library(pscl)  # assumption: zeroinfl() comes from the pscl package
model1 <- zeroinfl(Risposta ~ Trattamento | Trattamento, data = data, count.dist = "poisson")
summary(model1)
Call:
zeroinfl(formula = Risposta ~ Trattamento | Trattamento, data = data, count.dist = "poisson")
Pearson residuals:
Min 1Q Median 3Q Max
-0.8273 -0.6938 -0.4883 0.4219 6.2762
Count model coefficients (poisson with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.76103 0.07557 10.071 <2e-16 ***
TrattamentoLidar -0.11973 0.13009 -0.920 0.357
TrattamentoRecupero -0.18094 0.14447 -1.252 0.210
TrattamentoStandard -0.19740 0.12201 -1.618 0.106
Zero-inflation model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.3894 0.1778 -2.190 0.028523 *
TrattamentoLidar 0.9323 0.2518 3.703 0.000213 ***
TrattamentoRecupero 1.2351 0.2599 4.752 2.01e-06 ***
TrattamentoStandard 0.3500 0.2578 1.358 0.174542
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Number of iterations in BFGS optimization: 15
Log-likelihood: -939.4 on 8 Df
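To read the output above on the original scales, the coefficients can be back-transformed by hand. The sketch below assumes the usual zero-inflated Poisson parameterisation, in which the count part uses a log link and the zero-inflation part models the probability of an excess ("structural") zero with a logit link, so that P(Y = 0) = pi + (1 - pi) * exp(-lambda). The numbers are copied from the summary; the names `count_coef`, `zero_coef`, `lin_pred`, `pi0`, `p_zero` and `mean_zip` are hypothetical helpers, not part of any package.

```r
# Coefficients copied from the summary output (reference level: Controllo)
count_coef <- c(Controllo = 0.76103, Lidar = -0.11973,
                Recupero = -0.18094, Standard = -0.19740)
zero_coef  <- c(Controllo = -0.3894, Lidar = 0.9323,
                Recupero = 1.2351, Standard = 0.3500)

# Linear predictor per treatment: intercept, then intercept plus each contrast
lin_pred <- function(b) c(b[1], b[1] + b[-1])

lambda <- exp(lin_pred(count_coef))    # Poisson mean of the count part (log link)
pi0    <- plogis(lin_pred(zero_coef))  # probability of an excess zero (logit link)

# Overall probability of observing a zero and overall expected response
p_zero   <- pi0 + (1 - pi0) * exp(-lambda)
mean_zip <- (1 - pi0) * lambda

print(round(cbind(lambda, pi0, p_zero, mean_zip), 3))
```

Note that `pi0` is the probability of belonging to the "always zero" class, not the overall probability of a zero; the latter is `p_zero`, which also counts zeros produced by the Poisson part.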
This is my interpretation:
- Count model coefficients: there are no statistically significant differences in the counts among the treatments.
- Zero-inflation model coefficients: the probability of a zero differs significantly for the treatments "Lidar" and "Recupero" relative to the reference level "Controllo" (the intercept).
Is this interpretation correct? Am I missing something about how well this model suits the type of data I have?