I have used glm() to model some data. The code looks like this:
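# Preallocate the deviance matrix: rows index ddm columns, columns index ppm columns
mdfit_dev <- matrix(NA_real_, nrow = 90, ncol = 90)
y <- cuse[[4]]  # weekly case counts
for (ddm_idx in 1:90) {
  for (ppm_idx in 1:90) {
    ddm <- cuse_ddm[[3 + ddm_idx]]
    ppm <- cuse_ppm[[3 + ppm_idx]]
    # I() is needed for squared terms; a bare x^2 inside a formula reduces to x
    mdfit <- glm(y ~ ddm + I(ddm^2) + ppm + I(ppm^2) + ddm:ppm,
                 family = poisson(link = log))
    mdfit_dev[ddm_idx, ppm_idx] <- deviance(mdfit)
  }
}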
It turns out that for each "case" I have about 90 different data points for both ddm and ppm, which is why the for loops are nested. I believe this is correct because a post-doc in statistics ran the same analysis in MATLAB and got the same results.
However, my next task is to use a zero-inflated Poisson distribution, as I have a lot of zeros in my dataset. Some of these zeros are "true" zeros and some are false.
How can I modify my code to use glm() for this distribution?
cuse[[4]] is the number of cases per week. There are 240 weeks. The case counts are reported by a person. In some weeks there were indeed 0 cases. In other weeks, the person was too lazy to count, did not show up to work, or forgot to count for that week. Those are false zeros. – masfenix Mar 24 '14 at 03:49
NA and 0 are both coded as 0. – Glen_b Mar 24 '14 at 03:58
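To make the true-versus-false zero idea concrete, here is a minimal sketch of a zero-inflated Poisson fit in base R. As far as I know, glm() has no zero-inflated family, so this sketch maximizes the mixture likelihood directly with optim(). Everything in it is an assumption for illustration: the data are simulated, and the names x, beta_true, pi_true, and zip_nll are hypothetical, not taken from the question's dataset.

```r
# Sketch: zero-inflated Poisson (ZIP) fitted by maximum likelihood in base R.
# All data below are simulated; variable names are hypothetical.
set.seed(1)
n <- 240                              # 240 weeks, as in the question
x <- rnorm(n)                         # stand-in for a single ddm or ppm covariate
beta_true <- c(0.5, 0.8)              # assumed log-link intercept and slope
pi_true <- 0.3                        # assumed probability of a false zero
lambda <- exp(beta_true[1] + beta_true[2] * x)
false_zero <- rbinom(n, 1, pi_true)   # 1 = week was not actually counted
y <- ifelse(false_zero == 1, 0L, rpois(n, lambda))

# Negative log-likelihood; par = (intercept, slope, logit of pi)
zip_nll <- function(par, y, x) {
  lambda <- exp(par[1] + par[2] * x)
  p <- plogis(par[3])
  ll <- ifelse(y == 0,
               log(p + (1 - p) * exp(-lambda)),             # false zero or true Poisson zero
               log(1 - p) + dpois(y, lambda, log = TRUE))   # positive counts
  -sum(ll)
}

start <- c(log(mean(y) + 0.1), 0, 0)
fit <- optim(start, zip_nll, y = y, x = x, method = "BFGS")

beta_hat <- fit$par[1:2]          # estimated count-model coefficients
pi_hat <- plogis(fit$par[3])      # estimated false-zero probability
neg2loglik <- 2 * fit$value       # -2 * log-likelihood at the optimum
```

Inside a double loop like the one in the question, neg2loglik could be stored in place of deviance(mdfit) to compare fits to the same response. If installing a package is acceptable, zeroinfl() in the pscl package fits the same kind of model with a formula interface, for example zeroinfl(y ~ x | 1, dist = "poisson"), where the part after | models the excess zeros.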