I would like to ask whether a Gaussian family is appropriate for my GAM analysis. My response is count data: the number of animals captured per 100 traps. It is not normally distributed and it contains some zeros. After a log transformation the distribution looks better, but it is still not normal. Because the transformed values are continuous rather than integer, I run into problems (warnings and infinite AICc values) when I fit the GAM with a Poisson family and then run model selection, like so:
> raveco1p.gam <- gam(logRAVtrapsb ~ 1 + region + logarea + Type, data = dd, family = "poisson")
There were 18 warnings (use warnings() to see them)
> summary(raveco1p.gam)
Family: Poisson
Link function: log
Formula:
logRAVtrapsb ~ 1 + region + logarea + Type
Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -3.9693     1.7281  -2.297   0.0216 *
region1       1.2937     0.4417   2.929   0.0034 **
logarea       0.4033     0.1688   2.390   0.0169 *
Type1         1.5471     1.1976   1.292   0.1964
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.445 Deviance explained = 42.5%
UBRE = -0.11976 Scale est. = 1 n = 44
> options(na.action = "na.fail")
> dredge(raveco1p.gam)
Fixed term is "(Intercept)"
Global model call: gam(formula = logRAVtrapsb ~ 1 + region + logarea + Type, family = "poisson",
data = dd)
---
Model selection table
  (Intrc)  logar regin Type df logLik AICc
1 -0.5112                    0   -Inf  Inf
2 -1.3000 0.2169             2   -Inf  Inf
3 -1.2400            +       1   -Inf  Inf
4 -1.8450 0.1977     +       3   -Inf  Inf
5  0.7358                 +  1   -Inf  Inf
6 -1.7000 0.2547          +  3   -Inf  Inf
7 -0.2002            +    +  3   -Inf  Inf
8 -3.9690 0.4033     +    +  4   -Inf  Inf
Models ranked by AICc(x)
There were 50 or more warnings (use warnings() to see the first 50)
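In case it helps, I believe the -Inf log-likelihoods can be reproduced without my data. The following is only a minimal sketch in base R, under assumptions that are mine: the counts are simulated, the names counts, log_counts, ll and fit are hypothetical, and I use log(counts + 1) and glm() as stand-ins for my actual transformation and for gam(). It shows that the Poisson density is evaluated at non-integer values, which is where I think the warnings and the infinite AICc come from.

```r
# Simulated stand-in for the real data (all names here are hypothetical).
set.seed(1)
counts <- rpois(44, lambda = 3)   # raw counts, zeros are possible
log_counts <- log(counts + 1)     # transformed response, mostly non-integer

# The Poisson probability mass function is only defined for integers.
# For non-integer values dpois() warns and returns -Inf on the log scale.
ll <- suppressWarnings(
  dpois(log_counts, lambda = mean(log_counts), log = TRUE)
)
print(any(is.infinite(ll)))

# A Poisson GLM on the transformed response still estimates coefficients,
# but its log-likelihood, and therefore its AIC, is not finite.
fit <- suppressWarnings(glm(log_counts ~ 1, family = poisson))
print(logLik(fit))
print(AIC(fit))
```

If this reflects what happens inside gam(), the coefficient estimates are still computed, but the likelihood-based ranking in dredge() cannot distinguish between the models.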
When I use a Gaussian family on the same log-transformed response, there are no warnings. Is this the correct approach?