If I were to fit an example GAM model using mgcv:
library(mgcv)
m1 <- gam(cases ~ s(salary, bs = "cs", k = 50) + s(age, bs = "cs", k = 50) +
            s(density, bs = "cs", k = 50) +
            s(longitude, latitude, bs = "ds", k = 100),
          data = example, family = poisson(), method = "REML")
And if
- summary(m1) shows
  - all smooth terms are significant
  - adjusted R-squared is 0.999
  - deviance explained is 99%
- gam.check(m1) shows
  - all k are penalized appropriately
  - the QQ plot looks to follow the line (bar a few points straying off on the tails)
  - resids vs linear predictor plot doesn't look like it's trending
  - histogram of residuals doesn't have too fat tails and follows closely to a gaussian distribution
Given all of this, why does a chi-squared test on the residual deviance suggest that the model fit is bad?
pchisq(m1$deviance, df = m1$df.residual, lower.tail = FALSE)
which gives something close to zero (e.g. 1.234e-20).
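To make the test above concrete, here is a minimal sketch on simulated data (not the data from the question). It assumes counts that are more variable than a Poisson model allows, which is only one possible reason for a tiny p-value; the names d, fit, p_dev and dispersion are made up for illustration. The Pearson dispersion estimate is included as a rough companion check, where values well above 1 hint at overdispersion.

```r
library(mgcv)

set.seed(42)
n <- 500
x <- runif(n)
mu <- exp(1 + sin(2 * pi * x))

# Assumption for illustration: negative binomial counts, i.e. the same
# mean as a Poisson model but with extra variance (overdispersion).
y <- rnbinom(n, mu = mu, size = 2)
d <- data.frame(x = x, y = y)

# Poisson GAM; the smooth can still track the mean reasonably well.
fit <- gam(y ~ s(x, bs = "cs", k = 20), data = d,
           family = poisson(), method = "REML")

# Residual deviance test, same form as in the question.
p_dev <- pchisq(deviance(fit), df = df.residual(fit), lower.tail = FALSE)

# Pearson dispersion estimate: roughly 1 if the Poisson variance holds.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

print(p_dev)
print(dispersion)
```

In this simulated setting the mean structure is well captured, yet the test still rejects because the variance assumption is off, so a small p-value here does not by itself say the smooths are wrong.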
summary(m1):
Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.697446   0.003262   698.2   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                         edf Ref.df  Chi.sq p-value
s(salary)             19.132 49.000 5793.49  <2e-16 ***
s(age)                29.933 49.000  393.71  <2e-16 ***
s(density)            36.198 49.000  354.83  <2e-16 ***
s(longitude,latitude) 88.230 96.095 2545.63  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.999   Deviance explained = 99.6%
-REML = 7402.2  Scale est. = 1         n = 2155
Furthermore, should there be a "next course of action", such as letting the effect of a covariate vary across space (e.g. ti(longitude, latitude, age))? If so, should this be done for each and every pair of covariates (salary, density, etc.)?
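For reference, a space-by-covariate interaction of the kind mentioned above could be sketched as follows. This runs on simulated data, and the data frame d, the basis sizes, and the choice of bs = c("ds", "cr") for the margins are all assumptions for illustration rather than recommendations. The d = c(2, 1) argument tells ti() that the first margin is the 2-D (longitude, latitude) smooth and the second is age. Both models are fitted with ML so that their AIC values are comparable.

```r
library(mgcv)

set.seed(1)
n <- 400
d <- data.frame(longitude = runif(n),
                latitude  = runif(n),
                age       = runif(n, 20, 70))

# Simulated truth (hypothetical): a spatial surface plus a smooth age effect.
eta <- 1 + sin(2 * pi * d$longitude) * d$latitude + 0.02 * d$age
d$cases <- rpois(n, exp(eta))

# Main effects only.
m_main <- gam(cases ~ s(age, bs = "cs", k = 10) +
                s(longitude, latitude, bs = "ds", k = 30),
              data = d, family = poisson(), method = "ML")

# Main effects plus a space-by-age interaction.
m_int <- gam(cases ~ s(age, bs = "cs", k = 10) +
               s(longitude, latitude, bs = "ds", k = 30) +
               ti(longitude, latitude, age, d = c(2, 1),
                  bs = c("ds", "cr"), k = c(15, 5)),
             data = d, family = poisson(), method = "ML")

# Compare the two fits; a lower AIC favours that model.
aic_tab <- AIC(m_main, m_int)
print(aic_tab)
```

Whether such a term is worth adding for every covariate is a modelling judgment; the comparison above only shows one way the question could be examined for a single covariate.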
summary()? – usεr11852 May 07 '21 at 08:36
summary(). I have seen that thread before; they suggest that a 1-pchisq needs to be done or, equivalently, that lower.tail=FALSE be set. Yet my p-value is still close to zero. Are you/they suggesting that pchisq(m1$deviance,df=m1$df.residual,lower.tail=F) is wrong, and that I need to get the difference between the null deviance and the model deviance? If so, where do I get the null deviance? The model m1 is the first model that I am using. – Knovolt May 07 '21 at 13:15
m1 is no better than the null model (so I interpret this as not exactly a good fit). If so, why do the bullet points outlined above suggest that the model looks good? It seems contradictory. – Knovolt May 07 '21 at 13:18
m0 <- glm(cases~1,data=cases, family=poisson). (Also, reasonably we probably want to use ML for our GAM rather than REML to make the comparison fair.) Then we just do pchisq(deviance(m0) - deviance(m1), df=m0$df.residual - m1$df.residual, lower.tail=FALSE). Notice that the latest result being "zero" is a "good thing"; the p-value is exceptionally small, so we can reject the null hypothesis that the two models are equal. – usεr11852 May 07 '21 at 22:13
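The comparison against an intercept-only model described in the last comment can be sketched as below. This is a minimal illustration on simulated data, with d, x, lr_stat, lr_df and p_null as made-up names. The chi-squared reference distribution is only approximate for penalized smooths, so the p-value should be read as a rough guide.

```r
library(mgcv)

set.seed(7)
n <- 500
d <- data.frame(x = runif(n))
d$cases <- rpois(n, exp(1 + sin(2 * pi * d$x)))

# Null model: intercept only.
m0 <- glm(cases ~ 1, data = d, family = poisson())

# GAM fitted with ML rather than REML, as suggested in the comment.
m1 <- gam(cases ~ s(x, bs = "cs", k = 20), data = d,
          family = poisson(), method = "ML")

# Drop in deviance and in residual degrees of freedom from m0 to m1.
lr_stat <- deviance(m0) - deviance(m1)
lr_df   <- df.residual(m0) - df.residual(m1)

# A small p-value means m1 fits better than the null model.
p_null <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(p_null)
```

Note that this tests a different hypothesis from pchisq(m1$deviance, df = m1$df.residual, lower.tail = FALSE): here a p-value near zero says the smooth terms improve on the null model, whereas the residual deviance test asks whether the leftover deviance is compatible with the assumed Poisson variation.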