Hierarchical gam (HGAM) with 'by' ordered factor and random smooth 'fs' effect: why so wiggly?

Question

Following the same m3 model approach proposed by Gavin Simpson on my data, I compared an ordered factor model (m3_of) to a non-ordered factor model (m3). The global shapes trend to be similar; however the m3_of plot displays a marked wigglyness for both groups.

Question 1: How to explain this difference of wigglyness between both models?

Question 2: Comparing both models using itsadug::compareML(m3, m3_of), the best model is m3 (lower AIC); however, m3_of has a slightly better deviance explained (9.67% vs 9.63%), knowing that such model advantageously allow a more interpretable comparison (groupof=1 compared to the reference groupof=0). Hence, what is the more appropriate model?

Thanks for any help, advice, reference.

Data:

n = 180,000
bmk: outcome, continuous, positive
delay: predictor, continuous, positive
group: factor (n=2)
groupof: ordered factor (n=2, groupof=0 being the reference groupof)
medu: random effect (n=91 different medical units)

Models:

m3 <- bam(bmk ~
            group +
            s(delay, k=20, by = group) +
            s(delay, k=20, medu, bs = "fs"),
          data = dat, method = 'fREML', family = inverse.gaussian(link="identity"), discrete = TRUE)
m3_of <- bam(bmk ~
               groupof +
               s(delay, k=20) +
               s(delay, k=20, by = groupof) +
               s(delay, k=20, medu, bs = "fs"),
             data = dat, method = 'fREML', family = inverse.gaussian(link="identity"), discrete = TRUE)

Plots:

par(mfrow = c(1,2), cex = 1.1)
plot_smooth(m3, view="delay", plot_all="group", rm.ranef=FALSE, n.grid = 50, col=c("blue","red"),
            xlim=c(0,90), ylim=c(11.5,14.5), main = "m3")
plot_smooth(m3_of, view="delay", plot_all="groupof", rm.ranef=FALSE, n.grid = 50, col=c("blue","red"),
            xlim=c(0,90), ylim=c(11.5,14.5), main = "m3_of")

gratia::appraise(m3)

Summary(m3):

> summary(m3)
Family: inverse.gaussian 
Link function: identity
Formula:
bmk ~ group + s(delay, k = 20, by = group) + s(delay, k = 20, 
    medu, bs = "fs")
Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)

(Intercept) 11.76764    0.19840   59.31   <2e-16 ***
group1       0.32901    0.02879   11.43   <2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
                    edf   Ref.df      F  p-value

s(delay):group0   1.001    1.002  2.903   0.0882 .

s(delay):group1   1.002    1.003 17.792 2.46e-05 ***
s(delay,medu)   145.751 1626.000 10.319  < 2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) =  0.0792   Deviance explained = 9.63%
fREML = -1.8252e+05  Scale est. = 0.0089033  n = 179659
> gam.check(m3)
                  k'  edf k-index p-value
s(delay):group0   19    1    0.98    0.54
s(delay):group1   19    1    0.98    0.48
s(delay,medu)   1700  146    0.98    0.48

Summary(m3_of):

> summary(m3_of)
Family: inverse.gaussian 
Link function: identity
Formula:
bmk ~ groupof + s(delay, k = 20) + s(delay, k = 20, by = groupof) + 
    s(delay, k = 20, medu, bs = "fs")
Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)

(Intercept) 11.70146    0.14033   83.39   <2e-16 ***
groupof1     0.33219    0.02974   11.17   <2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
                      edf   Ref.df      F p-value

s(delay)            9.370   11.457  2.188 0.01100 *

s(delay):groupof1   1.004    1.007 10.200 0.00134 ** 
s(delay,medu)     176.807 1626.000 10.313 < 2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) =  0.0792   Deviance explained = 9.67%
fREML = -1.8246e+05  Scale est. = 0.0089023  n = 179659
> gam.check(m3_of)
                       k'     edf k-index p-value
s(delay)            19.00    9.37    0.98    0.69
s(delay):groupof1   19.00    1.00    0.98    0.68
s(delay,medu)     1700.00  176.81    0.98    0.68

Compare models:

itsadug::compareML(m3, m3_of)
Model m3 preferred: lower fREML score (56.679), and equal df (0.000).
-----
  Model     Score Edf Difference    Df
1 m3_of -182464.7   9
2    m3 -182521.3   9     56.679 0.000
AIC difference: -60.87, model m3 has lower AIC.

I'm not very familiar with this type of model, but (i) why bother treating group as ordinal when there are only 2 levels?, and (ii) why is this coded with an additional smooth term for delay in the second model? — mkt, May 12 '23 at 09:56
@mkt, (i) as well explained by van Rij link, for m3 "we can NOT conclude that the lines are different from each other. This is only possible when we would use difference smooths with ordered factors or binomial predictors. With the ordered factors the smooth term s(delay):groupof1 represents the difference between the groupof1 and the reference groupof0. When the smooth term is significant, the difference smooth is significantly different from zero. So that means that the two groups are different from each other." — denis, May 12 '23 at 13:27
@mkt, (ii) as explained by Gavin Simpson link, "with the ordered factor approach, the second term s(delay, k=20) is a smooth that represent the reference level (groupof0). The third term s(delay, k=20, by = groupof) has a separate difference smooth for the other level, i.e., it represents the smooth difference between the smooth for groupof1 and that of groupof0." — denis, May 12 '23 at 13:28

score 1 · Accepted Answer · answered May 13 '23 at 08:43

By removing 'k=20', the wiglyness disappears. I deduce (without being able to demonstrate it) that the ordered factor-based model as proposed here should not have too many degrees of freedom to be comparable to the corresponding model without ordered factor. Note that using itsadug::compareML, the non-ordered factor-based model (m4) has still the lower AIC.

Models:

m4 <- bam(bmk ~
            group +
            s(delay, by = group) +
            s(delay, medu, bs = "fs"),
          data = dat, method = 'fREML', family = inverse.gaussian(link="identity"), discrete = TRUE)
m4_of <- bam(bmk ~
               groupof +
               s(delay) +
               s(delay, by = groupof) +
               s(delay, medu, bs = "fs"),
             data = dat, method = 'fREML', family = inverse.gaussian(link="identity"), discrete = TRUE)

Summary(m4): without 'k=20'

> summary(m4)
Family: inverse.gaussian 
Link function: identity
Formula:
bmk ~ group + s(delay, by = group) + s(delay, medu, bs = "fs")
Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)

(Intercept) 11.77303    0.20189   58.31   <2e-16 ***
group1       0.32838    0.02879   11.41   <2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
                    edf  Ref.df      F  p-value

s(delay):group0   1.002   1.004  3.777   0.0517 .

s(delay):group1   1.003   1.006 20.979 4.48e-06 ***
s(delay,medu)   143.147 849.000 19.762  < 2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) =  0.0792   Deviance explained = 9.63%
fREML = -1.8252e+05  Scale est. = 0.0089037  n = 179659
> gam.check(m4)
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
                 k' edf k-index p-value
s(delay):group0   9   1    0.98    0.60
s(delay):group1   9   1    0.98    0.64
s(delay,medu)   850 143    0.98    0.64

Summary(m4_of): without 'k=20'

> summary(m4_of)
Family: inverse.gaussian 
Link function: identity
Formula:
bmk ~ groupof + s(delay) + s(delay, by = groupof) + s(delay, 
    medu, bs = "fs")
Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)

(Intercept) 11.70962    0.16160   72.46   <2e-16 ***
groupof1     0.33310    0.02948   11.30   <2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
                      edf  Ref.df      F p-value

s(delay)            1.004   1.007  0.004 0.97162

s(delay):groupof1   1.003   1.006 10.304 0.00129 ** 
s(delay,medu)     167.479 848.000 19.769 < 2e-16 ***

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) =  0.0792   Deviance explained = 9.65%
fREML = -1.8249e+05  Scale est. = 0.008902  n = 179659
> gam.check(m4_of)
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
                   k' edf k-index p-value
s(delay)            9   1    0.97    0.34
s(delay):groupof1   9   1    0.97    0.32
s(delay,medu)     850 167    0.97    0.38

Compare models:

> itsadug::compareML(m4, m4_of)
Model m4 preferred: lower fREML score (35.737), and equal df (0.000).
-----
  Model     Score Edf Difference    Df
1 m4_of -182486.6   9                 
2    m4 -182522.4   9     35.737 0.000
AIC difference: -13.44, model m4 has lower AIC.

Plots:

Hierarchical gam (HGAM) with 'by' ordered factor and random smooth 'fs' effect: why so wiggly?

1 Answers1