
I have read the page "Equivalence of AIC and p-values in model selection" but am a little confused, and I think a real example might help solidify the idea in my mind regarding how to use the AIC in model selection.

Say I have two nested models that differ by 6 parameters; these are fit to real data. The simpler model is mod1 and the more complex one is mod2.

> AIC(mod1)
[1] 191.2335
> AIC(mod2)
[1] 190.5257

> BIC(mod1)
[1] 206.9418
> BIC(mod2)
[1] 225.084

> lmtest::lrtest(mod1, mod2)
Likelihood ratio test

Model 1: y ~ x1 + x2 + x3 + x4
Model 2: y ~ x1(NLR) + rcs(x2) + x3 + x4
  #Df  LogLik Df  Chisq Pr(>Chisq)
1   5 -90.617
2  11 -84.263  6 12.708    0.04792 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value from the LR test would suggest that model fit is compromised under the simpler model (and thus that mod2 is better).

The AIC is lower for mod2, but not by a margin that I think suggests it is the optimal fit. What I want to know is: how much lower (on 6 df) does the AIC need to be for mod2 (the more complex model here) to justify the extra parameters? Is it 6 × 2 = 12 units?

Clearly the BIC would suggest mod1.

Richard Hardy
  • 67,272
LucaS
  • 619

2 Answers


AIC takes model complexity into account by construction. You do not need to adjust for that any further. If you like adjusting for model complexity manually, you can compare likelihoods and adjust them. But then you end up with... AIC (or BIC, or some other IC)! So there is really no point in doing that by hand if you have AIC available.

If the AIC of model 2 is lower than the AIC of model 1 (AIC2 < AIC1), then the estimated expected likelihood of a new observation from the same data generating process is higher in model 2. If that is what you are looking for, select model 2. If you are looking for something else, perhaps AIC is not the criterion you want to use.

(You may also consider model averaging, especially if the differences in AICs between the models are not too large.)

Richard Hardy
  • 67,272
  • So it is truly nothing more than picking the model with the lowest AIC (if that's what you want to do)? So if you have a model with an extra parameter whose AIC is lower by only a fraction, that is still a better fit based on AIC? I read somewhere that the AIC for a more complex model with 1 extra df should be lower by 2 (and that is equivalent to p = 0.15). Clearly I have more work to do to understand this. – LucaS Dec 01 '22 at 06:33
  • @LucaS, AIC tells you which model is best in a particular sense. Would you rather select another model that is worse in that specific sense? I guess not. On the other hand, you can consider model averaging, especially if the differences in AICs between the models are small. AIC does not tell you that you cannot do that. It just serves its purpose, and it is up to you whether you find that relevant. Regarding "I read somewhere that AIC for a more complex model with 1 extra df should be lower by 2 (and that is equivalent to p = 0.15)": it is the log-likelihood, not AIC. – Richard Hardy Dec 01 '22 at 07:10

How big a difference in AIC you need in order to say "mod2 is better than mod1" will depend a bit on your field, your data, your sample sizes, etc. In ecology, two models are generally considered approximately equal in performance if their AIC values are within 2 units of each other. (A small check with the values from the question follows below.)

As you pointed out, BIC favours mod1 over mod2 because BIC more strongly "penalizes" having additional variables/predictors, and mod1 has fewer predictors. This results in a more conservative approach. Choosing the appropriate criterion will depend on your data and your question, but this paper may be helpful: Model selection for ecologists: the worldviews of AIC and BIC.

User1865345
  • 8,202
SageR
  • 25