
To analyze my ordinal data (Likert-scale ratings) I used the clmm function from the ordinal package. Since I have fixed and random effects, I am doing backward model selection on nested models using the anova function. When the Chisq test shows a significant difference, I keep the corresponding model. However, my understanding was that when the difference is not significant I should select the model with the lowest AIC value, but I have also heard to select the model with the lowest logLik value. In my current model comparison those values don't go hand in hand. Can someone tell me which selection criterion to follow in those cases, and whether selecting the model with the lowest AIC is correct?

```
> anova(ord.model1, ord.model2)

Likelihood ratio tests of cumulative link models:

           formula:                                                                            link:  threshold:
ord.model2 rating ~ factor1 + (factor1 + factor2 | item) + (factor1 + factor2 | subject)           logit  flexible
ord.model1 rating ~ factor1 + factor2 + (factor1 + factor2 | item) + (factor1 + factor2 | subject) logit  flexible

           no.par    AIC  logLik LR.stat df Pr(>Chisq)
ord.model2     19 2247.5 -1104.8
ord.model1     20 2249.0 -1104.5  0.5187  1     0.4714
```

acr

1 Answer


If the models are nested, then the likelihood (and hence the log-likelihood) of the model with more parameters is always at least as high, because the extra parameters can only improve the fit. So comparing raw likelihoods, whichever is higher or lower, says very little about which model is better.
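A quick numerical illustration of that point (a Python sketch, since the thread's R models aren't reproducible here; the data are simulated): for Gaussian errors, maximizing the likelihood is equivalent to minimizing the residual sum of squares, and adding a regressor, even a completely irrelevant one, can never increase the RSS of the least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
z = rng.normal(size=n)            # irrelevant regressor, unrelated to y
y = 1.5 * x + rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares of the least-squares fit.

    For Gaussian errors, -logLik is a monotone function of the RSS,
    so a lower RSS means a higher (log-)likelihood.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

X_small = np.column_stack([np.ones(n), x])      # nested model
X_big = np.column_stack([np.ones(n), x, z])     # same model + one extra parameter

# The bigger model always fits at least as well, whether or not z matters:
print(rss(X_small, y) >= rss(X_big, y))  # True
```

This is exactly why the larger model's logLik in the anova output (-1104.5 vs -1104.8) is slightly better: it has to be, so the improvement alone tells you nothing.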

The AIC balances the likelihood against the number of parameters:

$$AIC = 2k - 2\,\text{logLik}$$

where $k$ is the number of model parameters and $\text{logLik}$ is the maximized log-likelihood.
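As a quick arithmetic check (a small Python sketch; the numbers are taken from the anova output in the question), the formula reproduces the AIC column from the two logLik values:

```python
def aic(k, loglik):
    """Akaike information criterion: 2k - 2*logLik."""
    return 2 * k - 2 * loglik

# Values from the anova() output in the question
aic_m2 = aic(19, -1104.8)  # ord.model2
aic_m1 = aic(20, -1104.5)  # ord.model1

print(round(aic_m2, 1))  # 2247.6 (matches the table up to rounding)
print(round(aic_m1, 1))  # 2249.0
```

Note how the penalty term works here: ord.model1 gains 0.3 in logLik but pays 2 for its extra parameter, so its AIC ends up higher, which is why the logLik ranking and the AIC ranking disagree.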

The AIC is an estimate of the information loss (the KL divergence between the model and the true data-generating process). Using it for model comparison makes sense when you wish to select the model with the least estimated information loss, i.e. the one with the lowest AIC.