I think this answer comes a bit late and is not complete, but my understanding of the subject is that an information criterion, in its general form, is a bias-corrected log-likelihood function. That is, when we do modeling in many contexts, the notion of maximum likelihood comes into play: we prefer to select models that have a high (log-)likelihood value with respect to the observed data.
However, because we use the observed data both for fitting the model and for evaluating its performance via the likelihood value, that likelihood value contains an (optimistic) bias. In a machine learning context, this bias is why, for example, we do not use the same data for training a model and for testing its (out-of-sample) performance, and instead split our data into training and test sets.
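As a rough illustration of this optimism, here is a minimal simulation sketch of my own (assuming a simple Gaussian model fitted by maximum likelihood; not taken from any particular reference):

```python
# Fit a Gaussian by maximum likelihood on a training sample, then compare the
# average in-sample log-likelihood with the average log-likelihood on fresh
# test data drawn from the same distribution. The in-sample value is
# systematically (optimistically) higher.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, n_reps = 50, 2000
gap = []
for _ in range(n_reps):
    train = rng.normal(loc=1.0, scale=2.0, size=n)
    test = rng.normal(loc=1.0, scale=2.0, size=n)
    # MLE of a Gaussian: sample mean and (biased, ddof=0) sample standard deviation
    mu_hat, sigma_hat = train.mean(), train.std()
    ll_train = norm.logpdf(train, mu_hat, sigma_hat).mean()
    ll_test = norm.logpdf(test, mu_hat, sigma_hat).mean()
    gap.append(ll_train - ll_test)

print(f"mean optimism (train - test log-likelihood per observation): {np.mean(gap):.4f}")
# Positive on average, roughly k/n = 2/50 = 0.04 here,
# i.e. an AIC-style bias correction per observation.
```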
So the (log-)likelihood value requires a bias correction in order not to yield an optimistically biased estimate. In its general form, an information criterion is then:
$$\text{information criterion} = \text{log-likelihood value under the approximation model} - \text{bias value}$$
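For concreteness, the best-known special case is AIC (I only state the standard formula here, without deriving it): if the approximation model has $k$ free parameters and is assumed to contain the true model, the bias term is estimated as $k$, which on the conventional $-2\times$ scale gives

$$\text{AIC} = -2\log L(\hat\theta) + 2k,$$

where $\hat\theta$ is the maximum-likelihood estimate.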
Takeuchi's information criterion (TIC) is, to my understanding, a general asymptotic form of the information criterion that makes no assumptions about the approximation model (in particular, it does not assume that the true model is contained in our selected hypothesis space), and AIC, for example, is a special case of TIC.
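For reference, the usual asymptotic statement of TIC (as I understand it; the post linked below works through a numerical example) is

$$\text{TIC} = -2\log L(\hat\theta) + 2\,\operatorname{tr}\!\left(\hat{J}^{-1}\hat{I}\right),$$

where $\hat{J}$ is the negative Hessian of the log-likelihood at $\hat\theta$ and $\hat{I}$ is the outer-product (score covariance) estimate. When the model is correctly specified, $I = J$, the trace equals the number of parameters $k$, and TIC reduces to the AIC formula above.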
I previously wrote an example about TIC (and AIC) in this post, if it's any help to you:
Where can I find examples of Takeuchi Information Criterion (TIC) at work?