So, let me start by stating that I have already read both Why KL-Divergence uses "ln" in its formula? and What is the role of the logarithm in Shannon's entropy? ... However, I still have no intuition behind the use of $\log$ in the formula.
In particular, we can write the KL divergence as the following formula:
$$ D_{KL}(P\mid\mid Q) = \int_{\cal{X}} P(x) \cdot [\log(P(x)) - \log(Q(x))] \, dx $$
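(To make this concrete for myself, here is a tiny discrete sketch in NumPy; the PMFs `p` and `q` are just made-up examples, and the sum plays the role of the integral above.)

```python
import numpy as np

# Two made-up discrete distributions on the same 4-point support
p = np.array([0.5, 0.3, 0.15, 0.05])
q = np.array([0.25, 0.25, 0.25, 0.25])

# D_KL(P || Q) = sum_x P(x) * (log P(x) - log Q(x))
kl_pq = np.sum(p * (np.log(p) - np.log(q)))
print(kl_pq)  # ~0.244 nats; always >= 0
```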
Intuitively, at least to me, this says something like:
> for every point in the support of the distributions, take the PDF of both of them, check the difference, and weight it based on the "probability" that you observe that element
This has a clear intuition: you are not just considering how "far" apart the two distributions are, but you are weighting that distance by the likelihood under the first distribution (in the sense that, if you are approximating $P$ with $Q$ and something is unlikely to be observed under $P$, then it does not matter much that $Q$ has a very large error at that point, because since you are unlikely to observe that point, you are unlikely to see that error).
However, this "phrasing" of the KL distance, should corresponds to: $$ D(P\mid\mid Q) = \int_{\cal{X}} P(x) \cdot [P(x) - Q(X)] $$ And to have always positive distances, then we can use the abs value: $$ D(P\mid\mid Q) = \int_{\cal{X}} P(x) \cdot |P(x) - Q(X)| $$
Now my question is either: how should I interpret the KL divergence so as to incorporate the $\log$, or why does the last distance (the one using the absolute value) make no sense?