
The KL divergence between two normal distributions is:

$KL(N(\mu_1,\sigma_1) || N(\mu_2, \sigma_2)) = \log \frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2} - \frac{1}{2}$
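To convince myself this closed form is what I think it is, I checked it numerically (a quick NumPy/SciPy sketch of my own, not from any article; it compares the formula against a Monte Carlo estimate of $E_{x \sim N(\mu_1,\sigma_1)}[\log p_1(x) - \log p_2(x)]$):

```python
import numpy as np
from scipy.stats import norm

def kl_gauss(mu1, sigma1, mu2, sigma2):
    # Closed-form KL(N(mu1, sigma1) || N(mu2, sigma2)) from the formula above
    return np.log(sigma2 / sigma1) + (sigma1**2 + (mu1 - mu2)**2) / (2 * sigma2**2) - 0.5

mu1, sigma1, mu2, sigma2 = 0.3, 0.8, 0.0, 1.0

# Monte Carlo estimate: average of log p1(x) - log p2(x) over samples x ~ N(mu1, sigma1)
rng = np.random.default_rng(0)
x = rng.normal(mu1, sigma1, size=1_000_000)
mc = np.mean(norm.logpdf(x, mu1, sigma1) - norm.logpdf(x, mu2, sigma2))

print(kl_gauss(mu1, sigma1, mu2, sigma2))  # closed form
print(mc)                                  # agrees up to Monte Carlo noise
```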

I am reading about VAE neural networks, which use the KL divergence against $N(0, 1)$. However, in the articles the equation appears in the form:

$0.5 \cdot \sum \left(1 + \log(\sigma^2) - \sigma^2 - \mu^2\right)$
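As far as I can tell, this is the term that shows up in typical VAE code; here is a minimal NumPy sketch of my own (the names `mu` and `logvar` and the log-variance parameterization are my assumptions, not taken from a specific article):

```python
import numpy as np

def kl_term(mu, logvar):
    # 0.5 * sum(1 + log(sigma^2) - sigma^2 - mu^2), summed over latent dimensions,
    # with the variance carried as logvar = log(sigma^2)
    return 0.5 * np.sum(1 + logvar - np.exp(logvar) - mu**2)

mu = np.array([0.3, -0.1])
logvar = np.log(np.array([0.8, 1.2]) ** 2)
print(kl_term(mu, logvar))
```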

How is this second equation derived from the "original" one?

I came up with this:

$\mu_2 = 0$

$\sigma_2 = 1$

which gives

$\log \frac{1}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - 0)^2}{2} - \frac{1}{2}$ =

$\log (1) - \log (\sigma_1) + \frac{\sigma_1^2 + \mu_1^2}{2} - \frac{1}{2}$ =

$-2\log (\sigma_1) + \sigma_1^2 + \mu_1^2 - 1$ =

$-\log (\sigma_1^2) + \sigma_1^2 + \mu_1^2 - 1$ =

$1 + \log (\sigma_1^2) - \sigma_1^2 - \mu_1^2$

but the 0.5 before the $\sum$ is a mystery to me.
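For concreteness, this is how I convinced myself that the factor is the only remaining discrepancy (a quick NumPy check of my own with arbitrary values for $\mu_1$ and $\sigma_1$, and a single latent dimension so the $\sum$ has one term):

```python
import numpy as np

mu1, sigma1 = 0.3, 0.8

# my final expression
mine = 1 + np.log(sigma1**2) - sigma1**2 - mu1**2

# the expression from the articles, for one dimension
article = 0.5 * (1 + np.log(sigma1**2) - sigma1**2 - mu1**2)

print(mine, article, mine / article)  # the ratio comes out as exactly 2
```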
