I'm reading about Kingma's reparameterization trick (section 2.4.4), which replaces the random variable $z$ with another random variable $\epsilon$, and I don't understand the calculation of the density function $q(z)$.

In the simplest univariate Gaussian case, $q(z) = N(z; \mu, \sigma^2)$ is a Gaussian density, and we wish to rewrite it as a function of $\epsilon \sim N(0, 1)$, which is a standard Gaussian. So we write $z=\mu + \sigma\epsilon$, and now they claim that we can write $q(z) = \frac{N(0,1)}{\sigma}$. Why do we divide by $\sigma$? How does it come out that $q$ is a valid PDF that integrates to 1?

ihadanny
  • See https://stats.stackexchange.com/search?q=pdf+transform+score%3A5+is%3Aanswer. Unfortunately, your notation is both vague and incorrect: $q$ is not a standard Normal divided by $\sigma.$ Its density function is related to a Normal density function divided by $\sigma,$ though. Would this be what you are trying to write? If so, see https://stats.stackexchange.com/a/49794/919 for an intuitive explanation of why a factor of $1/\sigma$ appears. For information on Jacobians generally, see https://stats.stackexchange.com/search?tab=votes&q=jacobian. – whuber Sep 30 '22 at 13:54
  • Thanks! What do you mean by "$q$ is not a standard Normal divided by $\sigma$; its density function is related to a Normal density"? $q$ is a density function! $z$ is a continuous random variable that is characterized by a density function, so that's how we write it, no? – ihadanny Oct 01 '22 at 08:31
  • @whuber - can you please take a look at my answer and comment? – ihadanny Oct 01 '22 at 10:10

1 Answer

OK, thanks to @whuber's excellent answer, I realized that what I was actually asking is why, for a general normal distribution, a variable $z$ with mean $\mu$ and variance $\sigma^2$ has the PDF $\frac{1}{\sigma}f(\frac{z-\mu}{\sigma})$, where $f$ is the PDF of the standard normal distribution.

Well, it follows from how the CDF transforms under a change of location and scale. Writing $z=\mu+\sigma\epsilon$ with $\sigma>0$ and $\epsilon$ standard normal, we get $CDF(z)=P(\mu+\sigma\epsilon\le z)=P(\epsilon\le\frac{z-\mu}{\sigma})=\Phi(\frac{z-\mu}{\sigma})$, where $\Phi$ is the CDF of the standard normal distribution. The PDF is the derivative of the CDF, so by the chain rule:

$$PDF(z)=\frac{d}{dz}\Phi(\frac{z-\mu}{\sigma})=\frac{1}{\sigma}f(\frac{z-\mu}{\sigma})$$
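A quick numerical sanity check of the $\frac{1}{\sigma}$ factor, as a minimal sketch using only the Python standard library. The values of `mu` and `sigma` and the helper names (`std_normal_pdf`, `normal_pdf`, `integrate`) are illustrative assumptions, not anything from the paper. It integrates the density with and without the factor: with it the area should be about 1, without it the area should be about $\sigma$.

```python
import math

def std_normal_pdf(x):
    """f(x): density of the standard normal N(0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_pdf(z, mu, sigma):
    """(1/sigma) * f((z - mu) / sigma): density of N(mu, sigma^2)."""
    return std_normal_pdf((z - mu) / sigma) / sigma

def integrate(fn, lo, hi, n=20000):
    """Midpoint Riemann sum of fn over [lo, hi] with n equal slices."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(n))

mu, sigma = 1.5, 3.0  # illustrative values (assumption)
lo, hi = mu - 10 * sigma, mu + 10 * sigma  # tails beyond this are negligible

# With the 1/sigma factor, the density integrates to (approximately) 1.
with_factor = integrate(lambda z: normal_pdf(z, mu, sigma), lo, hi)

# Without it, stretching the z-axis by sigma inflates the area to ~sigma.
without_factor = integrate(lambda z: std_normal_pdf((z - mu) / sigma), lo, hi)

print(with_factor, without_factor)
```

Intuitively, stretching the horizontal axis by $\sigma$ stretches the area under the curve by $\sigma$ too, so the height has to shrink by the same factor to keep the total at 1.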


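More generally, the reparameterization trick relies on $z=g(\epsilon)$ being an invertible function; assume here that $g$ is also differentiable and increasing (for a decreasing $g$, the derivative below is replaced by its absolute value). In our case $g(\epsilon)=\mu+\sigma\epsilon$, so $\epsilon=g^{-1}(z)=\frac{z-\mu}{\sigma}$. Taking $CDF$ on the right-hand side to be the CDF of $\epsilon$ and differentiating:

$$PDF(z)=\frac{d}{dz}CDF(g^{-1}(z))=PDF(\epsilon)\frac{d}{dz}g^{-1}(z), \quad \epsilon=g^{-1}(z)$$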

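In the paper, they express this as $PDF(z)=PDF(\epsilon)\frac{1}{\frac{dz}{d\epsilon}}$, where $\frac{dz}{d\epsilon}=g'(\epsilon)$. I think the two forms are equal by the inverse function rule, $\frac{d}{dz}g^{-1}(z)=\frac{1}{g'(g^{-1}(z))}$. For $g(\epsilon)=\mu+\sigma\epsilon$ this factor is exactly $\frac{1}{\sigma}$.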
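To convince myself that the general formula is not specific to the linear case, here is a small sketch with a different invertible transform. The choice $g(\epsilon)=e^{\epsilon}$ and all function names are my own assumptions for illustration, not something taken from the paper. It compares $PDF(\epsilon)/\frac{dz}{d\epsilon}$ against a finite-difference derivative of the CDF of $z$.

```python
import math

def std_normal_pdf(eps):
    """Density of eps ~ N(0, 1)."""
    return math.exp(-0.5 * eps * eps) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(eps):
    """CDF of eps ~ N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(eps / math.sqrt(2.0)))

# Hypothetical example transform: g(eps) = exp(eps), increasing and invertible.
def g_inv(z):
    return math.log(z)

def dg_deps(eps):
    """dz/d(eps) for g(eps) = exp(eps)."""
    return math.exp(eps)

def pdf_z(z):
    """Change of variables: PDF(z) = PDF(eps) / |dz/d(eps)| at eps = g^{-1}(z)."""
    eps = g_inv(z)
    return std_normal_pdf(eps) / abs(dg_deps(eps))

def cdf_z(z):
    """CDF(z) = P(g(eps) <= z) = CDF_eps(g^{-1}(z)) because g is increasing."""
    return std_normal_cdf(g_inv(z))

def numeric_derivative(fn, x, h=1e-5):
    """Central finite difference approximation of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

z0 = 2.0  # arbitrary evaluation point (assumption)
formula = pdf_z(z0)
finite_diff = numeric_derivative(cdf_z, z0)
print(formula, finite_diff)
```

The two printed numbers should agree to many decimal places. With this particular $g$, $z$ is log-normal, so `pdf_z` can also be compared against the standard log-normal density.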

ihadanny