
I am looking for a way to simulate ground-truth conditional entropy. Say $\mathbf{X}$ is a 3-dimensional multivariate Gaussian random vector. I am interested in computing the ground-truth conditional entropy $H(X_1 | X_2, X_3)$.

I know that there is an analytical solution for the entropy $H(\mathbf{X})$, which is a function of the number of dimensions and the covariance matrix (e.g. see this answer). I am wondering: is there an analytical solution for the conditional entropy of Gaussian random variables?

ajl123

1 Answer


To derive the conditional entropy formula in this case, you can take advantage of the properties of the multivariate normal distribution. In particular, it is well-known that the distribution preserves its form under conditioning, so the conditional distribution of any set of elements given any other set of elements still has a multivariate (or univariate) normal distribution. To derive the relevant formula, let's start by looking at the entropy in the marginal case. The entropy for the marginal distribution of $X \sim \text{N}(\mu, \sigma^2)$ is:

$$\begin{align} H(X) &= - \int \limits_{-\infty}^\infty f(x) \log f(x) \ dx \\[6pt] &= - \int \limits_{-\infty}^\infty f(x) \Bigg( -\frac{1}{2} \bigg( \frac{x-\mu}{\sigma} \bigg)^2 - \log(\sqrt{2 \pi} \sigma) \Bigg) \ dx \\[6pt] &= \frac{1}{2} \int \limits_{-\infty}^\infty \bigg( \frac{x-\mu}{\sigma} \bigg)^2 f(x) \ dx + \log(\sqrt{2 \pi} \sigma) \int \limits_{-\infty}^\infty f(x) \ dx \\[6pt] &= \frac{1}{2} \times 1 + \log(\sqrt{2 \pi} \sigma) \times 1 \\[12pt] &= \frac{1}{2} + \log(\sqrt{2 \pi} \sigma) \\[12pt] &= \frac{1}{2} \bigg[ 1 + \log(2 \pi \sigma^2) \bigg]. \\[6pt] \end{align}$$
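As a quick numerical sanity check on this formula (a sketch assuming NumPy and SciPy are available; the parameter values are arbitrary):

```python
import numpy as np
from scipy.stats import norm

# Differential entropy of N(mu, sigma^2): H = 0.5 * (1 + log(2*pi*sigma^2))
mu, sigma = 1.5, 2.0
analytic = 0.5 * (1.0 + np.log(2.0 * np.pi * sigma**2))

# scipy computes the same quantity directly
assert np.isclose(norm(loc=mu, scale=sigma).entropy(), analytic)

# The mean does not matter: shifting the location leaves the entropy unchanged
assert np.isclose(norm(loc=-7.0, scale=sigma).entropy(), analytic)
```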

Observe that this quantity depends only on the variance of the normal distribution, not its mean. Now, consider a vector $(X_1,X_2,X_3)$ following a multivariate normal distribution with variance matrix:

$$\mathbb{V}(X_1,X_2,X_3) = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{12} & \sigma_2^2 & \sigma_{23} \\ \sigma_{13} & \sigma_{23} & \sigma_3^2 \\ \end{bmatrix}.$$

It is well-known that the conditional distribution of any element of the vector, given the others, is normally distributed (see e.g., this related answer), so the conditional entropy uses the same formula as above, but with the conditional variance in place of the marginal variance. The variance of the first element given the other two is:

$$\begin{align} \mathbb{V}(X_1|X_2,X_3) &= \sigma_1^2 - \begin{bmatrix} \sigma_{12} & \sigma_{13} \end{bmatrix} \begin{bmatrix} \sigma_2^2 & \sigma_{23} \\ \sigma_{23} & \sigma_3^2 \end{bmatrix}^{-1} \begin{bmatrix} \sigma_{12} \\ \sigma_{13} \end{bmatrix} \\[6pt] &= \sigma_1^2 - \frac{1}{\sigma_2^2 \sigma_3^2 - \sigma_{23}^2} \begin{bmatrix} \sigma_{12} & \sigma_{13} \end{bmatrix} \begin{bmatrix} \sigma_3^2 & -\sigma_{23} \\ -\sigma_{23} & \sigma_2^2 \end{bmatrix} \begin{bmatrix} \sigma_{12} \\ \sigma_{13} \end{bmatrix} \\[6pt] &= \sigma_1^2 - \frac{1}{\sigma_2^2 \sigma_3^2 - \sigma_{23}^2} \Bigg[ \sigma_{12}^2 \sigma_3^2 + \sigma_{13}^2 \sigma_2^2 - 2 \sigma_{12} \sigma_{13} \sigma_{23} \Bigg] \\[6pt] &= \frac{1}{\sigma_2^2 \sigma_3^2 - \sigma_{23}^2} \Bigg[ \sigma_1^2 (\sigma_2^2 \sigma_3^2 - \sigma_{23}^2) - \sigma_{12}^2 \sigma_3^2 - \sigma_{13}^2 \sigma_2^2 + 2 \sigma_{12} \sigma_{13} \sigma_{23} \Bigg] \\[6pt] &= \frac{\sigma_1^2 \sigma_2^2 \sigma_3^2 + 2 \sigma_{12} \sigma_{13} \sigma_{23} - \sigma_1^2 \sigma_{23}^2 - \sigma_{12}^2 \sigma_3^2 - \sigma_{13}^2 \sigma_2^2}{\sigma_2^2 \sigma_3^2 - \sigma_{23}^2}. \\[6pt] \end{align}$$
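This Schur-complement computation is easy to carry out numerically. A minimal sketch, assuming NumPy and an illustrative (positive-definite) covariance matrix:

```python
import numpy as np

# Example covariance matrix for (X1, X2, X3); the values are illustrative
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 3.0, 0.5],
                  [0.8, 0.5, 2.0]])

# Conditional variance of X1 given (X2, X3): the Schur complement
# sigma_1^2 - b^T A^{-1} b, with b = Cov(X1, (X2, X3)) and A = Var(X2, X3)
b = Sigma[0, 1:]
A = Sigma[1:, 1:]
cond_var = Sigma[0, 0] - b @ np.linalg.solve(A, b)

# Equivalent closed form: the ratio of determinants det(Sigma) / det(A)
assert np.isclose(cond_var, np.linalg.det(Sigma) / np.linalg.det(A))
```

Using `np.linalg.solve` rather than forming the explicit inverse is the standard, numerically safer way to evaluate the quadratic form.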

Substituting this new variance expression into the original entropy formula you get the conditional entropy formula:

$$\begin{align} H(X_1|X_2,X_3) &= \frac{1}{2} \bigg[ 1 + \log(2 \pi \mathbb{V}(X_1|X_2,X_3)) \bigg] \\[6pt] &= \frac{1}{2} \bigg[ 1 + \log \bigg(2 \pi \cdot \frac{\sigma_1^2 \sigma_2^2 \sigma_3^2 + 2 \sigma_{12} \sigma_{13} \sigma_{23} - \sigma_1^2 \sigma_{23}^2 - \sigma_{12}^2 \sigma_3^2 - \sigma_{13}^2 \sigma_2^2}{\sigma_2^2 \sigma_3^2 - \sigma_{23}^2} \bigg) \bigg] \\[6pt] &= \frac{1}{2} \bigg[ 1 + \log(2 \pi) + \log (\sigma_1^2 \sigma_2^2 \sigma_3^2 + 2 \sigma_{12} \sigma_{13} \sigma_{23} - \sigma_1^2 \sigma_{23}^2 - \sigma_{12}^2 \sigma_3^2 - \sigma_{13}^2 \sigma_2^2) \\[6pt] &\quad \quad \quad \ \ - \log(\sigma_2^2 \sigma_3^2 - \sigma_{23}^2) \bigg]. \\[6pt] \end{align}$$

Observe that the numerator inside the logarithm is the determinant of the full variance matrix $\mathbb{V}(X_1,X_2,X_3)$ and the denominator is the determinant of $\mathbb{V}(X_2,X_3)$.
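You can cross-check this formula against the chain rule $H(X_1|X_2,X_3) = H(X_1,X_2,X_3) - H(X_2,X_3)$, using the standard entropy $\tfrac{1}{2}\log\big((2\pi e)^k \det \Sigma\big)$ for a $k$-variate Gaussian. A sketch with NumPy and an illustrative covariance matrix:

```python
import numpy as np

# Illustrative covariance matrix for (X1, X2, X3)
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 3.0, 0.5],
                  [0.8, 0.5, 2.0]])

# Conditional variance via the Schur complement
A = Sigma[1:, 1:]
b = Sigma[0, 1:]
cond_var = Sigma[0, 0] - b @ np.linalg.solve(A, b)

# Conditional entropy: the univariate formula with the conditional variance
H_cond = 0.5 * (1.0 + np.log(2.0 * np.pi * cond_var))

# Cross-check via the chain rule H(X1|X2,X3) = H(X1,X2,X3) - H(X2,X3),
# using H = 0.5 * log((2*pi*e)^k * det(Sigma)) for a k-variate Gaussian
def mvn_entropy(S):
    k = S.shape[0]
    return 0.5 * np.log((2.0 * np.pi * np.e) ** k * np.linalg.det(S))

assert np.isclose(H_cond, mvn_entropy(Sigma) - mvn_entropy(A))
```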

This type of derivation can be extended to get the conditional entropy formula for any number of elements, conditional on any other number of elements. In the more general case you would use the entropy formula for a multivariate normal distribution and substitute the general form of the conditional variance matrix.
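The general recipe above can be sketched as a single function (assuming NumPy; `conditional_entropy` and its index arguments are names I am introducing for illustration):

```python
import numpy as np

def conditional_entropy(Sigma, idx_a, idx_b):
    """Entropy H(X_a | X_b) for a multivariate Gaussian with covariance Sigma.

    Substitutes the Schur complement (the conditional covariance of the
    elements idx_a given the elements idx_b) into the multivariate Gaussian
    entropy formula 0.5 * log((2*pi*e)^k * det(.)).
    """
    Sigma = np.asarray(Sigma, dtype=float)
    Saa = Sigma[np.ix_(idx_a, idx_a)]
    Sab = Sigma[np.ix_(idx_a, idx_b)]
    Sbb = Sigma[np.ix_(idx_b, idx_b)]
    cond_cov = Saa - Sab @ np.linalg.solve(Sbb, Sab.T)
    k = len(idx_a)
    return 0.5 * np.log((2.0 * np.pi * np.e) ** k * np.linalg.det(cond_cov))

# Example: H(X1 | X2, X3) for an illustrative 3x3 covariance matrix
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 3.0, 0.5],
                  [0.8, 0.5, 2.0]])
print(conditional_entropy(Sigma, [0], [1, 2]))
```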

Ben
  • Tagging this other question which seems relevant: https://math.stackexchange.com/questions/3907068/how-does-1-rho2-in-the-bivariate-case-turn-into-det-boldsymbol-sigma-i/3908332#3908332 – ajl123 Jan 08 '24 at 19:47