My question is similar to this one but considers a more general situation.

Suppose that $ \vec{x} = (x_1, \dots, x_d) $ and let $$ p(\vec{x}) = \sum_{k=1}^{n} \pi_k \mathcal{N}(\vec{x} | \mu_k, \Sigma_k) $$ be a mixture of multivariate Gaussians. What is the conditional distribution $ p(\vec{x_a} | \vec{x_b}), $ where $ \vec{x} = (\vec{x_a}, \vec{x_b}) $? Is this still a mixture of Gaussians?

1 Answer

So, after a little while, I am able to answer my own question :) By the definition of conditional density, we have $$ f(\vec{x_a} | \vec{x_b}) = \frac{f(\vec{x_a}, \vec{x_b})}{f(\vec{x_b})} = \sum_{k=1}^{n} \frac{\pi_k}{f(\vec{x_b})} \mathcal{N}(\vec{x} | \mu_k, \Sigma_k), $$ where $ f(\vec{x_b}) $ is the marginal density of $ \vec{x_b} $. The right-hand side can be evaluated using 1) the marginals and conditionals of a multivariate Gaussian and 2) the marginal of a multivariate Gaussian mixture.

Suppose that $ \mu $ and $ \Sigma $ are partitioned in the same way as $ \vec{x} = (\vec{x_a}, \vec{x_b}) $, that is, $$ \mu = (\mu_a, \mu_b) \text{ and } \Sigma = \begin{pmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{pmatrix}. $$ According to Bishop, Pattern Recognition and Machine Learning, Sections 2.3.1-2.3.2, the conditional distribution of a multivariate Gaussian is $$ \mathcal{N}(\vec{x_a} | \vec{x_b}, \mu, \Sigma) = \mathcal{N}(\vec{x_a} | \mu_{a|b}, \Sigma_{a|b}), $$ where $$ \begin{aligned} \mu_{a|b} & = \mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(\vec{x_b} - \mu_b) \\ \Sigma_{a|b} & = \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1} \Sigma_{ba}, \end{aligned} $$ while the marginal is simply given by $ \mathcal{N}(\vec{x_b} | \mu_b, \Sigma_{bb}) $.

Applying this to each component, with $ \mu_k $ and $ \Sigma_k $ partitioned as above, every component density factorizes into a marginal times a conditional: $$ \mathcal{N}(\vec{x} | \mu_k, \Sigma_k) = \mathcal{N}(\vec{x_b} | \mu_{k, b}, \Sigma_{k, bb}) \, \mathcal{N}(\vec{x_a} | \mu_{k, a|b}, \Sigma_{k, a|b}). $$

Regarding the marginal of the Gaussian mixture, integrating out $ \vec{x_a} $ term by term gives $$ f(\vec{x_b}) = \sum_{k=1}^{n} \pi_k \mathcal{N}(\vec{x_b} | \mu_{k, b}, \Sigma_{k, bb}), $$ so by continuing the calculation, we have $$ \begin{aligned} f(\vec{x_a} | \vec{x_b}) &= \sum_{k=1}^{n} \frac{\pi_k}{f(\vec{x_b})} \mathcal{N}(\vec{x} | \mu_k, \Sigma_k) \\ & = \sum_{k=1}^{n} \frac{\pi_k \mathcal{N}(\vec{x_b} | \mu_{k, b}, \Sigma_{k, bb})}{f(\vec{x_b})} \mathcal{N}(\vec{x_a} | \mu_{k, a|b}, \Sigma_{k, a|b}) \\ & = \sum_{k=1}^{n} \frac{\pi_k \mathcal{N}(\vec{x_b} | \mu_{k, b}, \Sigma_{k, bb})}{\sum_{l=1}^{n} \pi_l \mathcal{N}(\vec{x_b} | \mu_{l, b}, \Sigma_{l, bb})} \mathcal{N}(\vec{x_a} | \mu_{k, a|b}, \Sigma_{k, a|b}). \end{aligned} $$

Thus, the conditional is indeed a multivariate Gaussian mixture with components $ \mathcal{N}(\vec{x_a} | \mu_{k, a|b}, \Sigma_{k, a|b}) $ and mixing coefficients $$ \pi_{k, a|b} = \frac{\pi_k \mathcal{N}(\vec{x_b} | \mu_{k, b}, \Sigma_{k, bb})}{\sum_{l=1}^{n} \pi_l \mathcal{N}(\vec{x_b} | \mu_{l, b}, \Sigma_{l, bb})}. $$ Note that these coefficients are nonnegative, sum to one, and depend on the observed value $ \vec{x_b} $.
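For anyone who wants to check the result numerically, here is a minimal NumPy sketch of the conditioning formulas. The function names `gaussian_pdf` and `condition_gmm`, the index lists `idx_a` and `idx_b`, and the example parameters are my own illustrative choices, not part of any library. The sketch assumes every $ \Sigma_{k, bb} $ is positive definite.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of N(x | mean, cov) at a single point x."""
    x = np.atleast_1d(x).astype(float)
    mean = np.atleast_1d(mean).astype(float)
    cov = np.atleast_2d(cov).astype(float)
    diff = x - mean
    # Mahalanobis term via a linear solve rather than an explicit inverse.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    log_pdf = -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)
    return float(np.exp(log_pdf))


def condition_gmm(weights, means, covs, idx_a, idx_b, x_b):
    """Parameters of p(x_a | x_b) for a Gaussian mixture.

    Returns (new_weights, cond_means, cond_covs), one entry per component.
    """
    x_b = np.atleast_1d(x_b).astype(float)
    new_weights, cond_means, cond_covs = [], [], []
    for pi_k, mu_k, sigma_k in zip(weights, means, covs):
        mu_k = np.asarray(mu_k, dtype=float)
        sigma_k = np.asarray(sigma_k, dtype=float)
        mu_a, mu_b = mu_k[idx_a], mu_k[idx_b]
        s_aa = sigma_k[np.ix_(idx_a, idx_a)]
        s_ab = sigma_k[np.ix_(idx_a, idx_b)]
        s_bb = sigma_k[np.ix_(idx_b, idx_b)]
        # gain = Sigma_ab Sigma_bb^{-1} (Sigma_bb is symmetric).
        gain = np.linalg.solve(s_bb, s_ab.T).T
        cond_means.append(mu_a + gain @ (x_b - mu_b))
        cond_covs.append(s_aa - gain @ s_ab.T)
        # Unnormalised weight: pi_k * N(x_b | mu_{k,b}, Sigma_{k,bb}).
        new_weights.append(pi_k * gaussian_pdf(x_b, mu_b, s_bb))
    new_weights = np.array(new_weights)
    new_weights /= new_weights.sum()  # divide by the marginal f(x_b)
    return new_weights, cond_means, cond_covs


# Made-up two-component mixture in d = 2, conditioning x_1 on x_2 = 1.
weights = [0.3, 0.7]
means = [[0.0, 0.0], [3.0, 3.0]]
covs = [[[1.0, 0.5], [0.5, 1.0]], [[1.0, -0.3], [-0.3, 2.0]]]
w, m, c = condition_gmm(weights, means, covs, idx_a=[0], idx_b=[1], x_b=[1.0])
print(w, m, c)
```

A simple sanity check is to pick some $ \vec{x_a} $, evaluate the joint mixture density divided by the marginal $ f(\vec{x_b}) $, and compare it with the conditional mixture built from the returned parameters; the two should agree up to floating-point error.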