In Bayesian inference for mixture models, the joint posterior of a mixture model with $k$ components has the general form

$$ \begin{equation} p( \boldsymbol{\lambda} , \boldsymbol{\theta} \mid \boldsymbol{y}) \propto \bigg[ \prod_{i=1}^{n} \sum_{j=1}^{k} \lambda_{j} f(y_{i} \mid \boldsymbol{\theta}_{j} ) \bigg] \ p(\boldsymbol{\lambda} ,\boldsymbol{\theta} ) \ . \end{equation} $$

Incorporating information about the missing variable $Z$ and thus using the augmented likelihood, the posterior accounting for the complete data structure reads (this formula is given in the excellent book "Bayesian Core" by Marin and Robert (2007))

$$ \begin{equation} p(\boldsymbol{\theta}, \boldsymbol{\lambda} \mid \boldsymbol{y}, \boldsymbol{z}) \propto \bigg[ \prod_{i=1}^{n} \prod_{j=1}^{k} \lambda_{ j}^{z_{ij}} f (y_{i} \mid \boldsymbol{\theta}_{j})^{z_{ij}} \bigg] \ \ p(\boldsymbol{\lambda} ,\boldsymbol{\theta} ) \ . \end{equation} $$
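As a quick numerical sanity check on how the two likelihoods relate, here is a minimal sketch, assuming Gaussian components and made-up values for $\boldsymbol{y}$, $\boldsymbol{\lambda}$, $\boldsymbol{\mu}$ and $\boldsymbol{\sigma}^{2}$ (the function names are hypothetical, not taken from the book). It checks that summing the augmented likelihood over every possible allocation $\boldsymbol{z}$ recovers the observed-data likelihood of the first formula.

```python
import itertools
import math


def normal_pdf(y, mu, sigma2):
    """Density of N(mu, sigma2) at y (Gaussian components are an assumption)."""
    return math.exp(-((y - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def observed_likelihood(y, lam, mu, sigma2):
    """prod_i sum_j lam_j f(y_i | theta_j): the mixture likelihood."""
    total = 1.0
    for yi in y:
        total *= sum(l * normal_pdf(yi, m, s2) for l, m, s2 in zip(lam, mu, sigma2))
    return total


def complete_likelihood(y, z, lam, mu, sigma2):
    """prod_i prod_j [lam_j f(y_i | theta_j)]^{z_ij}, with z given as labels.

    z[i] is the component label of observation i, so z_ij = 1 if z[i] == j else 0.
    """
    total = 1.0
    for yi, zi in zip(y, z):
        for j in range(len(lam)):
            z_ij = 1 if zi == j else 0
            total *= (lam[j] * normal_pdf(yi, mu[j], sigma2[j])) ** z_ij
    return total


# Illustrative values only (not from the source).
y = [-1.2, 0.3, 2.5, 4.1]
lam = [0.3, 0.7]
mu = [0.0, 3.0]
sigma2 = [1.0, 0.5]

# Sum the augmented likelihood over all k^n allocations z.
marginal = sum(
    complete_likelihood(y, z, lam, mu, sigma2)
    for z in itertools.product(range(len(lam)), repeat=len(y))
)

print(math.isclose(marginal, observed_likelihood(y, lam, mu, sigma2), rel_tol=1e-9))
```

The enumeration over all $k^{n}$ allocations is only feasible for tiny $n$; it is meant as a check of the identity, not as a way to fit the model.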

However, one can assume the following decomposition for the joint density of all variables if we use independent priors for $\boldsymbol{\theta} = (\boldsymbol{\mu}, \boldsymbol{\sigma}^{2})$:

$$ \begin{aligned} p(\boldsymbol{y}, \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}) &= p(\boldsymbol{y} \mid \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}) \ p(\boldsymbol{z} \mid \boldsymbol{\lambda}) \ p(\boldsymbol{\lambda}) \ p(\boldsymbol{\mu}) \ p(\boldsymbol{\sigma}^{2}) \\ &= p(\boldsymbol{y} \mid \boldsymbol{\Psi}) \ p(\boldsymbol{\Psi}) \ , \end{aligned} $$

where $\boldsymbol{\Psi} = (\boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2})$.

I find it hard to see how the expression $p(\boldsymbol{\theta}, \boldsymbol{\lambda} \mid \boldsymbol{y}, \boldsymbol{z})$ relates to $p(\boldsymbol{y}, \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2})$. Any hints?

Xi'an
julian
  • Can you point us to the section and page? I'm having a hard time understanding the variable $z$, the exponentiation, and the conversion of the inner summation into an inner product. – gunes Jan 17 '19 at 15:00
  • It's in chapter 6, p. 151, but the authors write the likelihood with subscripts, that is, $\prod_{i=1}^{n} p_{z_{i}} f(y_{i} \mid \theta_{z_{i}})$. The $z_{i}$ come from a multinomial distribution with parameters $\lambda_{j}$, for $j=1,\ldots,k$, and the notation is inspired by the way we write the likelihood for a Bernoulli distribution. This formula for the likelihood comes from the book "Modèles à variables latentes et modèles de mélange" by Droesbeke, Saporta and Thomas-Agnan. – julian Jan 17 '19 at 15:18

1 Answer

Thank you for the appreciation of our book! By Bayes' formula, the posterior density

$$p(\boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2} \mid \boldsymbol{y}, \boldsymbol{z})$$

is proportional to the joint density,

$$p(\boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2} \mid \boldsymbol{y}, \boldsymbol{z}) \propto p(\boldsymbol{y}, \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}),$$

which itself decomposes into

$$p(\boldsymbol{y}, \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}) = \overbrace{p(\boldsymbol{y} \mid \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2})}^\text{completed model} \ \overbrace{p(\boldsymbol{z} \mid \boldsymbol{\lambda})}^\text{latent model} \ \overbrace{p(\boldsymbol{\lambda}) \ p(\boldsymbol{\mu}) \ p(\boldsymbol{\sigma}^{2})}^\text{prior}$$

and also

$$p(\boldsymbol{y}, \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}) = p(\boldsymbol{y}, \boldsymbol{z} \mid \boldsymbol{\Psi}) \ p(\boldsymbol{\Psi}),$$

where

$$\boldsymbol{\Psi}=(\boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2})=(\boldsymbol{\lambda}, \boldsymbol{\theta}).$$

(The $\boldsymbol{z}$ is missing in the second row of your equation.)
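To connect this with the product formula quoted in the question, here is a sketch of the intermediate step, assuming (as described in the comments) that $z_{ij}$ is the indicator that observation $i$ is allocated to component $j$, and that the allocations are independent multinomial draws with probabilities $\boldsymbol{\lambda}$. Under these assumptions the two model terms can be written as

$$ \begin{aligned} p(\boldsymbol{y} \mid \boldsymbol{z}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}) &= \prod_{i=1}^{n} \prod_{j=1}^{k} f(y_{i} \mid \boldsymbol{\theta}_{j})^{z_{ij}} \ , \\ p(\boldsymbol{z} \mid \boldsymbol{\lambda}) &= \prod_{i=1}^{n} \prod_{j=1}^{k} \lambda_{j}^{z_{ij}} \ , \end{aligned} $$

and, with independent priors, $p(\boldsymbol{\lambda}) \, p(\boldsymbol{\mu}) \, p(\boldsymbol{\sigma}^{2}) = p(\boldsymbol{\lambda}, \boldsymbol{\theta})$. Multiplying the three pieces and reading the result as a function of $(\boldsymbol{\lambda}, \boldsymbol{\theta})$ for fixed $(\boldsymbol{y}, \boldsymbol{z})$ gives

$$ p(\boldsymbol{\theta}, \boldsymbol{\lambda} \mid \boldsymbol{y}, \boldsymbol{z}) \propto \bigg[ \prod_{i=1}^{n} \prod_{j=1}^{k} \lambda_{j}^{z_{ij}} f(y_{i} \mid \boldsymbol{\theta}_{j})^{z_{ij}} \bigg] \ p(\boldsymbol{\lambda}, \boldsymbol{\theta}) \ , $$

which is the completed-data posterior stated in the question.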

Note: This representation of the likelihood is much older than the book "Modèles à variables latentes et modèles de mélange" by Droesbeke, Saporta and Thomas-Agnan. See for instance Dempster, Laird and Rubin (1977).

Xi'an