
Suppose there is a single $n$-dimensional multivariate Gaussian $$Gauss_a(\mu_a,\Sigma_a),$$ where $\mu_a$ is a $1\times n$ vector and $\Sigma_a$ is an $n\times n$ matrix.

Is there an easy way to decompose/split the single Gaussian $Gauss_a$ into $K$ Gaussians, forming a multivariate Gaussian mixture model in which no component is identical to the original $Gauss_a$? ($K$ is given.)

$$Gauss_a(\mu_a,\Sigma_a) \approx \sum_{i=1}^K w_i \cdot Gauss_i(\mu_i, \Sigma_i),$$ where $Gauss_i(\mu_i, \Sigma_i) \neq Gauss_a$ and $w_i \neq 1$.
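For intuition, a common heuristic (not an exact decomposition) is a moment-matched split: place the component means symmetrically and shrink the component variances so that the mixture's first two moments agree with $Gauss_a$. A minimal 1-D sketch, assuming $K=2$, equal weights, and an arbitrary offset $d$:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical moment-matched "split" of N(0, 1) into K = 2 components:
# put the two means at +/- d and shrink each variance to 1 - d**2, so the
# mixture mean is 0 and its variance is (1 - d**2) + d**2 = 1, exactly
# matching the target Gaussian's first two moments.
d = 0.5
w = np.array([0.5, 0.5])
mus = np.array([-d, d])
var = 1.0 - d**2

x = np.linspace(-4, 4, 1001)
target = norm.pdf(x, 0.0, 1.0)
mixture = sum(wi * norm.pdf(x, mi, np.sqrt(var)) for wi, mi in zip(w, mus))

# The first two moments agree, but the densities do not coincide exactly.
print(np.max(np.abs(mixture - target)))
```

In $n$ dimensions the same idea can be applied along a principal axis of $\Sigma_a$; the match is only approximate, never exact.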

Thank you.

JimSD
  • Of course you can approximate it: just set $w_1\approx 1, \mu_1 \approx \mu_a, \Sigma_1\approx \Sigma_a,$ and all other $w_i\approx 0$ for $i \ne 1.$ Might I therefore suggest that you remove the "approximately express" part of this question? – whuber Aug 31 '18 at 16:12
  • Ah, what I mean above excludes the identical-Gaussian case you just mentioned. – JimSD Aug 31 '18 at 22:24
  • Question https://math.stackexchange.com/questions/3183207/gaussian-mixtures-split-methods-using-singular-value-decomposition shows the split methods, but without proof – zzqstar Apr 11 '19 at 02:23

1 Answer


It's not too hard to show that this isn't possible in general. For a counterexample, consider the 1-dimensional case with $K=2$ and $Gauss_a(0,1)$ the standard normal, and suppose we had the decomposition $$Gauss_a(0,1) = w_1 Gauss_1(\mu_1,\sigma_1) + w_2 Gauss_2(\mu_2,\sigma_2)$$ for some parameters $w_i$, $\mu_i$, and $\sigma_i$. We can calculate the moment generating function of each side and equate them: $$ \exp\left(\tfrac12t^2\right) = w_1\exp\left(\mu_1 t + \tfrac12\sigma_1^2t^2\right)+ w_2\exp\left(\mu_2 t + \tfrac12\sigma_2^2t^2\right) $$ Note that the moment generating function (or any expectation) of a mixture distribution is easy to calculate -- it's just a weighted sum of the expectations of the mixed distributions.
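As a sanity check on that last point, the weighted-sum formula for a mixture's MGF can be verified by Monte Carlo. A minimal sketch with an arbitrary example mixture (the parameter values are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Example mixture (illustrative): 0.3 * N(-1, 1) + 0.7 * N(2, 4).
w = np.array([0.3, 0.7])
mus = np.array([-1.0, 2.0])
sigmas = np.array([1.0, 2.0])
t = 0.4

# Closed form: MGF of a mixture = sum_i w_i * exp(mu_i t + sigma_i^2 t^2 / 2).
closed = np.sum(w * np.exp(mus * t + 0.5 * sigmas**2 * t**2))

# Monte Carlo: pick a component index by weight, then sample that Gaussian,
# and average exp(t * X) over the draws.
idx = rng.choice(2, size=500_000, p=w)
samples = rng.normal(mus[idx], sigmas[idx])
mc = np.mean(np.exp(t * samples))

print(closed, mc)  # the two estimates should agree closely
```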

The nice thing about this equation is that the only way it can hold for all $t$ is if all the coefficients in the power series expansion in $t$ match up. I used SymPy (a symbolic mathematics library for Python) to algebraically solve the system of equations obtained by matching the first four derivatives at $t=0$ (equivalently, the coefficients of $t$ through $t^4$):

from sympy import symbols, solve, exp, Eq, diff

t, w1, mu1, v1, mu2, v2 = symbols('t, w1, mu1, v1, mu2, v2', real=True)

# MGF of the standard normal on the left; MGF of the two-component mixture
# on the right, with w2 = 1 - w1 and v_i denoting the variances sigma_i^2.
lhs = exp(t**2/2)
rhs = w1 * exp(mu1*t + v1*t**2/2) + (1 - w1) * exp(mu2*t + v2*t**2/2)

# Equate the derivatives of order 1 through 4 at t = 0 (i.e. the
# power-series coefficients) and solve the resulting system symbolically.
solve([Eq(diff(lhs, t, k).subs(t, 0), diff(rhs, t, k).subs(t, 0))
       for k in range(1, 5)])

and it determined that the only exact solution was the trivial one where the components of the mixture are identically distributed:

Out[4]: [{mu1: 0, mu2: 0, v1: 1, v2: 1}]

So, there is no non-trivial solution in the univariate case with $K=2$. My guess is that this is true in the general multivariate case for any $K>1$ and there are no non-trivial solutions, period, though I'm not sure how to go about proving it.

  • An easier way to demonstrate the impossibility is to examine the asymptotic behavior of the log pdf at infinity: that establishes all the components must have the same values of $\Sigma.$ After that, it's straightforward to show they must all have the same value of $\mu,$ too. – whuber Sep 01 '18 at 15:03
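The tail argument in the comment above can be illustrated numerically: if a mixture contains a component with a larger variance than the target Gaussian, the gap between the two log pdfs grows without bound, so exact equality of the densities forces equal covariances. A minimal sketch with a hypothetical inflated-variance component:

```python
import numpy as np
from scipy.stats import norm

# Target: standard normal.  Hypothetical mixture: mostly N(0, 1) plus a
# small component with inflated standard deviation 1.5.
x = np.array([5.0, 10.0, 20.0, 40.0])

target_logpdf = norm.logpdf(x, 0.0, 1.0)
mix_pdf = 0.99 * norm.pdf(x, 0.0, 1.0) + 0.01 * norm.pdf(x, 0.0, 1.5)

# Gap between the mixture's log pdf and the target's log pdf: the
# heavier-tailed component eventually dominates, so the gap diverges.
gap = np.log(mix_pdf) - target_logpdf
print(gap)  # strictly increasing as x grows
```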