4

Someone told me that the Gaussian distribution is conjugate to the Gaussian distribution because a Gaussian times a Gaussian is still a Gaussian distribution.

Why is that? Consider the following situation: $X\sim N(\mu_x,\sigma^2_{x})$, $Y\sim N(\mu_y,\sigma^2_{y})$

Would a new variable, $Z=XY$ be normally distributed?

whuber
GeekCat
  • 10
    Beware! The expression $Z=XY$ means to take the product of the random variables $X$ and $Y$, not their densities (and the distribution of such a $Z$ would not be Gaussian, but quite peaked); $f_X(z)f_Y(z)$ is the product of the densities, and that is proportional to another Gaussian. It's the second thing you would be looking at when talking about conjugacy. – Glen_b Nov 19 '14 at 04:53
  • Would you mind elaborating on that? What do you mean by $f_X(z)$? – GeekCat Nov 20 '14 at 03:27
  • $f_X(z)$ is the density of the random variable $X$ evaluated at $z$. The product of the value taken by the two densities at $z$ is the thing that is proportional to another Gaussian density (i.e. you need the product of the densities, not the density of the product). – Glen_b Nov 20 '14 at 05:44
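The distinction drawn in these comments can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the parameter values, the sample size, and the helper names `excess_kurtosis` and `normal_pdf` are illustrative choices, not part of the thread. It contrasts the product of the random variables, $Z=XY$, with the pointwise product of the densities, $f_X(z)f_Y(z)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumed for this sketch, not taken from the question).
mu_x, sigma_x = 0.0, 1.0
mu_y, sigma_y = 0.0, 1.0

# 1) Product of the random variables: Z = X * Y.
x = rng.normal(mu_x, sigma_x, size=200_000)
y = rng.normal(mu_y, sigma_y, size=200_000)
z = x * y


def excess_kurtosis(sample):
    """Sample excess kurtosis; close to 0 for Gaussian data."""
    centered = sample - sample.mean()
    return (centered**4).mean() / (centered**2).mean() ** 2 - 3.0


kurt_x = excess_kurtosis(x)  # near 0, because X is Gaussian
kurt_z = excess_kurtosis(z)  # far from 0, because Z = XY is not Gaussian


# 2) Product of the densities, evaluated pointwise on a grid.
def normal_pdf(t, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at t."""
    return np.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


grid = np.linspace(-3.0, 3.0, 201)
log_prod = np.log(normal_pdf(grid, mu_x, sigma_x) * normal_pdf(grid, mu_y, sigma_y))

# A degree-2 polynomial fits the log of the product up to rounding error,
# which is what "proportional to a Gaussian density" means.
coeffs = np.polyfit(grid, log_prod, 2)
max_residual = np.max(np.abs(np.polyval(coeffs, grid) - log_prod))

print(f"excess kurtosis of X: {kurt_x:.3f}")
print(f"excess kurtosis of Z: {kurt_z:.3f}")
print(f"max residual of quadratic fit: {max_residual:.2e}")
```

The large excess kurtosis of `z` reflects the "quite peaked" shape mentioned above, while the negligible residual shows that the log of the density product is quadratic.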

2 Answers

14

If we take your question to mean whether the product of the densities is Gaussian, then the answer is "yes", up to a normalizing constant (P.A. Bromiley. Tina Memo No. 2003-003. "Products and Convolutions of Gaussian Probability Density Functions.").

Here's a simple derivation.

Take $f(x)$ and $g(x)$ to be two normal densities with means $\mu_f$ and $\mu_g$ and precisions, scaled by one half for convenience, $$\tau_f=\frac{1}{2\sigma_f^2}$$ and $$\tau_g=\frac{1}{2\sigma_g^2}.$$

The logarithm of the product $f(x)g(x)$ is the sum of the logarithms. Writing $C$ for the product of the two normalizing constants, $$ \begin{align} \log f(x) + \log g(x) &= \log C - \tau_f(x -\mu_f)^2 - \tau_g(x-\mu_g)^2 \\ &= \log C - (\tau_f+\tau_g)x^2 + 2(\tau_f\mu_f+ \tau_g\mu_g)x - \left(\tau_f\mu_f^2 + \tau_g\mu_g^2\right) \end{align} $$

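The sum of quadratics is quadratic, so we know that the product is proportional to a normal density without doing any more work. Specifically, inspection of the coefficients of $x^2$ and $x$ gives $\tau_{fg}=\tau_f+\tau_g$ and the mean $\mu_{fg}=\frac{\tau_f\mu_f+\tau_g\mu_g}{\tau_f+\tau_g}$.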

Here's a longer, more tedious derivation.

Take $f(x)$ and $g(x)$ to be two normal densities with means $\mu_f$ and $\mu_g$ and variances $\sigma_f^2$ and $\sigma_g^2$.

The product is $$f(x)g(x)=\frac{1}{2\pi\sigma_f\sigma_g}\exp\left(-\frac{(x-\mu_f)^2}{2\sigma_f^2}-\frac{(x-\mu_g)^2}{2\sigma_g^2}\right).$$

Denote $\beta=\frac{(x-\mu_f)^2}{2\sigma_f^2}+\frac{(x-\mu_g)^2}{2\sigma_g^2}.$

Expand: $$\beta=\frac{(\sigma^2_f+\sigma^2_g)x^2-2(\mu_f\sigma^2_g+\mu_g\sigma^2_f)x+ \mu^2_f\sigma^2_g+\mu^2_g\sigma^2_f} {2\sigma^2_f\sigma^2_g}$$

Divide through by the coefficient of the leading power, $x^2$: $$\beta=\frac{x^2-2\frac{\mu_f\sigma^2_g+\mu_g\sigma^2_f}{\sigma^2_f+\sigma^2_g}x+\frac{\mu_f^2\sigma^2_g+\mu^2_g\sigma^2_f}{\sigma^2_f+\sigma^2_g}}{2\frac{\sigma^2_f\sigma^2_g}{\sigma^2_f+\sigma^2_g}}$$

This is quadratic in $x$ with a positive leading coefficient, so $\exp(-\beta)$ is proportional to a Gaussian density. But if we continue with the algebra, we can make this even more explicit.

Completing the square is a procedure that expresses a quadratic in $x$ in the form $(x+b)^2$ plus a constant. We can apply this here. If $\epsilon$ is the term required to complete the square in $\beta$, $$\epsilon=\frac{\left(\frac{\mu_f\sigma_g^2+\mu_g\sigma^2_f}{\sigma_f^2+\sigma_g^2}\right)^2- \left(\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2}{2\frac{\sigma^2_f\sigma^2_g}{\sigma^2_f+\sigma^2_g}}=0.$$

We add this to $\beta$. Its value is zero, so it does not change the value of $\beta$ for the same reason that $5+0=5$. However, it does allow us to re-express $\beta:$

$$\begin{align} \beta&=\frac{x^2- 2\frac{\mu_f\sigma^2_g+\mu_g\sigma^2_f}{\sigma^2_f+\sigma^2_g}x+ \left(\frac{\mu_f\sigma^2_g+\mu_g\sigma^2_f} {\sigma^2_f+\sigma^2_g}\right)^2} {2\frac{\sigma^2_f\sigma^2_g}{\sigma^2_f+\sigma^2_g}}+ \frac{\frac{\mu_f^2\sigma_g^2+\mu_g^2\sigma^2_f}{\sigma_f^2+\sigma_g^2}- \left(\frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}\right)^2}{2\frac{\sigma^2_f\sigma^2_g}{\sigma^2_f+\sigma^2_g}}\\ &=\frac{\left(x- \frac{\mu_f\sigma_g^2+\mu_g\sigma_f^2} {\sigma_f^2+\sigma_g^2}\right)^2} {2\frac{\sigma^2_f\sigma_g^2} {\sigma_f^2+\sigma_g^2}}+ \frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\\ &=\frac{(x-\mu_{fg})^2}{2\sigma^2_{fg}}+\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)} \end{align}$$

Where $$\mu_{fg}=\frac{\mu_f\sigma^2_g+\mu_g\sigma_f^2}{\sigma_f^2+\sigma_g^2}$$ and $$\sigma_{fg}^2=\frac{\sigma_f^2\sigma_g^2}{\sigma_f^2+\sigma_g^2}.$$

So $$f(x)g(x)=\frac{1}{2\pi\sigma_f\sigma_g}\exp\left(-\frac{(x-\mu_{fg})^2}{2\sigma^2_{fg}}\right)\exp\left(-\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\right)$$ This can be written as a scaled Gaussian PDF: $$f(x)g(x)=\frac{S_{fg}}{\sigma_{fg}\sqrt{2\pi}}\exp\left(-\frac{(x-\mu_{fg})^2}{2\sigma_{fg}^2}\right)$$ where $$ S_{fg}=\frac{1}{\sqrt{2\pi(\sigma_f^2+\sigma_g^2)}}\exp\left(-\frac{(\mu_f-\mu_g)^2}{2(\sigma_f^2+\sigma_g^2)}\right) $$

Note that the scaling constant is also a Gaussian function of the two means and two variances.

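The product of two Gaussian densities is proportional to a Gaussian density, and the Gaussian is a member of the exponential family. Therefore, a Gaussian prior on the mean of a Gaussian likelihood with known variance gives a Gaussian posterior, which is what it means for the Gaussian to be conjugate to itself.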
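As a numerical sanity check on the closed forms for $\mu_{fg}$, $\sigma_{fg}^2$ and $S_{fg}$, here is a small sketch assuming NumPy. The function names `normal_pdf` and `product_params` and the example parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


def product_params(mu_f, sigma_f, mu_g, sigma_g):
    """Return (mu_fg, sigma_fg, S_fg) for the product f(x) * g(x)."""
    var_sum = sigma_f**2 + sigma_g**2
    mu_fg = (mu_f * sigma_g**2 + mu_g * sigma_f**2) / var_sum
    sigma_fg = np.sqrt(sigma_f**2 * sigma_g**2 / var_sum)
    # S_fg is itself a Gaussian density in mu_f, with mean mu_g and
    # variance sigma_f^2 + sigma_g^2.
    s_fg = normal_pdf(mu_f, mu_g, np.sqrt(var_sum))
    return mu_fg, sigma_fg, s_fg


# Example values (assumed for illustration).
mu_f, sigma_f = 0.0, 1.0
mu_g, sigma_g = 2.0, 1.0
mu_fg, sigma_fg, s_fg = product_params(mu_f, sigma_f, mu_g, sigma_g)

x = np.linspace(-5.0, 5.0, 101)
lhs = normal_pdf(x, mu_f, sigma_f) * normal_pdf(x, mu_g, sigma_g)
rhs = s_fg * normal_pdf(x, mu_fg, sigma_fg)
agree = np.allclose(lhs, rhs)

print(mu_fg, sigma_fg**2, agree)
```

With equal variances the combined mean sits halfway between the two means and the combined variance is half the common variance.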

Sycorax
  • 1
    The algebra hugely simplifies when you parameterize the Normal distributions in terms of the mean and the precision and work with the logarithm of the unnormalized densities. It is immediate that the log product is quadratic (it's a sum of quadratics) and one can find the associated precision and mean by inspection. That would reduce this post to two lines of algebra. – whuber Oct 09 '21 at 15:12
7

Because a comment of mine about obtaining a simple answer seems to have generated interest, here are the details.

Restatement of the question

The question asks whether the product of two Normal density functions determines a Normally distributed variable. In the notation of the question, these functions have the form

$$f(x; \mu, \sigma) = C(\mu,\sigma)\exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)= C(\mu,\sigma)\exp\left(-\tau(\sigma)^2\left(x-\mu\right)^2\right)$$

where $C(\mu,\sigma)$ is the normalizing constant (a number determined by the need to make $f(x;\mu,\sigma)\,\mathrm{d}x$ integrate to unity) and $$\tau(\sigma)^2 = \frac{1}{2\sigma^2}.$$

$2\tau(\sigma)^2$ (the reciprocal of the variance) is known as the precision.

Use of the logarithm to simplify the analysis

Because $f$ is always positive, we may work with its logarithm, which is a quadratic function of $x:$

$$\log f(x;\mu,\sigma) = A(\mu,\sigma) - \tau(\sigma)^2(x-\mu)^2\tag{*}$$

(where, evidently, $A(\mu,\sigma) = \log(C(\mu,\sigma))$).

Notice that this expression describes all nondegenerate quadratic functions of $x$ with negative leading coefficient. That is, given any quadratic $Q(x) = -ax^2 + 2bx + c$ with $a \gt 0,$ we may find $\mu,$ $\sigma,$ and a constant (to play the role of $A(\mu,\sigma)$) for which $Q$ is expressed in the form $(*).$ Finding $\mu$ and $\sigma$ given $a,b,c$ is called completing the square. However, the details will not matter here, so I leave it to the interested reader to work out the formulas (which is a straightforward exercise in elementary algebra).
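For readers who would rather see that exercise spelled out, here is one possible sketch in Python. The function name `complete_the_square` is hypothetical; the formulas follow from matching coefficients of $-ax^2+2bx+c$ with $A - \tau^2(x-\mu)^2$ where $\tau^2 = 1/(2\sigma^2)$. Note that the returned constant only plays the role of $A(\mu,\sigma)$; it equals the log normalizing constant only when $Q$ is the logarithm of a properly normalized density.

```python
import math


def complete_the_square(a, b, c):
    """Rewrite Q(x) = -a*x**2 + 2*b*x + c as A - tau2*(x - mu)**2.

    Requires a > 0. Returns (mu, sigma, A), with tau2 = a = 1 / (2 * sigma**2).
    """
    if a <= 0:
        raise ValueError("the leading coefficient -a must be negative, so a > 0")
    mu = b / a                    # from matching the coefficient of x: 2*a*mu = 2*b
    sigma = math.sqrt(1.0 / (2.0 * a))  # from tau2 = a
    A = c + b**2 / a              # from matching the constant: A - a*mu**2 = c
    return mu, sigma, A


# Example coefficients (assumed for illustration).
a, b, c = 2.0, 3.0, 1.0
mu, sigma, A = complete_the_square(a, b, c)

# Check that both forms agree at a few points.
for x in (-1.0, 0.0, 0.5, 2.0):
    q = -a * x**2 + 2 * b * x + c
    form = A - (x - mu) ** 2 / (2 * sigma**2)
    assert math.isclose(q, form, abs_tol=1e-12)

print(mu, sigma, A)
```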

Conversely (by definition of Normal distributions), any distribution with a log density function that can be written in this form (and is defined for all real numbers) is a Normal distribution. Let's memorialize this characterization by highlighting it:

Any density function $f$ that is (a) defined for all real numbers and (b) whose logarithm is a quadratic function of its argument describes a Normal distribution.

Solution

Recall that the logarithm of a product is the sum of the logarithms. Thus, the question comes down to this:

Is the sum of two quadratic functions quadratic?

Trivially, yes, because by the rules of polynomial addition,

$$(-a_1 x^2 + 2b_1 x + c_1) + (-a_2 x^2 + 2b_2 x + c_2) = -(a_1+a_2)x^2 + 2(b_1+b_2)x + (c_1+c_2),$$

QED.

We can go further, though: it is of interest to identify which Normal distribution occurs. For this, the notation of the question will be convenient. The preceding calculation is now written

$$\begin{aligned} \left(A(\mu_x,\sigma_x)-\tau(\sigma_x)^2 (x-\mu_x)^2\right) + \left(A(\mu_y,\sigma_y)-\tau(\sigma_y)^2 (x-\mu_y)^2\right) \\ = A(\mu,\sigma)-\tau(\sigma)^2 (x-\mu)^2 \end{aligned}$$

where $\sigma^2$ is the variance of the result, $\mu$ is its mean, and $A(\mu,\sigma)$ is the logarithm of its normalizing constant.

My point is that we can solve this problem by inspection. This is a math-speak term for saying you don't have to write anything down because you can pick out appropriate polynomial coefficients just by looking. To wit,

  1. The coefficient of $x^2$ must be the sum of its coefficients on the left hand side, giving $$\tau(\sigma)^2 = \tau(\sigma_x)^2 + \tau(\sigma_y)^2.\tag{1}$$

  2. The coefficient of $x$ must be the sum of its coefficients on the left hand side. This requires slightly greater perception: namely, recognizing that the coefficient of $x$ in the square $(x-\mu)^2$ is $-2\mu.$ Thus, $$2\tau(\sigma)^2 \mu = 2\tau(\sigma_x)^2\mu_x +2\tau(\sigma_y)^2\mu_y.$$

Here, then, is the second place where we actually have to do some algebra: solve this equation for $\mu.$ Again, the solution is by inspection (because the equation is so simple), and we can simplify it using $(1)$ above:

$$\mu = \frac{2\tau(\sigma_x)^2\mu_x + 2\tau(\sigma_y)^2\mu_y}{2\tau(\sigma)^2} = \frac{2\tau(\sigma_x)^2\mu_x + 2\tau(\sigma_y)^2\mu_y}{2\tau(\sigma_x)^2 + 2\tau(\sigma_y)^2}.\tag{2}$$

The factors of $2\tau(\ )^2$ are the precisions of the distributions (q.v.), enabling us to characterize the results $(1)$ and $(2)$ in a simple, memorable fashion:

When multiplying two Normal densities, precisions add (just double both sides of equation $(1)$) and the mean is the precision-weighted average of the means (equation $(2)$).


The two highlighted equations--the first simplifying the sum of quadratics and the second solving a simple linear equation in one unknown--constitute the "two lines of algebra" I mentioned in my comment.
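The rule "precisions add, means are precision-weighted" translates directly into a few lines of code. This is a minimal sketch in plain Python; the function name `multiply_normals` and the example numbers are assumptions made for illustration.

```python
import math


def multiply_normals(mu_x, sigma_x, mu_y, sigma_y):
    """Mean and standard deviation of the Normal density proportional to
    the product of the N(mu_x, sigma_x^2) and N(mu_y, sigma_y^2) densities."""
    prec_x = 1.0 / sigma_x**2          # precision = reciprocal of the variance
    prec_y = 1.0 / sigma_y**2
    prec = prec_x + prec_y             # equation (1), doubled: precisions add
    mu = (prec_x * mu_x + prec_y * mu_y) / prec  # equation (2): weighted mean
    sigma = 1.0 / math.sqrt(prec)
    return mu, sigma


# Example values (assumed for illustration).
mu, sigma = multiply_normals(0.0, 1.0, 3.0, 2.0)
print(mu, sigma**2)
```

In this example the precisions are $1$ and $1/4$, so the result has precision $5/4$ (variance $0.8$) and mean $(1\cdot 0 + \tfrac14\cdot 3)/\tfrac54 = 0.6$; the more precise factor pulls the mean toward itself.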

whuber