Suppose $X \mid Y$ and $Y$ are normally distributed. Does it follow that $Y \mid X$ is normally distributed?
And if so, how would one prove this?
Here, as a counterexample, is a sample of points where $Y$ has a standard Normal distribution and the conditional distribution of $X$ given $Y$ is always Normal, too.
The resulting scatterplot is obviously non-Normal.
To guarantee Normality, you need (almost surely) that
(1) $E[X\mid Y]$ is a linear function of $Y$ and
(2) $\operatorname{Var}(X\mid Y)$ is constant.
These are both characteristics of any Bivariate Normal distribution, so they are necessary conditions. When you write down the joint distribution implied by both these conditions, it will be Gaussian: that is, Bivariate Normal.
These R commands generated the example. The conditional variance $Y^4$ is not constant.
n <- 1e3
y <- rnorm(n)                 # Y ~ N(0, 1)
x <- rnorm(n, y, y^2)         # X | Y = y ~ N(y, y^4), i.e. sd = y^2
plot(x, y, col = "#00000040") # semi-transparent points
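As a cross-check, here is a short Python sketch of the same construction (assuming NumPy is available). The marginal of $X$ is heavy-tailed: a direct calculation gives $E[X^4]/\operatorname{Var}(X)^2 - 3 = 408/16 - 3 = 22.5$, whereas any Normal variable has excess kurtosis $0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
y = rng.normal(size=n)
x = rng.normal(loc=y, scale=y**2)  # X | Y = y ~ N(y, y^4)

# Sample excess kurtosis of X; it is 0 for any normal distribution.
z = (x - x.mean()) / x.std()
excess_kurtosis = (z**4).mean() - 3
print(excess_kurtosis)  # large and positive (theoretical value 22.5)
```

The sample estimate is noisy because the eighth moment of $X$ is large, but it is unmistakably far from zero, so the marginal of $X$ cannot be Normal.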
whuber shows, by means of a counterexample, that the product of a Gaussian density for $X\mid Y$ and a Gaussian density for $Y$ does not necessarily yield a jointly Gaussian distribution, and such a joint distribution certainly need not have a Gaussian conditional.
Here is a (numerical) counter-example for your claim. Let $Y|X=x \sim N(x^3, 1)$ and $X \sim N(0,1)$. Then $(X,Y)$ is not jointly normal. The marginal of $Y$ is
$$ f_Y(y) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(y - x^3)^2 - \frac{1}{2}x^2}dx. $$
Unfortunately this integral (as far as I know) cannot be computed analytically, but we can provide a fairly good approximation via adaptive quadratures. Thus the conditional we require is
$$ f_{X|Y}(x \mid y) = \frac{e^{-\frac{1}{2}(y - x^3)^2 - \frac{1}{2}x^2}}{\int_{-\infty}^{\infty} e^{-\frac{1}{2}(y - x^3)^2 - \frac{1}{2}x^2}dx}. $$
The claim is disproved if we can show that this conditional is not Gaussian. Toward this aim, let's fix $y = -1$. An elegant proof might study the properties of this function analytically, but here I take a brute-force approach and approximate this beast numerically. If the conditional were Gaussian, it would be unimodal and symmetric; any clear asymmetry is the contradiction I'll be looking for.
# proportional to the joint density
f_yx <- function(y, x) {
  exp(-0.5 * (y - x^3)^2 - 0.5 * x^2)
}

# the marginal of Y
f_y <- function(y) {
  integrate(function(t) f_yx(y, t), lower = -Inf, upper = Inf)$value
}

# the conditional of X given Y
x_given_y <- function(x, y) f_yx(y, x) / f_y(y)

# fix y and evaluate the conditional on a grid
y <- -1
x <- seq(-3, 3, len = 100)
cond_y <- sapply(x, x_given_y, y = y)
plot(x, cond_y, type = "l", lwd = 2)
If we can trust numerical integration (I'm using R's integrate function here, which is extremely robust!), the plot shows that this conditional density is definitely non-Gaussian: in particular, it is not symmetric about its mean. This contradicts the claim.
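The asymmetry can also be quantified rather than eyeballed. The following Python sketch (assuming SciPy) computes the skewness of the same conditional density by adaptive quadrature; a Gaussian density has skewness exactly zero.

```python
import numpy as np
from scipy.integrate import quad

# Kernel proportional to the joint density, as in the R code above.
def f_yx(y, x):
    return np.exp(-0.5 * (y - x**3) ** 2 - 0.5 * x**2)

y0 = -1.0
norm = quad(lambda t: f_yx(y0, t), -np.inf, np.inf)[0]
cond = lambda t: f_yx(y0, t) / norm  # conditional density of X | Y = -1

mean = quad(lambda t: t * cond(t), -np.inf, np.inf)[0]
var = quad(lambda t: (t - mean) ** 2 * cond(t), -np.inf, np.inf)[0]
mu3 = quad(lambda t: (t - mean) ** 3 * cond(t), -np.inf, np.inf)[0]
skewness = mu3 / var**1.5

# Nonzero skewness rules out any Gaussian density.
print(skewness)
```

The computed skewness is clearly away from zero, confirming the visual impression from the plot.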
Side comment. There exist non-Gaussian bivariate distributions which have Gaussian conditional densities. One example of this is
$$ f(x,y) = C\exp\left(-(1+x^2)(1+y^2)\right),\quad -\infty <x,y<\infty,$$
where $C$ is the normalising constant. You can check that both $Y|X=x$ and $X|Y=y$ are Gaussian with suitable parameters.
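To see why the conditionals are Gaussian, fix $x$ and read off the kernel in $y$:

$$ f(y \mid x) \propto \exp\left(-(1+x^2)(1+y^2)\right) \overset{y}{\propto} \exp\left(-(1+x^2)\,y^2\right), $$

which is the kernel of $\text{N}\!\left(0, \tfrac{1}{2(1+x^2)}\right)$; by symmetry, $X \mid Y=y \sim \text{N}\!\left(0, \tfrac{1}{2(1+y^2)}\right)$. Yet the joint density is not bivariate Normal, since its log-density contains the quartic cross term $-x^2 y^2$ rather than being quadratic in $(x,y)$.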
Here I will augment the excellent answer by whuber by showing the mathematical form of your general model and the sufficient conditions that imply a normal distribution for $Y|X$. Consider the general hierarchical model form:
$$\begin{align} X|Y=y &\sim \text{N}(\mu(y),\sigma^2(y)), \\[6pt] Y &\sim \text{N}(\mu_*,\sigma^2_*). \\[6pt] \end{align}$$
This model gives the joint density kernel:
$$\begin{align} f_{X,Y}(x,y) &= f_{X|Y}(x|y) f_{Y}(y) \\[12pt] &\propto \frac{1}{\sigma(y)} \cdot \exp \Bigg( -\frac{1}{2} \Big( \frac{x-\mu(y)}{\sigma(y)} \Big)^2 \Bigg) \exp \Bigg( -\frac{1}{2} \Big( \frac{y-\mu_*}{\sigma_*} \Big)^2 \Bigg) \\[6pt] &= \frac{1}{\sigma(y)} \cdot \exp \Bigg( -\frac{1}{2} \Bigg[ \Big( \frac{x-\mu(y)}{\sigma(y)} \Big)^2 + \Big( \frac{y-\mu_*}{\sigma_*} \Big)^2 \Bigg] \Bigg) \\[6pt] &= \frac{1}{\sigma(y)} \cdot \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{(x-\mu(y))^2 \sigma_*^2 + (y-\mu_*)^2 \sigma(y)^2}{\sigma(y)^2 \sigma_*^2} \Bigg] \Bigg) \\[6pt] &\overset{y}{\propto} \frac{1}{\sigma(y)} \cdot \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{(\mu(y)^2 - 2x \mu(y)) \sigma_*^2 + (y^2-2y\mu_* + \mu_*^2) \sigma(y)^2}{\sigma(y)^2 \sigma_*^2} \Bigg] \Bigg), \\[6pt] \end{align}$$
which gives the conditional density kernel:
$$\begin{align} f_{Y|X}(y|x) &\overset{y}{\propto} \frac{1}{\sigma(y)} \cdot \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{(\mu(y)^2 - 2x \mu(y)) \sigma_*^2 + (y^2-2y\mu_* + \mu_*^2) \sigma(y)^2}{\sigma(y)^2 \sigma_*^2} \Bigg] \Bigg). \\[6pt] \end{align}$$
In general, this is not the form of a normal density. However, suppose we impose the following conditions on the conditional mean and variance of $X|Y$:
$$\mu(y) = a + by \quad \quad \quad \quad \quad \sigma^2(y) = \sigma^2.$$
These conditions mean that we require $\mu(y) \equiv \mathbb{E}(X|Y=y)$ to be an affine function of $y$ and we require $\sigma^2(y) \equiv \mathbb{V}(X|Y=y)$ to be a fixed value. Incorporating these conditions gives:
$$\begin{align} f_{Y|X}(y|x) &\overset{y}{\propto} \frac{1}{\sigma(y)} \cdot \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{(\mu(y)^2 - 2x \mu(y)) \sigma_*^2 + (y^2-2y\mu_* + \mu_*^2) \sigma(y)^2}{\sigma(y)^2 \sigma_*^2} \Bigg] \Bigg) \\[6pt] &= \frac{1}{\sigma} \cdot \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{((a + by)^2 - 2x (a + by)) \sigma_*^2 + (y^2-2y\mu_* + \mu_*^2) \sigma^2}{\sigma^2 \sigma_*^2} \Bigg] \Bigg) \\[6pt] &= \frac{1}{\sigma} \cdot \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{(b^2 y^2 + 2ab y + a^2 - 2xa - 2xb y) \sigma_*^2 + (y^2-2y\mu_* + \mu_*^2) \sigma^2}{\sigma^2 \sigma_*^2} \Bigg] \Bigg) \\[6pt] &\overset{y}{\propto} \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{(\sigma^2 + b^2 \sigma_*^2 ) y^2 + 2(b(a - x) \sigma_*^2 - \mu_* \sigma^2) y}{\sigma^2 \sigma_*^2} \Bigg] \Bigg) \\[6pt] &\overset{y}{\propto} \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{y^2 + 2[(b(a - x) \sigma_*^2 - \mu_* \sigma^2)/(\sigma^2 + b^2 \sigma_*^2) ] y}{\sigma^2 \sigma_*^2/(\sigma^2 + b^2 \sigma_*^2 ) } \Bigg] \Bigg) \\[6pt] &\overset{y}{\propto} \exp \Bigg( -\frac{1}{2} \Bigg[ \frac{1}{\sigma^2 \sigma_*^2/(\sigma^2 + b^2 \sigma_*^2 )} \cdot \Big( y - \frac{b(x - a) \sigma_*^2 + \mu_* \sigma^2}{\sigma^2 + b^2 \sigma_*^2} \Big)^2 \Bigg] \Bigg) \\[6pt] &\overset{y}{\propto} \text{N} \Bigg( y \Bigg| \frac{b(x - a) \sigma_*^2 + \mu_* \sigma^2}{\sigma^2 + b^2 \sigma_*^2}, \frac{\sigma^2 \sigma_*^2}{\sigma^2 + b^2 \sigma_*^2} \Bigg). \\[6pt] \end{align}$$
Here we see that we have a normal distribution for $Y|X$ which confirms that the above conditions on the conditional mean and variance of $X|Y$ are sufficient to give this property.
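As a sanity check, the conditional moments $\mathbb{E}(Y|X=x) = \frac{\mu_* \sigma^2 + b(x - a)\sigma_*^2}{\sigma^2 + b^2 \sigma_*^2}$ and $\mathbb{V}(Y|X) = \frac{\sigma^2 \sigma_*^2}{\sigma^2 + b^2 \sigma_*^2}$ can be verified by simulation. The Python sketch below (with arbitrarily chosen parameter values) exploits the fact that, under these conditions, $(X,Y)$ is jointly normal, so $\mathbb{E}(Y|X)$ coincides with the linear regression of $Y$ on $X$.

```python
import numpy as np

# Arbitrary illustrative parameter values for the hierarchical model
a, b, sigma = 1.0, 2.0, 0.5    # X | Y = y ~ N(a + b*y, sigma^2)
mu_s, sigma_s = -1.0, 1.0      # Y ~ N(mu_s, sigma_s^2)

rng = np.random.default_rng(42)
n = 500_000
y = rng.normal(mu_s, sigma_s, size=n)
x = rng.normal(a + b * y, sigma)

denom = sigma**2 + b**2 * sigma_s**2
slope_theory = b * sigma_s**2 / denom
intercept_theory = (mu_s * sigma**2 - a * b * sigma_s**2) / denom
var_theory = sigma**2 * sigma_s**2 / denom

# Linear regression of Y on X recovers the conditional mean;
# the residual variance recovers the conditional variance.
slope, intercept = np.polyfit(x, y, 1)
resid_var = np.var(y - (slope * x + intercept))

print(slope, slope_theory)          # both ~0.4706
print(intercept, intercept_theory)  # both ~-0.5294
print(resid_var, var_theory)        # both ~0.0588
```

With half a million draws, the empirical slope, intercept, and residual variance match the theoretical values to two or three decimal places.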
Here is another counterexample which gives closed-form distributions of $X, Y, X|Y = y$ and $Y|X = x$.
Let $Y, Z \text{ i.i.d. } \sim N(0, 1)$, and define $X = \frac{Z}{Y}$. Then for $y \neq 0$ (the probability of $Y = 0$ is zero), \begin{align} X | Y = y \sim N(0, y^{-2}). \end{align}
On the other hand, it is well known that the marginal distribution of $X$ is the standard Cauchy distribution, i.e., \begin{align} f_X(x) = \frac{1}{\pi(1 + x^2)}, \quad x \in \mathbb{R}. \tag{1} \end{align} The joint distribution of $X$ and $Y$ can be evaluated as follows (where $\Phi$ and $\varphi$ denote the CDF and PDF of the standard normal distribution respectively): \begin{align} & F(x, y) = P[X \leq x, Y \leq y] \\ =& P[Z \leq Yx, Y \leq y, Y > 0] + P[Z \geq Yx, Y \leq y, Y < 0] \\ =& \begin{cases} \int_{-\infty}^y(1 - \Phi(tx))\varphi(t)dt & y < 0, \\[1em] \int_{-\infty}^0(1 - \Phi(tx))\varphi(t)dt + \int_0^y\Phi(tx)\varphi(t)dt & y > 0. \end{cases} \end{align} Therefore, the joint density of $(X, Y)$ is given by \begin{align} & f(x, y) = \frac{\partial^2F(x, y)}{\partial x\partial y} \\ =&\begin{cases} -y\varphi(y)\varphi(yx) & y < 0, \\[1em] y\varphi(y)\varphi(yx) & y > 0 \end{cases} \\ =& \frac{1}{2\pi}|y|e^{-(1 + x^2)y^2/2}. \tag{2} \end{align}
$(1)$ and $(2)$ together yield the conditional density of $Y$ given $X = x$:
\begin{align}
f_{Y|X}(y|X = x) = \frac{f(x, y)}{f_X(x)} =
\frac{1}{2}|y|(1 + x^2)e^{-(1 + x^2)y^2/2}. \tag{3}
\end{align}
Obviously, $(3)$ is not the density of any normal distribution (with $y$ as the variate). Thus $Y | X = x$ is not normal. For example, when $x = 0$, $(3)$ reduces to $\frac{1}{2}|y|e^{-y^2/2}$, a bimodal density that vanishes at the origin.
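The shape of $(3)$ at $x = 0$ can be confirmed numerically; the Python sketch below (assuming SciPy) checks that the density integrates to one, vanishes at the origin, and peaks near $y = \pm 1$, none of which a normal density can do simultaneously.

```python
import numpy as np
from scipy.integrate import quad

def f(y, x=0.0):
    """Conditional density (3) of Y given X = x."""
    c = 1.0 + x**2
    return 0.5 * np.abs(y) * c * np.exp(-c * y**2 / 2)

total = quad(f, -np.inf, np.inf)[0]
print(total)   # ~1: (3) is a proper density
print(f(0.0))  # 0: the density vanishes at the origin
print(f(1.0) > f(0.5), f(1.0) > f(1.5))  # local maximum near y = 1
```

Setting the derivative of $y e^{-y^2/2}$ to zero shows the modes are exactly at $y = \pm 1$.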
P.S. The PDF $(3)$ may be termed a "double generalized gamma distribution", based on these two articles: Generalized gamma distribution and Double Gamma Distribution. In the generalized gamma parametrisation, the parameters are $a = \sqrt{2(1 + x^2)^{-1}}$ (scale) and $d = 2, p = 2$.