How can we prove that a draw $(x,y)$ from the joint probability distribution of two random variables $X$ and $Y$ can be obtained by first generating a draw $x$ from the marginal probability
distribution of $X$ and then a draw $y$ from the conditional probability distribution of $Y$ given $X=x$?
Is this a simple consequence of the definition of a conditional probability distribution?
This is a consequence of the law of total probability. – Stéphane Laurent Jul 13 '22 at 12:22
1 Answer
In short, drawing from the joint distribution is equivalent to first drawing from one marginal and then from the corresponding conditional.
To answer a side question from @statmerkur in the comments, drawing from a distribution $F$ means producing a realisation of a random variable with distribution $F$, either by inducing the phenomenon distributed as $F$ (e.g., the time to an electron emission) or by simulating it (e.g., taking $F^{-1}(U)$ with $U$ a uniform variate when the distribution is univariate).
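To make the simulation step concrete, here is a minimal Python sketch of the inverse-cdf recipe $F^{-1}(U)$ mentioned above; the Exponential(1) target and the sample size are illustrative choices of mine, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Inverse-cdf (inverse-transform) sampling: if U ~ Uniform(0, 1),
# then F^{-1}(U) has distribution F.  Illustration with Exponential(1),
# whose cdf is F(x) = 1 - exp(-x), hence F^{-1}(u) = -log(1 - u).
u = rng.uniform(size=100_000)
x = -np.log(1.0 - u)          # draws from Exponential(1) via F^{-1}(U)

# Sanity check: the empirical mean should be close to 1.
print(x.mean())
```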
Considering that $$\mathbb P(X\in A,\ Y\in B)=\mathbb P(X\in A)\,\mathbb P(Y\in B\mid X\in A),$$ where $\mathbb P(\cdot)$ denotes the probability of the event between parentheses, we have \begin{align} \mathbb P\{X\in (\epsilon,\epsilon+\text d\epsilon),\ Y\in (\eta,\eta+\text d\eta)\}&=\int_{\epsilon}^{\epsilon+\text d\epsilon} \int_{\eta}^{\eta+\text d\eta} f_{Y|X}(y|x)\,\text dy\,f_X(x)\,\text dx\\ &=\mathbb E^X[\mathbb I_{(\epsilon,\epsilon+\text d\epsilon)}(X)\times\mathbb E^{Y|X}\{\mathbb I_{(\eta,\eta+\text d\eta)}(Y)|X\}] \end{align} where [in reply to @singlemalt]
- $f_{Y|X}(\cdot|\cdot)$ denotes the probability density of the conditional distribution of $Y$ given $X$
- $\mathbb I_A(X)$ is the indicator function of the event $X\in A$, equal to either 0 or 1, with $$\mathbb P(X\in A)=\mathbb E^X[\mathbb I_A(X)]$$
- $\mathbb E$ is a blackboard-bold "E" that is commonly used to denote expectations
- the superscript $X$ indicates that the expectation is taken with respect to the distribution of the random variable $X$,
- and the superscript $Y|X$ indicates that the expectation is computed with respect to the conditional distribution of the random variable $Y$ given the random variable $X$, meaning that $X$ is treated as fixed when computing this conditional expectation.
Therefore, if [one simulates] $X\sim f_X(x)$, with realisation $x$, and [then one simulates] $Y\sim f_{Y|X}(y|x)$, $$\mathbb I_{(\epsilon,\epsilon+\text d\epsilon)}(X)\times\mathbb I_{(\eta,\eta+\text d\eta)}(Y)$$ is an unbiased estimator of $$\mathbb E^{(X,Y)}[\mathbb I_{(\epsilon,\epsilon+\text d\epsilon)}(X)\times\mathbb I_{(\eta,\eta+\text d\eta)}(Y)]=\mathbb E^X[\mathbb I_{(\epsilon,\epsilon+\text d\epsilon)}(X)\times\mathbb E^{Y|X}\{\mathbb I_{(\eta,\eta+\text d\eta)}(Y)|X\}]$$ for all $(\epsilon,\eta)$. Letting $\text d\epsilon$ and $\text d\eta$ go to zero allows one to conclude that the joint distribution of $(X,Y)$ is recovered by this marginal-then-conditional decomposition, hence that the simulation produces a realisation from the correct (joint) distribution.
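As a numerical sanity check of this indicator-based argument, here is a short Python sketch that simulates marginal-then-conditional and compares the empirical frequency of a joint event with the product of the marginal and conditional probabilities; the Poisson/Binomial pair and the chosen event are my own illustrative assumptions, not part of the original answer.

```python
import numpy as np
from math import comb, exp, factorial

rng = np.random.default_rng(1)
n = 200_000

# Marginal-then-conditional simulation:
# X ~ Poisson(2), then Y | X = x ~ Binomial(x, 0.4).
lam, p = 2.0, 0.4
x = rng.poisson(lam, size=n)
y = rng.binomial(x, p)

# Indicator-product (Monte Carlo) estimate of P(X = a, Y = b) ...
a, b = 3, 1
estimate = np.mean((x == a) & (y == b))

# ... versus the product P(X = a) * P(Y = b | X = a).
exact = (exp(-lam) * lam**a / factorial(a)) * comb(a, b) * p**b * (1 - p)**(a - b)
print(estimate, exact)   # the two values should agree up to Monte Carlo error
```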
Take the example of a bivariate Normal
$$(X,Y)^\prime\sim\mathcal N_2(0_2,\Sigma)\qquad
\Sigma=\left(\begin{matrix}1 &\rho\cr\rho &1\end{matrix} \right)\qquad \rho\in(-1,1)$$
Then drawing $X$ from the marginal $\mathcal N(0,1)$ is equivalent to setting $X=\epsilon_x$ with $\epsilon_x\sim \mathcal N(0,1)$. Further, drawing $Y$ from the conditional $\mathcal N(\rho x,1-\rho^2)$ is equivalent to setting $$Y=\rho x +\sqrt{1-\rho^2}\epsilon_y\qquad\epsilon_y\sim\mathcal N(0,1)$$
Therefore
$$
\left(\begin{matrix}X\cr Y\cr\end{matrix} \right)=
\overbrace{\left(\begin{matrix}1 &0\cr \rho &\sqrt{1-\rho^2}\cr\end{matrix} \right)}^A
\left(\begin{matrix}\epsilon_x\cr \epsilon_y\cr\end{matrix} \right)
\sim \mathcal N_2(0_2,\Sigma)$$
since
$$A A^\mathsf{T}=\Sigma$$
This implies that the draw is truly from the joint distribution.
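A quick Python check of this bivariate Normal construction, simulating via the marginal-then-conditional recipe above and verifying that the empirical covariance matrix is close to $\Sigma$; the value of $\rho$ and the sample size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
rho, n = 0.7, 500_000

# Marginal step: X = eps_x with eps_x ~ N(0, 1).
eps_x = rng.standard_normal(n)
# Conditional step: Y | X = x ~ N(rho * x, 1 - rho^2),
# i.e. Y = rho * X + sqrt(1 - rho^2) * eps_y with eps_y ~ N(0, 1).
eps_y = rng.standard_normal(n)
x = eps_x
y = rho * x + np.sqrt(1.0 - rho**2) * eps_y

# The empirical covariance matrix should be close to Sigma = [[1, rho], [rho, 1]],
# since (X, Y)' = A (eps_x, eps_y)' with A A^T = Sigma.
print(np.cov(x, y))
```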
Xi'an, what is the superscript $X$ on the exponential symbol? Does the thin hollow rectangle symbol represent identity? – Single Malt Aug 02 '22 at 16:18