Conditional Distribution of Multivariate Gaussian given Linear Inequalities

Question

Consider a multivariate Gaussian $Y\sim\mathcal{N}(\mu,\Sigma)$ of dimension $n$. For fixed $c\in\mathbb{R}^n, A\in\mathbb{R}^{m\times n}$ and $c\in\mathbb{R^m}$, what is the conditional distribution of $c'Y$ given $AY\geq b$ (where the inequalities are componentwise), i.e. what is the distribution of \begin{align*} c'Y|AY\geq b? \end{align*} My thoughts:

If $m=1$ and $A=c'$, then we would be looking at a Gaussian with mean $c'\mu$ and variance $c'\Sigma c$ truncated from below at $b$.
(c,A)Y is jointly Gaussian with known mean and variance and hence known density. Denote this joint distribution by $W\sim\mathcal{N}(\tilde{\mu},\tilde{\Sigma})$ for some $\tilde{\mu},\tilde{\Sigma}$. Then we have to integrate the distribution of $W$ on the set $\{w|w_2\geq b_2,\ldots,w_m\geq b_m\}$ over $w_2,\ldots,w_m$. Looking at Ben's answer from this post, it appears that one factorize the joint density \begin{align*} \int\limits_{b_{-1}}^{\infty^{m-1}}f_W(w)\text{d}w_2\ldots \text{d}w_m \sim& \int\limits_{b_{-1}}^{\infty^{m-1}} \exp(-(w-\tilde{\mu})^t\tilde{\Sigma}(w-\tilde{\mu})) \text{d}w_2\ldots \text{d}w_m\\ \sim& \exp(-(w_1-\mu_1^*)^t\Sigma_1^*(w_1-\mu_1^*)) \\ &\int\limits_{b_{-1}}^{\infty^{m-1}} \exp(-((w_2,\ldots,w_m)-\mu_2^*)^t\Sigma_2^*((w_2,\ldots,w_m)-\mu_2^*)) \text{d}w_2\ldots \text{d}w_m \end{align*} for some $\mu^1,\mu^2,\Sigma_1^*$ and $\Sigma_2^*$ (see Ben's post for the exact expressions). But now, $\int\limits_{0^{m-1}}^{\infty^{m-1}} \exp(-((w_2,\ldots,w_m)-\mu_2^*)^t\Sigma_2^*((w_2,\ldots,w_m)-\mu_2^*)) \text{d}w_2\ldots \text{d}w_m$ is just some number, independent of $w_1$. Hence it appears that the conditional distribution of $W_1|W_2\geq b_2,\ldots, W_m\geq b_m$ is a Gaussian with some mean and variance. But in a simulation, this does not at all appear to be the case. I made a simulation of the desired conditional distribution and it does not look Gaussian at all. Here is the histogram generated with the following reproducible R code:

set.seed(1)
n     <- 1000000
sigma <- 1
n_y   <- 3 
A     <- matrix(runif(n_y*(n_y-1)), nrow = n_y, ncol = n_y-1)
b     <- matrix(runif(n_y-1),ncol=1)
c     <- matrix(1:n_y,       ncol=1)
y     <- 0.5 + sigmamatrix(rnorm(nn_y),nrow = n)
rel = (y%% A[,1] >= b[1]) & (y%% A[,2] >= b[2]) 
hist((y%*% c)[rel], breaks = 100, main = "Histogram of y'c|Ax>=b", xlab = "y'c|Ax>=b")

What went wrong? Note that cor(y%*% A[,1], y%*%c)==0.9949304 is very high, so that this should be relatively close to the truncated Gaussian from the first thought.

Conditional Distribution of Multivariate Gaussian given Linear Inequalities

0 Answers0