
Let $X \sim N_n(\mu, \Sigma)$, such that $AX=b$ where $A$ is a ($p \times n$) matrix, with $p \ll n$. How can I efficiently sample from this distribution?

I've seen techniques using elliptical slice sampling, but that seems to work only if the constraint is an inequality.

I realize I would be sampling from an $(n-p)$-dimensional subspace; that is fine.

  • Linear projection preserves the Normal distribution. – Xi'an Jun 02 '23 at 11:07
  • So then, if I understand correctly, I would need to find some projection matrix P such that it projects any vector to the "plane" Ax=b. Then I can sample y ~ N(P^-1 mu, P^-1 * Sigma * P^-T). Then x = P * y would be a sample from such a distribution? – J. Zeitouni Jun 02 '23 at 11:47
  • I am not sure whether such a sample would exist, because by probability theory $P[AX = b] = 0$ for $X \sim N(\mu, \Sigma)$. – Zhanxiong Jun 02 '23 at 13:09
  • @Zhanxiong. This is definitely true given the base measure, and measure theory is not my forte, but the disintegration theorem seems to suggest that there exists some lower dimensional measure where this probability is well-defined.

    I imagine the solution would involve sampling some $(n-p)$-dimensional vector that probably has a normal distribution, and then applying a linear transformation to it to retrieve my desired vector.

    – J. Zeitouni Jun 02 '23 at 15:25
  • @J.Zeitouni Sampling from a linear subspace is different from sampling something that exactly meets the equality constraint. The theorem you cited seems to be overkill, and it does not need any measure theory to see that $P[AX = b] = 0$ -- all you need is to recognize that $AX \sim N_p(A\mu, A\Sigma A')$ is a continuous random vector, hence the probability that it equals any given vector must be $0$. – Zhanxiong Jun 02 '23 at 16:20
  • You need to find the matrix $P$ such that $Px$ is orthogonal to $Ax$, then apply the rule that $PX\sim\mathcal N(P\mu,P\Sigma P')$. – Xi'an Jun 02 '23 at 16:25
  • From a pure linear algebra pov, first of all, the linear system $Ax = b$ does not necessarily have any solution for a given $b$. When it does, the solution set is an affine space of dimension $n - \operatorname{rank}(A)$. On the other hand, any $N_n(\mu, \Sigma)$ random vector almost surely lives in a space of dimensionality $n \gg p$. That's why this question looks kind of paradoxical to me. – Zhanxiong Jun 02 '23 at 16:33
  • Do you want the conditional distribution of X, given AX = b? And then sample from that distribution? – David Thiessen Jun 02 '23 at 19:33
  • @David what would that look like? I know how to condition given that some elements of a vector are fixed, but not given that a linear combination of elements is fixed. – J. Zeitouni Jun 02 '23 at 20:03
  • Compute the covariance matrix and apply the solution at https://stats.stackexchange.com/questions/63817. – whuber Jun 21 '23 at 18:57

1 Answer


Our approach will be to extend the matrix $A$ to a full rank $n\times n$ matrix $C = \begin{pmatrix}P \\ A \end{pmatrix}$, find the conditional distribution of $PX$ given $AX = b$, sample $Px$, then backtransform to find $x$.

Start by extending $A$ to a full rank matrix by adding $n-p$ of the standard $n$-dimensional basis vectors as rows on top. This is always possible (see https://math.stackexchange.com/questions/4149832). This creates the matrix $C = \begin{pmatrix}P \\ A \end{pmatrix}$ where $P$ is an $(n-p) \times n$ rectangular matrix of $1$s and $0$s.
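As a language-agnostic sketch of this rank-extension step (illustrative only; the example $A$ mirrors the one in the R demo below, and `matrix_rank` is assumed reliable for this well-conditioned case):

```python
import numpy as np

# Illustrative constraint matrix (the same A as in the R demo below).
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.25, -0.5]])
p, n = A.shape

# Prepend standard basis rows e_i whenever they raise the rank,
# stopping once C = (P; A) is a full-rank n x n matrix.
C = A.copy()
identity = np.eye(n)
for i in range(n):
    if np.linalg.matrix_rank(C) == n:
        break
    if np.linalg.matrix_rank(np.vstack([identity[i], C])) > np.linalg.matrix_rank(C):
        C = np.vstack([identity[i], C])

print(C.shape, np.linalg.matrix_rank(C))  # (4, 4) 4
```

Because the standard basis spans $\mathbb{R}^n$, this greedy loop always terminates with an invertible $C$ whose bottom $p$ rows are $A$.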

Consider $CX$ = $\begin{pmatrix}PX \\ AX \end{pmatrix}$ which is jointly normally distributed with mean $\nu = \begin{pmatrix}\nu_1 \\ \nu_2 \end{pmatrix} = C\mu$ and variance $\Omega = \begin{pmatrix}\Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{pmatrix} = C \Sigma C^T$.

From Deriving the conditional distributions of a multivariate normal distribution, the conditional distribution of $PX$ given $AX = b$ is multivariate normal with mean $\mu_{PX|AX} = \nu_1 + \Omega_{12} \Omega_{22}^{-1}(b - \nu_2)$ and variance $\Sigma_{PX|AX} = \Omega_{11} - \Omega_{12}\Omega_{22}^{-1}\Omega_{21}$.

Sample from this distribution using standard methods to find $Px$.

We now have $Cx = \begin{pmatrix} Px \\ b \end{pmatrix}$. Because $C$ is full rank it is invertible, so we can recover $x$ by $x = C^{-1}\begin{pmatrix} Px \\ b \end{pmatrix}$, giving us the desired sample.
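The whole pipeline can also be sketched in Python/NumPy as a cross-check (illustrative only; the data mirror the R demo below, and the choice $P = \begin{pmatrix} e_2 \\ e_3 \end{pmatrix}$ is an assumption that happens to make $C$ invertible for this particular $A$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Data mirroring the R demo; P = (e2; e3) is an assumed choice that
# makes C = (P; A) invertible for this particular A.
n, p = 4, 2
mu = np.array([0.0, 1.0, 2.0, 3.0])
Sigma = np.diag([1.0, 1.0, 2.0, 2.0])
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.25, -0.5]])
b = np.array([0.5, 1.0])

P = np.eye(n)[[1, 2]]
C = np.vstack([P, A])

# Joint moments of CX = (PX, AX), partitioned into blocks.
nu = C @ mu
Omega = C @ Sigma @ C.T
k = n - p
O11, O12 = Omega[:k, :k], Omega[:k, k:]
O21, O22 = Omega[k:, :k], Omega[k:, k:]

# Conditional moments of PX given AX = b (Schur complement).
cond_mean = nu[:k] + O12 @ np.linalg.solve(O22, b - nu[k:])
cond_var = O11 - O12 @ np.linalg.solve(O22, O21)

# Draw PX | AX = b via a Cholesky factor, then back-transform.
L = np.linalg.cholesky(cond_var)
px = cond_mean + L @ rng.standard_normal(k)
x = np.linalg.solve(C, np.concatenate([px, b]))

print(A @ x)  # recovers b up to floating-point error
```

Every draw produced this way satisfies $Ax = b$ exactly (up to floating-point error), since the last $p$ coordinates of $Cx$ are pinned to $b$ before inverting $C$.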

Below is a short R program demonstrating the approach.

library(MASS)
# Set constants
n <- 4
p <- 2
mu <- c(0, 1, 2, 3)
sigma <- matrix(data = c(1, 0, 0, 0,
                         0, 1, 0, 0,
                         0, 0, 2, 0,
                         0, 0, 0, 2),
                nrow = n, ncol =  n)
A <- matrix(data = c(1, 0, 0, 0,
                     0, 1, 0.25, -0.5),
            nrow = p, ncol = n, byrow = TRUE)
b <- c(0.5, 1)
num_sim <- 100

# Extend the matrix A by adding standard basis vectors as rows
# This is a bit inefficient programming, but it is an example
C <- A
diag_mat <- diag(x = 1, nrow = n)
while (qr(C)$rank < n) {
  for (i in 1:n) {
    if (qr(rbind(diag_mat[i, ], C))$rank > qr(C)$rank) {
      C <- rbind(diag_mat[i, ], C)
      break
    }
  }
}

# Find the conditional mean and variance of PX given AX = b
mu_CX <- C %*% mu
var_CX <- C %*% sigma %*% t(C)
var_CX_11 <- var_CX[1:(n-p), 1:(n-p)]
var_CX_12 <- var_CX[1:(n-p), (n-p+1):n]
var_CX_21 <- var_CX[(n-p+1):n, 1:(n-p)]
var_CX_22 <- var_CX[(n-p+1):n, (n-p+1):n]
conditionalmean <- mu_CX[1:(n-p), , drop = FALSE] +
  (var_CX_12 %*% solve(var_CX_22) %*% (b - mu_CX[(n-p+1):n, , drop = FALSE]))
conditionalvar <- var_CX_11 - (var_CX_12 %*% solve(var_CX_22) %*% var_CX_21)

# Generate data from the conditional distribution of PX given AX = b
px <- mvrnorm(n = num_sim, mu = conditionalmean, Sigma = conditionalvar)

# Convert to values of X
xmat <- matrix(nrow = num_sim, ncol = n)
for (i in 1:num_sim) {
  xmat[i, ] <- solve(C) %*% rbind(t(px[i, , drop = FALSE]), t(t(b)))
}

# View a few results
head(xmat)
#>      [,1]      [,2]      [,3]     [,4]
#> [1,]  0.5 1.4323802 2.4850736 2.107297
#> [2,]  0.5 2.6597197 0.3644071 3.501643
#> [3,]  0.5 2.9192197 2.3412025 5.009041
#> [4,]  0.5 1.7250687 0.7885823 1.844429
#> [5,]  0.5 1.4179704 1.2767347 1.474308
#> [6,]  0.5 0.7696248 3.2404795 1.159489
A %*% xmat[1,]
#>      [,1]
#> [1,]  0.5
#> [2,]  1.0
A %*% xmat[2,]
#>      [,1]
#> [1,]  0.5
#> [2,]  1.0

Created on 2023-06-21 with reprex v2.0.2