
Let $X \sim N_n(\mu, \Sigma)$, such that $AX=b$ where $A$ is a ($p \times n$) matrix, with $p \ll n$. How can I efficiently sample from this distribution?

I've seen techniques using elliptical slice sampling, but that seems to work only if the constraint is an inequality.

I realize I would be sampling from an $(n-p)$-dimensional subspace; that is fine.

  • Linear projection preserves the Normal distribution. – Xi'an Jun 02 '23 at 11:07
  • So then, if I understand correctly, I would need to find some projection matrix P such that it projects any vector to the "plane" Ax=b. Then I can sample y ~ N(P^-1 mu, P^-1 * Sigma * P^-T). Then x = P * y would be a sample from such a distribution? – J. Zeitouni Jun 02 '23 at 11:47
  • I am not sure whether such a sample would exist, because by probability theory $P[AX = b] = 0$ for $X \sim N(\mu, \Sigma)$. – Zhanxiong Jun 02 '23 at 13:09
  • @Zhanxiong. This is definitely true given the base measure, and measure theory is not my forte, but the disintegration theorem seems to suggest that there exists some lower dimensional measure where this probability is well-defined.

    I imagine the solution would involve sampling some $(n-p)$-dimensional vector that probably has a normal distribution, and then applying a linear transformation to it to retrieve my desired vector.

    – J. Zeitouni Jun 02 '23 at 15:25
  • @J.Zeitouni Sampling from a linear subspace is different from sampling something that exactly meets the equality constraint. The theorem you cited seems to be overkill, and it does not need any measure theory to see that $P[AX = b] = 0$ -- all you need is to recognize that $AX \sim N_p(A\mu, A\Sigma A')$ is a continuous random vector, hence the probability that it equals any given vector must be $0$. – Zhanxiong Jun 02 '23 at 16:20
  • You need to find the matrix $P$ such that $Px$ is orthogonal to $Ax$, then apply the rule that $PX\sim\mathcal N(P\mu,P\Sigma P')$. – Xi'an Jun 02 '23 at 16:25
  • From a pure linear algebra pov, first of all, the linear system $Ax = b$ does not necessarily have any solution for a given $b$. When it does, the solution set is an affine space of dimension $n - \operatorname{rank}(A)$. On the other hand, any $N_n(\mu, \Sigma)$ random vector almost surely lives in a space of dimensionality $n \gg p$. That's why this question looks kind of paradoxical to me. – Zhanxiong Jun 02 '23 at 16:33
  • Do you want the conditional distribution of X, given AX = b? And then sample from that distribution? – David Thiessen Jun 02 '23 at 19:33
  • @David what would that look like? I know how to condition given that some elements of a vector are fixed, but not given that a linear combination of elements is fixed. – J. Zeitouni Jun 02 '23 at 20:03
  • Compute the covariance matrix and apply the solution at https://stats.stackexchange.com/questions/63817. – whuber Jun 21 '23 at 18:57

1 Answer


Our approach will be to extend the matrix $A$ to a full rank $n\times n$ matrix $C = \begin{pmatrix}P \\ A \end{pmatrix}$, find the conditional distribution of $PX$ given $AX = b$, sample $Px$, then backtransform to find $x$.

Start by extending $A$ to a full rank matrix by adding $n-p$ of the standard $n$-dimensional basis vectors as rows on top. This is always possible (see https://math.stackexchange.com/questions/4149832). This creates the matrix $C = \begin{pmatrix}P \\ A \end{pmatrix}$ where $P$ is an $(n-p) \times n$ rectangular matrix of $1$s and $0$s.
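As a language-agnostic sketch of this rank-extension step (illustrative only; the example $A$ mirrors the one in the R demo below, and `matrix_rank` is assumed reliable for this well-conditioned case):

```python
import numpy as np

# Illustrative constraint matrix (the same A as in the R demo below).
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.25, -0.5]])
p, n = A.shape

# Prepend standard basis rows e_i whenever they raise the rank,
# stopping once C = (P; A) is a full-rank n x n matrix.
C = A.copy()
identity = np.eye(n)
for i in range(n):
    if np.linalg.matrix_rank(C) == n:
        break
    if np.linalg.matrix_rank(np.vstack([identity[i], C])) > np.linalg.matrix_rank(C):
        C = np.vstack([identity[i], C])

print(C.shape, np.linalg.matrix_rank(C))  # (4, 4) 4
```

Because the standard basis spans $\mathbb{R}^n$, this greedy loop always terminates with an invertible $C$ whose bottom $p$ rows are $A$.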

Consider $CX$ = $\begin{pmatrix}PX \\ AX \end{pmatrix}$ which is jointly normally distributed with mean $\nu = \begin{pmatrix}\nu_1 \\ \nu_2 \end{pmatrix} = C\mu$ and variance $\Omega = \begin{pmatrix}\Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{pmatrix} = C \Sigma C^T$.

From Deriving the conditional distributions of a multivariate normal distribution, the conditional distribution of $PX$ given $AX = b$ is multivariate normal with mean $\mu_{PX|AX} = \nu_1 + \Omega_{12} \Omega_{22}^{-1}(b - \nu_2)$ and variance $\Sigma_{PX|AX} = \Omega_{11} - \Omega_{12}\Omega_{22}^{-1}\Omega_{21}$.

Sample from this distribution using standard methods to find $Px$.

We now have $Cx = \begin{pmatrix} Px \\ b \end{pmatrix}$. Because $C$ is full rank it is invertible, so we can recover $x$ by $x = C^{-1}\begin{pmatrix} Px \\ b \end{pmatrix}$, giving us the desired sample.
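The whole pipeline can also be sketched in Python/NumPy as a cross-check (illustrative only; the data mirror the R demo below, and the choice $P = \begin{pmatrix} e_2 \\ e_3 \end{pmatrix}$ is an assumption that happens to make $C$ invertible for this particular $A$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Data mirroring the R demo; P = (e2; e3) is an assumed choice that
# makes C = (P; A) invertible for this particular A.
n, p = 4, 2
mu = np.array([0.0, 1.0, 2.0, 3.0])
Sigma = np.diag([1.0, 1.0, 2.0, 2.0])
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.25, -0.5]])
b = np.array([0.5, 1.0])

P = np.eye(n)[[1, 2]]
C = np.vstack([P, A])

# Joint moments of CX = (PX, AX), partitioned into blocks.
nu = C @ mu
Omega = C @ Sigma @ C.T
k = n - p
O11, O12 = Omega[:k, :k], Omega[:k, k:]
O21, O22 = Omega[k:, :k], Omega[k:, k:]

# Conditional moments of PX given AX = b (Schur complement).
cond_mean = nu[:k] + O12 @ np.linalg.solve(O22, b - nu[k:])
cond_var = O11 - O12 @ np.linalg.solve(O22, O21)

# Draw PX | AX = b via a Cholesky factor, then back-transform.
L = np.linalg.cholesky(cond_var)
px = cond_mean + L @ rng.standard_normal(k)
x = np.linalg.solve(C, np.concatenate([px, b]))

print(A @ x)  # recovers b up to floating-point error
```

Every draw produced this way satisfies $Ax = b$ exactly (up to floating-point error), since the last $p$ coordinates of $Cx$ are pinned to $b$ before inverting $C$.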

Below is a short R program demonstrating the approach.

library(MASS)
# Set constants
n <- 4
p <- 2
mu <- c(0, 1, 2, 3)
sigma <- matrix(data = c(1, 0, 0, 0,
                         0, 1, 0, 0,
                         0, 0, 2, 0,
                         0, 0, 0, 2),
                nrow = n, ncol =  n)
A <- matrix(data = c(1, 0, 0, 0,
                     0, 1, 0.25, -0.5),
            nrow = p, ncol = n, byrow = TRUE)
b <- c(0.5, 1)
num_sim <- 100

# Extend the matrix A by adding standard basis vectors as rows
# This is a bit inefficient programming, but it is an example
C <- A
diag_mat <- diag(x = 1, nrow = n)
while (qr(C)$rank < n) {
  for (i in 1:n) {
    if (qr(rbind(diag_mat[i, ], C))$rank > qr(C)$rank) {
      C <- rbind(diag_mat[i, ], C)
      break
    }
  }
}

# Find the conditional mean and variance of PX given AX = b
mu_CX <- C %*% mu
var_CX <- C %*% sigma %*% t(C)
var_CX_11 <- var_CX[1:(n-p), 1:(n-p)]
var_CX_12 <- var_CX[1:(n-p), (n-p+1):n]
var_CX_21 <- var_CX[(n-p+1):n, 1:(n-p)]
var_CX_22 <- var_CX[(n-p+1):n, (n-p+1):n]
conditionalmean <- mu_CX[1:(n-p), , drop = FALSE] +
  (var_CX_12 %*% solve(var_CX_22) %*% (b - mu_CX[(n-p+1):n, , drop = FALSE]))
conditionalvar <- var_CX_11 - (var_CX_12 %*% solve(var_CX_22) %*% var_CX_21)

# Generate data from the conditional distribution of PX given AX = b
px <- mvrnorm(n = num_sim, mu = conditionalmean, Sigma = conditionalvar)

# Convert to values of X
xmat <- matrix(nrow = num_sim, ncol = n)
for (i in 1:num_sim) {
  xmat[i, ] <- solve(C) %*% rbind(t(px[i, , drop = FALSE]), t(t(b)))
}

# View a few results
head(xmat)
#>      [,1]      [,2]      [,3]     [,4]
#> [1,]  0.5 1.4323802 2.4850736 2.107297
#> [2,]  0.5 2.6597197 0.3644071 3.501643
#> [3,]  0.5 2.9192197 2.3412025 5.009041
#> [4,]  0.5 1.7250687 0.7885823 1.844429
#> [5,]  0.5 1.4179704 1.2767347 1.474308
#> [6,]  0.5 0.7696248 3.2404795 1.159489
A %*% xmat[1,]
#>      [,1]
#> [1,]  0.5
#> [2,]  1.0
A %*% xmat[2,]
#>      [,1]
#> [1,]  0.5
#> [2,]  1.0

Created on 2023-06-21 with reprex v2.0.2