0

Assume I have some d-dimensional multivariate gaussian $X\sim\mathcal{N}\left(\mu,\Sigma\right)$

and some sphere $C=\left\{ x:\left\Vert x-z\right\Vert_2\le r\right\}\subseteq\mathbb{R}^{d}$.

I was wondering if there is a closed form expression that measures the probability mass inside the sphere: $$P\left(X\in C\right)=\int_{x\in C}\frac{1}{\sqrt{\left(2\pi\right)^{d}\left|\Sigma\right|}}\exp\left(-\frac{1}{2}\left(x-\mu\right)^{\top}\Sigma^{-1}\left(x-\mu\right)\right)dx$$

Some attempts of simpler cases:

  1. In the 1-dimensional case, the sphere $C$ boils down to a segment $\left[a,b\right]$. One can use the Error Function: $$P\left(X\in\left[a,b\right]\right)=\text{erf}\left(b\right)-\text{erf}\left(a\right)$$

  2. If $X\sim N\left(0,I\right)$ and $z=0$ (the sphere is centered around the origin) we have that: $$P\left(\left\Vert X \right\Vert _{2}<r\right)=P\left(\left\Vert X \right\Vert^2_2<r^{2}\right)=P\left(\sum_{i=1}^{d}X_{i}^{2}<r^{2}\right)$$ and we have that $\sum_{i}X_{i}^{2}\sim\chi_{d}^{2}$, so we can use the CDF of the Chi-Squared distribution.

What I wasn't able to solve yet:

What can I say if $\mu \ne z$? or when $\Sigma \ne c\cdot I$? Is there a closed form solution?

Some relevant links

What is the probability that a multivariate Normal RV lies within a sphere of radius R?

N-DIMENSIONAL CUMULATIVE FUNCTION, AND OTHER USEFUL FACTS ABOUT GAUSSIANS AND NORMAL DENSITIES

Thank you very much, this is my first question, so I hope it is clear.

Edit: If there are related cases that do have a closed form, I'd love to hear (e.g rectangle instead of a sphere?)

Avihu
  • 1
  • 1
    No, there is no closed-form expression — but there might be a more satisfying response to a question asking directly about whatever problems motivated this. – Matt F. Apr 17 '22 at 08:52
  • Thanks for the quick response! Do you have a reference for this fact? – Avihu Apr 17 '22 at 08:55
  • The motivation to this is mostly interest. Gaussians are interesting! – Avihu Apr 17 '22 at 08:58
  • 1
    No, there is no good reference; this sort of question doesn’t lend itself to good proofs, though the empirical evidence about similar questions on this site ought to be strong enough for anyone whose interest is just casual curiosity. – Matt F. Apr 17 '22 at 09:10

1 Answers1

2

It depends on what you would call a closed-form expression. I say this because the Gaussian and $\chi^2$ CDFs are already non-elementary functions. However, we have deterministic methods that allow us to compute them to arbitrary precision for any parameter value. In this latter sense, there is indeed a closed-form solution for the most general problem you consider.

In fact, the problem you pose is equivalent (through a change of variables) to finding the Gaussian mass of a hyperellipsoid in general position. To see this, let

  • $X \sim \mathcal{N}(\mu, \Sigma)$ be an $n$-dimensional Gaussian random variable
  • $Q(x) = (x - c)^\top \Lambda (x - c)$ be a quadratic form
  • $\mathcal{E}(r \mid c, \Lambda) = \{y \in \mathbb{R}^n \mid Q(y) \leq r^2 \}$ be the hyperellipsoid of radius $r$ defined by $Q$.

Then, we are interested in computing $$ \mathcal{I}(r) = \int_{\mathcal{E}(r \mid c, \Lambda)} \mathcal{N}(x \mid \mu, \Sigma) \, d x, $$ where, by slight abuse of notation, I used $\mathcal{N}(x \mid \mu, \Sigma)$ to refer to the density of the Gaussian distribution with mean $\mu$ and covariance $\Sigma$ evaluated at the point $x$.

Let $LL^\top = \Sigma$ be the Cholesky factorization of $\Sigma$. Then, introducing the change of variables $u = L^{-1}(x - \mu)$, we can rewrite the integral as $$ \mathcal{I}(r) = \int_{\mathcal{E}(r \mid \tilde{c}, \tilde{\Lambda})} \mathcal{N}(u \mid 0, I) \, d u, $$ where $$ \tilde{c} = L^{-1}c - \mu \quad \text{ and } \quad \tilde{\Lambda} = L^\top \Lambda L. $$ Now, $\mathcal{I}(r)$ is the standard Gaussian mass of a hyperellipsoid in general position. Next, we can use the rotational symmetry of the standard Gaussian density to axis-align the hyperellipsoid. Let $O D O^\top$ be the eigendecomposition of $L^\top \Lambda L$, with $D = \mathrm{diag}(\alpha_1, \dots, \alpha_n)$ is a diagonal matrix containing the eigenvalues. Introducing a second change of variables $u = O v$, we get $$ \mathcal{I}(r) = \int_{\mathcal{E}(r \mid \hat{c}, D)} \mathcal{N}(v \mid 0, I) \, d v $$ where $\hat{c} = O^\top \tilde{c}$. Then, by definition, we have $$ \begin{align} \mathcal{I}(r) &= \int_{\mathbb{R}^n} \mathbf{1}\left[(v - \hat{c})^\top D (v - \hat{c}) < r^2 \right] \mathcal{N}(v \mid 0, I) \, d v \\ &= \int_{\mathbb{R}^n} \mathbf{1}\left[\sum_{i = 1}^n \alpha_i (v_i - c_i)^2 < r^2 \right] \mathcal{N}(v \mid 0, I) \, d v \\ &= \mathbb{P}\left[\sum_{i = 1}^n \alpha_i \chi^2_{1, c_i^2} < r^2 \right], \end{align} $$ where $\mathbf{1}[\cdot]$ is the indicator function and $\chi^2_{\nu, \lambda^2}$ is a noncentral chi-squared random variable with $\nu$ degrees of freedom and noncentrality $\lambda^2$. Making the definition $$ Q = \sum_{i = 1}^n \alpha_i \chi^2_{1, c_i^2}, $$ we say that $Q$ follows a generalized chi-square distribution. There is no particularly nice expression for the CDF for general $\mu$ and $\Sigma$. However, its characteristic function does have a nice form and can be numerically inverted to compute the CDF to arbitrary precision, e.g. using the method from [Imhof, 1961].

A noteworthy special case of the above, besides the ones you mention in your question, is when $\Sigma = \gamma^2 I$ and $\Lambda = \delta^2 I$ for some $\gamma, \delta > 0$ and $\mu$ is arbitrary. In this case, all the $\alpha_i$s are equal, and we find $$ \mathcal{I}(r) = \mathbb{P}\left[\chi^2_{n, \lVert \hat{c} \rVert^2} \leq \frac{r^2}{\gamma^2 \delta^2} \right], $$ that is, the volume can be computed as the CDF of a noncentral $\chi^2$ random variable evaluated at an appropriate point. This CDF is known as the complementary generalized Marcum $Q$-function and has many nice infinite series representations that can be used to evaluate it to arbitrary precision. This function is also available in many statistical packages, such as R and scipy.

References

Imhof, J. P. (1961). Computing the distribution of quadratic forms in normal variables. Biometrika, 48(3/4), 419-426.