17

I have random variables $X_0,X_1,\dots,X_n$. $X_0$ has a normal distribution with mean $\mu>0$ and variance $1$, while $X_1,\dots,X_n$ are normally distributed with mean $0$ and variance $1$. All of these random variables are mutually independent.

Let $E$ denote the event that $X_0$ is the largest of these, i.e., $X_0 > \max(X_1,\dots,X_n)$. I want to calculate or estimate $\Pr[E]$. I'm looking for an expression for $\Pr[E]$, as a function of $\mu,n$, or a reasonable estimate or approximation for $\Pr[E]$.

In my application, $n$ is fixed ($n=61$) and I want to find the smallest value for $\mu$ that makes $\Pr[E] \ge 0.99$, but I'm curious about the general question as well.

D.W.
  • 6,668
  • How large is $n$? There ought to be some good asymptotic expressions based on large-sample theory. – whuber Nov 15 '12 at 19:59
  • @whuber, thanks! I edited the question: in my case $n=61$. Even if $n=61$ isn't large enough to count as large, if there are good asymptotic estimates in the case where $n$ is large, that'd be interesting. – D.W. Nov 15 '12 at 20:18
  • 6
    Using numerical integration, $\mu \approx 4.91912496$. – whuber Nov 15 '12 at 20:33

3 Answers

19

The calculation of such probabilities has been studied extensively by communications engineers under the name $M$-ary orthogonal signaling. The model is that one of $M$ equal-energy, equally likely orthogonal signals is transmitted, and the receiver attempts to decide which one was transmitted by examining the outputs of $M$ filters matched to the signals. Conditioned on the identity of the transmitted signal, the sampled outputs of the matched filters are (conditionally) independent unit-variance normal random variables. The sampled output of the filter matched to the transmitted signal is a $N(\mu,1)$ random variable, while the outputs of all the other filters are $N(0,1)$ random variables.

The conditional probability of a correct decision (which in the present context is the event the OP calls $E$, here denoted $C = \{X_0 > \max_i X_i\}$) conditioned on $X_0 = \alpha$ is $$P(C \mid X_0 = \alpha) = \prod_{i=1}^n P\{X_i < \alpha \mid X_0 = \alpha\} = \left[\Phi(\alpha)\right]^n$$ where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal random variable, and hence the unconditional probability is $$P(C) = \int_{-\infty}^{\infty}P(C \mid X_0 = \alpha) \phi(\alpha-\mu)\,\mathrm d\alpha = \int_{-\infty}^{\infty}\left[\Phi(\alpha)\right]^n \phi(\alpha-\mu)\,\mathrm d\alpha$$ where $\phi(\cdot)$ is the standard normal density function. There is no closed-form expression for the value of this integral, which must be evaluated numerically.

Engineers are also interested in the complementary event -- that the decision is in error -- but do not like to compute its probability as $$P\{X_0 < \max_i X_i\} = 1-P(C)$$ because this requires very careful evaluation of the integral for $P(C)$ to an accuracy of many significant digits, and such evaluation is both difficult and time-consuming. Instead, the integral for $1-P(C)$ can be integrated by parts to get $$P\{X_0 < \max_i X_i\} = \int_{-\infty}^{\infty} n \left[\Phi(\alpha)\right]^{n-1}\phi(\alpha) \Phi(\alpha - \mu)\,\mathrm d\alpha.$$ This integral is easier to evaluate numerically, and its value as a function of $\mu$ is graphed and tabulated (though unfortunately only for $n \leq 20$) in Chapter 5 of Telecommunication Systems Engineering by Lindsey and Simon, Prentice-Hall 1973, Dover Press 1991.

Alternatively, engineers use the union bound or Bonferroni inequality $$\begin{align*} P\{X_0 < \max_i X_i\} &= P\left\{(X_0 < X_1)\cup (X_0 < X_2) \cup \cdots \cup (X_0 < X_n)\right\}\\ &\leq \sum_{i=1}^{n}P\{X_0 < X_i\}\\ &= nQ\left(\frac{\mu}{\sqrt{2}}\right) \end{align*}$$ where $Q(x) = 1-\Phi(x)$ is the complementary cumulative distribution function of the standard normal; the last step uses the fact that $X_0 - X_i \sim N(\mu, 2)$, so that $P\{X_0 < X_i\} = Q(\mu/\sqrt{2})$.
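To make this concrete, here is a minimal Python sketch (my own illustration using SciPy's `quad` integrator, not part of the original answer) that evaluates both integrals at the value of $\mu$ quoted in the comments above:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def prob_correct(mu, n):
    """P(C): integral of Phi(a)^n * phi(a - mu) over a."""
    f = lambda a: norm.cdf(a) ** n * norm.pdf(a - mu)
    return quad(f, mu - 10.0, mu + 10.0)[0]  # integrand is negligible outside

def prob_error(mu, n):
    """P{X_0 < max X_i}: the integrated-by-parts form, accurate when small."""
    f = lambda a: n * norm.cdf(a) ** (n - 1) * norm.pdf(a) * norm.cdf(a - mu)
    return quad(f, -10.0, mu + 10.0)[0]

mu, n = 4.91912496, 61  # whuber's value from the comments above
print(1.0 - prob_correct(mu, n))  # ~0.01, computed as a difference of near-1 values
print(prob_error(mu, n))          # ~0.01, computed directly
```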

From the union bound, we see that the desired value $0.01$ for $P\{X_0 < \max_i X_i\}$ is bounded above by $61\cdot Q(\mu/\sqrt{2})$, and this bound equals $0.01$ at $\mu = 5.08\ldots$. This is slightly larger than the more exact value $\mu = 4.919\ldots$ obtained by @whuber by numerical integration.
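A sketch of the corresponding root-finding step (again my own illustration, using SciPy's `brentq` with a bracketing interval chosen by inspection):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm

n = 61

def prob_error(mu):
    # P{X_0 < max X_i} via the integrated-by-parts integral above
    f = lambda a: n * norm.cdf(a) ** (n - 1) * norm.pdf(a) * norm.cdf(a - mu)
    return quad(f, -10.0, mu + 10.0)[0]

# smallest mu with P{X_0 < max X_i} = 0.01 (numerical integration + root finding)
mu_exact = brentq(lambda mu: prob_error(mu) - 0.01, 3.0, 8.0)

# mu at which the union bound n * Q(mu / sqrt(2)) equals 0.01
mu_union = brentq(lambda mu: n * norm.sf(mu / np.sqrt(2.0)) - 0.01, 3.0, 8.0)

print(mu_exact, mu_union)  # roughly 4.919 and 5.08
```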

More discussion and details about $M$-ary orthogonal signaling can be found on pp. 161-179 of my lecture notes for a class on communication systems.

Dilip Sarwate
  • 46,658
6

A formal answer:

The probability distribution (density) for the maximum of $N$ i.i.d. variates is: $p_N(x)= N p(x) \Phi^{N-1}(x)$ where $p$ is the probability density and $\Phi$ is the cumulative distribution function.

From this you can calculate the probability that $X_0$ is greater than the $N-1$ other ones via $$P(E) = (N-1) \int_{-\infty}^{\infty} \int_y^{\infty} p(x_0)\, p(y)\, \Phi^{N-2}(y)\, \mathrm dx_0\, \mathrm dy$$ where $p(x_0)$ denotes the $N(\mu,1)$ density of $X_0$, while $p(y)$ and $\Phi(y)$ refer to the standard normal density and distribution function of the other variates.
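A direct numerical check of this double integral in Python (my own sketch; `dblquad` integrates the inner variable $x_0$ from $y$ to $\infty$):

```python
import numpy as np
from scipy.integrate import dblquad
from scipy.stats import norm

mu, N = 4.91912496, 62  # N = 62 variates in total: X_0 plus the OP's 61 others

# Outer variable y over (-10, mu + 10); inner variable x0 from y to infinity.
# dblquad expects the integrand as f(inner, outer), i.e. f(x0, y).
val, _ = dblquad(
    lambda x0, y: norm.pdf(x0 - mu) * norm.pdf(y) * norm.cdf(y) ** (N - 2),
    -10.0, mu + 10.0,
    lambda y: y, lambda y: np.inf,
)
print((N - 1) * val)  # ~0.99, i.e. P(E) = P(X_0 is the largest)
```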

You may need to look into various approximations in order to tractably deal with this for your specific application.
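Whatever approximation you settle on, a quick Monte Carlo estimate makes a useful sanity check (a minimal sketch for the OP's parameters; the sample size is chosen to keep memory modest):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, n, trials = 4.919, 61, 100_000

x0 = rng.normal(mu, 1.0, size=trials)            # the shifted variable X_0
others = rng.normal(0.0, 1.0, size=(trials, n))  # the n standard normals
p_hat = np.mean(x0 > others.max(axis=1))

print(p_hat)  # ~0.99; the Monte Carlo standard error here is about 3e-4
```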

Dave
  • 3,259
  • 9
    +1 Actually, the double integral simplifies into a single integral since $$\int_y^\infty p(x_0)\,\mathrm dx_0 = 1 - \Phi(y-\mu)$$ giving $$P(E) = 1 - (N-1)\int_{-\infty}^\infty \Phi^{N-2}(y)p(y)\Phi(y-\mu)\,\mathrm dy$$ which is the same as in my answer. – Dilip Sarwate Nov 15 '12 at 23:35
1

We have

$$\begin{align} \mathbb{P}(\mathcal{E}) &= \mathbb{P}(\bigcap_{i=1}^n \{X_0 \ge X_i \}) = \mathbb{P}(\bigcap_{i=1}^n \{X_0 -X_i \ge 0 \}) \end{align}$$

Let us denote $Z_i = X_0 - X_i$ for $i = 1,\dots,n$. Then the vector $(Z_1,\dots,Z_n)$ follows the $n$-variate normal distribution $\mathcal{N}_n(\Gamma,\Sigma)$, where $\Gamma \in \mathbb{R}^n$ is the mean vector and $\Sigma\in \mathbb{R}^{n \times n}$ is the covariance matrix. These are given by $$\Gamma_i=\mathbb{E}(Z_i) =\mu \hspace{1cm} \forall i=1,\dots,n$$ $$\Sigma_{ij} = \text{Cov}(Z_i,Z_j) = \left\{ \begin{array}{ll} 2 & \mbox{if } i=j \\ 1 & \mbox{if } i\ne j \\ \end{array} \right. \tag{1} $$

Then, the probability of the event $\mathcal{E}$ can be calculated with a closed-form expression as follows

$$\color{red}{\mathbb{P}(\mathcal{E}) = \Phi_n \left(\mathbf{l}, \mathbf{u};\mathbf{0}_n;\Sigma \right)}$$ with

  • $\Phi_n(\cdot)$ is the multivariate normal probability, i.e., the probability that an $\mathcal{N}_n(\mathbf{0}_n,\Sigma)$ vector lies between $\mathbf{l}$ and $\mathbf{u}$
  • $\mathbf{0}_n \in \mathbb{R}^n$ is the zero mean vector (the means $\mu$ have been shifted into the lower vector $\mathbf{l}$)
  • $\Sigma$ is defined in $(1)$
  • $\mathbf{l} \in \mathbb{R}^n$ is the lower vector, with all elements equal to $-\mu$
  • $\mathbf{u} \in \mathbb{R}^n$ is the upper vector, with all elements equal to $+\infty$

For the computation of $\Phi_n$: as far as I know, Genz is the best-known author in the field, and almost all scientific programming environments implement his algorithms (e.g., Python, Matlab, R, Fortran, C++).
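For example, in Python, SciPy's `scipy.stats.multivariate_normal` provides a CDF built on Genz's quadrature; here is a minimal sketch of this answer's formula (my own illustration, turning $P(\text{all } Z_i \ge 0)$ into a CDF evaluation via $W = -Z$):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu, n = 4.91912496, 61

# Covariance from equation (1): 2 on the diagonal, 1 off the diagonal
Sigma = np.ones((n, n)) + np.eye(n)

# P(all Z_i >= 0) with Z ~ N(mu*1, Sigma) equals P(W <= 0) for W = -Z,
# and W ~ N(-mu*1, Sigma) since negation leaves the covariance unchanged.
p = multivariate_normal(mean=-mu * np.ones(n), cov=Sigma).cdf(np.zeros(n))
print(p)  # ~0.99; the underlying algorithm is randomized, so results vary slightly
```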

NN2
  • 188
  • 1
    A formula (with lots of fancy notation) to describe the desired probability is awesome, but it raises the question of how one computes the numerical value of the desired probability. $\Phi_n(\cdot)$ is not a tabulated function for $n>2$, nor are there widely known techniques for computing $\Phi_n$. Compare this to $\Phi_1$, for which most modern "scientific" calculators have buttons to do the calculation. – Dilip Sarwate Sep 22 '23 at 21:53
  • I don't think my formula has a lot of fancy notation; in mathematics only accurate notation is required, and I used only what is necessary. The OP asked for an expression for $\mathbb{P}(\mathcal{E})$ and I provided the exact, analytical formula as requested. As for the computation, I don't know of any calculator with a button for $\Phi_1$, but all modern scientific programming languages (Matlab, Python, Fortran, R, C++, ...) have a function for $\Phi_n$ (so we can compute $\Phi_n$ with just one line of code). I have added some links at the end of the answer. – NN2 Sep 23 '23 at 06:21
  • Could you elaborate on what you mean by "$\Phi_n$" then? The only correct implementations of this answer that I am aware of require numerical integration and certainly do not reduce to a calculation of the error function. – whuber Sep 23 '23 at 15:11
  • @whuber $\Phi_n$ is the multivariate normal probability (aka the cumulative distribution function of a multivariate normal distribution). (Ref: Genz, Numerical computation of multivariate normal probabilities.) – NN2 Sep 23 '23 at 15:22
  • In that case, you are responding to @Dilip's point by saying that the numerical integration has been implemented in Matlab, etc. That's not the standard meaning of an "analytic" solution. That's not necessarily an important issue. It is indeed helpful to know of different but equivalent ways of expressing the same answer to a problem. – whuber Sep 23 '23 at 15:59
  • @whuber Ah, I thought analytical solution = solution with closed-form expression, but it seems not to be the case. – NN2 Sep 23 '23 at 16:15
  • @DilipSarwate For the second comment, please read it as "I don't think my formula has a lot of fancy notation, because in mathematics, only accurate notation is required and I put only the necessary notation. The OP asked for an expression of P(E) and I provided the exact formula with closed-form expression as requested...." – NN2 Sep 23 '23 at 16:16