
Recall that a distribution function (càdlàg, nondecreasing, etc.) $F$ is said to be stochastically dominated by a distribution function $G$ if $F(x)\geq G(x)$ for all $x \in \mathbb{R}$. The following theorem gives an equivalent characterization of stochastic dominance:

Theorem: $F$ is stochastically dominated by $G$ if and only if, for every increasing function $u$ (for which both expectations exist), $$\mathbb{E}_F[u(X)] \leq \mathbb{E}_G[u(X)].$$

I have seen proofs for this result when $F$ and $G$ are absolutely continuous (and thus admit densities) using integration by parts. Is there a more general proof that holds for arbitrary distribution functions/measures on the real line?

To be clear, the integral condition implies the CDF one: taking $u(x) = \mathbb{1}\{x \in (z,\infty)\}$ gives $\mathbb{E}_F[u(X)] = 1 - F(z) \leq 1 - G(z) = \mathbb{E}_G[u(X)]$, i.e. $F(z) \geq G(z)$, for every $z \in \mathbb{R}$. The converse direction doesn't seem immediately obvious.
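
For concreteness, here is a quick numerical sanity check of the claimed equivalence (a sketch of my own; the distributions and the increasing test functions are arbitrary choices):

```python
import numpy as np
from scipy.stats import norm

# F = N(0,1) is stochastically dominated by G = N(1,1), since
# F(x) = Phi(x) >= Phi(x - 1) = G(x) for every x.
F, G = norm(0, 1), norm(1, 1)

x = np.linspace(-10, 10, 100001)
dx = x[1] - x[0]
assert np.all(F.cdf(x) >= G.cdf(x))  # the CDF definition of dominance

# E[u(X)] by Riemann sums against each density, for several increasing u.
for u in (np.tanh, np.arctan, lambda t: t, lambda t: np.minimum(t, 2.0)):
    EF = np.sum(u(x) * F.pdf(x)) * dx
    EG = np.sum(u(x) * G.pdf(x)) * dx
    assert EF <= EG + 1e-9  # E_F[u(X)] <= E_G[u(X)], as the theorem asserts
```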

  • Geometrically this is clear: stochastic dominance implies the graph of $G$ lies to the right of the graph of $F.$ The function $u$ merely distorts the horizontal axis, but does not alter the right-left relation, QED. You can translate this into a formal proof if you like. – whuber Apr 12 '23 at 03:35
  • formally, you always have $$\mathbb{E}_F[u(X)] = \int_{\mathbb{R}} u(x)\,\mathrm dF(x)$$ where the integral is to be understood as a Lebesgue–Stieltjes integral. Using the characterization of the LS integral as a supremum of Daniell integrals (see wiki), which can be computed as Riemann–Stieltjes integrals, I believe the proofs you know for differentiable $F$ and $G$ should extend in a straightforward way (there may be technicalities on the way, but the idea remains the same) – Stratos supports the strike Apr 12 '23 at 09:16

2 Answers


This is not really a theorem about stochastic dominance: it's a property of areas. It comes down to this lemma, which will be applied in the last two paragraphs:

When $f:\mathbb R\to\mathbb R$ is an integrable function with non-zero norm $|f|=\int |f(x)|\,\mathrm dx \lt \infty$ and $\mathcal A$ is a set of positive measure $|\mathcal A| = \int_{\mathcal A}\mathrm dx \gt 0$ on which the values of $f$ all exceed some positive number $\epsilon \gt 0,$ then there exists an increasing (measurable) function $u$ for which the transformed function $f\circ u$ has a positive integral, $$\int_\mathbb{R}f(u(x))\,\mathrm dx \gt 0.$$

The idea is to make the image of $u$ focus on $\mathcal A$ while practically skipping over everything else: the integral is then at least $\epsilon$ (the minimum value of $f$ on $\mathcal A$) times the measure of $\mathcal A$ -- plus any negative contributions elsewhere. By limiting the latter we wind up with a positive integral.

*(Figure: the set $\mathcal A$ is highlighted in orange along the horizontal axis, and the area under $f$ over the region $\mathcal A$ is shaded.)*

One such function $u$ is obtained by inverting the (strictly) increasing function

$$v(y) = \int_0^y \left[\mathcal{I}_\mathcal{A}(x) + \delta\bigl(1-\mathcal{I}_\mathcal{A}(x)\bigr)\right]\mathrm dx$$

for a positive $\delta$ to be determined. (Here $\mathcal I_{\mathcal A}$ is the indicator function of $\mathcal A$; basing the integral at $0$ rather than $-\infty$ keeps it finite.)

*(Figure: the graph of $v$ for $\delta = 0.05$; its slopes are $1$, in orange, and $0.05$, in gray.)*

The Fundamental Theorem of Calculus and the rule for differentiating inverse functions show the inverse $u=v^{-1}$ is (a) differentiable with (b) derivative equal to $1$ on $v(\mathcal A)$ and $1/\delta$ elsewhere. Writing $v(\mathcal A)^\prime$ for the complement of $v(\mathcal A)$ within the image of $v$ (which is $\mathbb R$ itself), use the standard integral inequalities (Hölder's, for instance) and the change of variables formula for integrals to deduce

$$\begin{aligned} \int f(u(x))\,\mathrm dx &= \int_{v(\mathcal A)} f(u(x))\,\mathrm dx + \int_{v(\mathcal A)^\prime} f(u(x))\frac{|u^\prime(x)|}{|u^\prime(x)|}\,\mathrm dx\\ &\ge \int_{v(\mathcal A)} f(u(x))\,\mathrm dx - \left(\sup_{x\in v(\mathcal A)^\prime} \frac{1}{|u^\prime(x)|}\right)\int_{v(\mathcal A)^\prime} |f(u(x))|\,|u^\prime(x)|\,\mathrm dx\\ &\ge |\mathcal A|\epsilon - \delta|f|. \end{aligned}$$

Taking $\delta = |\mathcal A|\epsilon / (2|f|)$ produces a strictly positive value, proving the lemma.

*(Figure: the graph of $f\circ u$.)* This illustration shows how the horizontal axis has been squeezed at all places where $f\lt \epsilon,$ thereby giving the entire integral a positive value. Making $\delta$ sufficiently close to zero will effectively eliminate the dips in the graph below $\epsilon.$
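
The construction is easy to reproduce numerically. Below is a sketch with a toy $f$, set $\mathcal A$, and compact window all of my own choosing, evaluating the integrals by Riemann sums:

```python
import numpy as np

# Toy data: f exceeds eps = 0.5 on A = [1, 2] but is negative on the
# rest of the window [-5, 5], so its plain integral is negative.
in_A = lambda t: (t >= 1.0) & (t <= 2.0)
f = lambda t: np.where(in_A(t), 1.0, -0.2)

t = np.linspace(-5.0, 5.0, 200001)
dt = t[1] - t[0]
eps = 0.5
norm_f = np.sum(np.abs(f(t))) * dt        # |f| on the window, about 2.8
meas_A = np.sum(in_A(t)) * dt             # |A| = 1
delta = meas_A * eps / (2.0 * norm_f)     # the delta from the proof

# v has slope 1 on A and slope delta elsewhere; u = v^{-1} by interpolation.
v = np.cumsum(np.where(in_A(t), 1.0, delta)) * dt
u = lambda s: np.interp(s, v, t)

s = np.linspace(v[0], v[-1], 200001)
print(np.sum(f(t)) * dt)                  # about -0.8: negative as given
print(np.sum(f(u(s))) * (s[1] - s[0]))    # about +0.84: positive after u
```

Squeezing the complement of $\mathcal A$ by the factor $\delta$ shrinks the negative contribution to roughly $0.2 \times 9\delta \approx 0.16,$ well below the contribution of at least $|\mathcal A|\epsilon = 0.5$ coming from $\mathcal A$ itself.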


As a corollary, applying the lemma to $-f$ shows that when there is a set of positive measure on which $f$ has negative values below $-\epsilon \lt 0,$ then there is an increasing function $u$ for which $f\circ u$ has a negative integral.

Consequently, if for every increasing (measurable) function $u$ the integral in the lemma is nonnegative, it follows that the set of places where $f$ takes a negative value has measure zero.

That's the heart of the matter.

Let's pause to notice two things. The first is technical: in this construction of $u,$ the inverse $u^{-1}=v$ is Lipschitz, hence continuous and measurable (and differentiable almost everywhere), allowing us to focus on such "nice" functions.

The second is probabilistic: when $F_X$ is the distribution function of a random variable $X$ -- that is, $F_X(x)=\Pr(X\le x)$ -- and $u$ is an increasing (measurable) function with an increasing (measurable) inverse $u^{-1},$ then the distribution function of $u^{-1}(X)$ is

$$F_{u^{-1}(X)}(y) = \Pr(u^{-1}(X)\le y) = \Pr(X \le u(y)) = F_X(u(y)).$$

That is, $F_{u^{-1}(X)} = F_X\circ u.$
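
For a concrete instance (my example, not part of the original argument): take $u(y) = e^y$ and a positive random variable $X,$ so $u^{-1}(x) = \log x.$ Then

$$F_{\log X}(y) = \Pr(\log X \le y) = \Pr(X \le e^y) = F_X(e^y),$$

the familiar relation between a positive random variable and its logarithm.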

Now observe that when $F$ and $G$ are distinct distribution functions for a random variable $X$ and $u$ is an increasing (measurable) function,

$$E_G[u^{-1}(X)] - E_F[u^{-1}(X)] = \int \left[F(u(x)) - G(u(x))\right]\mathrm dx = \int (F-G)(u(x))\,\mathrm dx.$$

(For the elementary proof see Expectation of a function of a random variable from CDF for instance. It's just an integration by parts.)
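
To see this identity in action, here is a numerical check; the choices $u = \sinh$ (so $u^{-1} = \operatorname{arcsinh}$), $F = N(0,1),$ and $G = N(1,1)$ are mine, for illustration only:

```python
import numpy as np
from scipy.stats import norm

F, G = norm(0, 1), norm(1, 1)  # F is stochastically dominated by G

# Left side: E_G[arcsinh(X)] - E_F[arcsinh(X)], by Riemann sums.
y = np.linspace(-12, 12, 400001)
dy = y[1] - y[0]
lhs = np.sum(np.arcsinh(y) * (G.pdf(y) - F.pdf(y))) * dy

# Right side: the integral of (F - G) evaluated along u = sinh.
x = np.linspace(-4, 4, 400001)  # sinh(+/-4) is about +/-27, covering the mass
dx = x[1] - x[0]
rhs = np.sum(F.cdf(np.sinh(x)) - G.cdf(np.sinh(x))) * dx

print(lhs, rhs)  # agree up to discretization error, and both are >= 0
```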


Proof of the theorem

Applying the corollary to the function $f = F-G$ (which has nonzero norm because $F$ and $G$ are distinct), under the assumption that $f$ has finite norm, shows that when all such integrals are nonnegative, the set on which $F-G$ is negative has measure zero. Because $F-G$ is right-continuous, a point where it was negative would give a whole half-open interval, of positive measure, where it is negative; so $F-G$ is in fact nowhere negative: that is, $G$ stochastically dominates $F,$ QED.


We can eliminate the finite-norm assumption by noting that $F-G$ can have an infinite norm only by diverging at infinity: it cannot have vertical asymptotes. (The values are differences of probabilities, whence they are bounded by $\pm 1.$) Consequently we can approximate $F-G$ on an expanding sequence of compact sets, such as the intervals $(-n,n)$ for $n=1,2,3,\cdots,$ and apply a limiting argument. But that should be viewed as a technicality, because the underlying idea remains the same, as expressed in the lemma.

whuber
  • Can you elaborate why the last inequality "$E_G[u] - E_F[u] = \cdots$" is true? According to the link you provided (and other technical conditions), $E_G[u] = \int_{-\infty}^\infty (1 - G(x))u'(x)dx$. With additional conditions (like $u$ is strictly increasing), it can be written as $\int_a^b [F(u^{-1}(x)) - G(u^{-1}(x))]dx$ (with $a$, $b$ are asymptotes of $u$), which is different from the last equation. – Zhanxiong Apr 13 '23 at 19:30
  • In addition, if the last equation were true, then the OP's question would be immediate (all he asked is to prove $E_G[u] \geq E_F[u]$ under the condition $F \geq G$), so why would we need the lemma to get the job done? – Zhanxiong Apr 13 '23 at 19:36
  • @Zhanxiong I have added some text to elaborate on the argument. I don't understand your second comment, because I don't believe that such a statement was all that was asked. – whuber Apr 16 '23 at 14:12
  • The second comment refers to the equation before your edit: "$E_G[u(X)] - E_F[u(X)] = \int (F - G)(u(x))dx$". The OP wants to prove $E_G[u(X)] \geq E_F[u(X)]$ if $F \geq G$. So if the pre-editing formula held, then $E_G[u(X)] \geq E_F[u(X)]$ would hold trivially (because the integrand is nonnegative). – Zhanxiong Apr 16 '23 at 14:38
  • @Zhanxiong That's the other direction of the implication. – whuber Apr 16 '23 at 15:30

The integration by parts formula still holds for general distribution functions (under appropriate technical conditions). For example, Theorem 18.4 in Probability and Measure by Patrick Billingsley (do not confuse $F, G$ in this theorem with $F, G$ in your question):

Let $F$ and $G$ be two nondecreasing, right-continuous functions on an interval $[a, b]$. If $F$ and $G$ have no common points of discontinuity in $(a, b]$, then \begin{align} \int_{(a, b]}G(x)dF(x) = F(b)G(b) - F(a)G(a) - \int_{(a, b]}F(x)dG(x). \tag{1} \end{align}
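
Before proceeding, here is a quick numerical sanity check of $(1)$ in the absolutely continuous case (a sketch with distributions of my own choosing, where $dF = f\,dx$ and $dG = g\,dx$):

```python
import numpy as np
from scipy.stats import expon, norm

# F = standard normal CDF, G = Exp(1) CDF: both continuous, hence no
# common discontinuities; dF = f dx and dG = g dx with densities f, g.
F, f = norm(0, 1).cdf, norm(0, 1).pdf
G, g = expon(scale=1.0).cdf, expon(scale=1.0).pdf

a, b = -3.0, 3.0
x = np.linspace(a, b, 1000001)
dx = x[1] - x[0]

lhs = np.sum(G(x) * f(x)) * dx                              # int G dF
rhs = F(b) * G(b) - F(a) * G(a) - np.sum(F(x) * g(x)) * dx  # by parts
print(abs(lhs - rhs))  # near 0, up to discretization error
```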

Equation $(1)$ is a good start for proving the result of your interest -- if we can extend the integration region $(a, b]$ to $\mathbb{R}$. To this end, we need to impose integrability conditions on $u$ (of course, for $(1)$ to hold, we also need to assume that $u$ and $F$/$G$ have no common discontinuities in $\mathbb{R}$ and that $u$ is right-continuous): \begin{align} \int_{\mathbb{R}}|u(x)|dF(x) < \infty, \; \int_{\mathbb{R}}|u(x)|dG(x) < \infty. \tag{2} \end{align}

By $(1)$, for every $n$: \begin{align} & \int_{(-n, n]}u(x)dF(x) = F(n)u(n) - F(-n)u(-n) - \int_{(-n, n]}F(x)du(x). \tag{3} \\ & \int_{(-n, n]}u(x)dG(x) = G(n)u(n) - G(-n)u(-n) - \int_{(-n, n]}G(x)du(x). \tag{4} \end{align} It then follows by $(3), (4)$ and $F(x) \geq G(x)$ for all $x \in \mathbb{R}$ that \begin{align} & \int_{(-n, n]}u(x)dF(x) - \int_{(-n, n]}u(x)dG(x) \\ =& u(n)(F(n) - G(n)) - u(-n)(F(-n) - G(-n)) - \int_{(-n, n]}[F(x) - G(x)]du(x) \\ \leq & u(n)(F(n) - G(n)) - u(-n)(F(-n) - G(-n)). \tag{5} \end{align}

If $u$ is non-negative, then the right-hand side of $(5)$ is bounded by $u(n)(F(n) - G(n))$, which can be rewritten as $u(n)(1 - G(n)) - u(n)(1 - F(n))$ and converges to $0$ as $n \to \infty$ by condition $(2)$ and the monotonicity of $u$. Similarly, if $u$ is non-positive, then the right-hand side of $(5)$ is bounded by $-u(-n)(F(-n) - G(-n))$, which also converges to $0$ as $n \to \infty$ (see the next paragraph for a more detailed derivation).

If $u$ takes both negative and positive values, it follows by the monotonicity of $u$ that for sufficiently large $N$, $u(N) > 0$ and $u(-N) < 0$, whence for all $n > N$, again by monotonicity of $u$: \begin{align} 0 \leq u(n)(1 - F(n)) \leq \int_n^\infty u(x)dF(x), \; \int_{-\infty}^{-n}u(x)dF(x) \leq u(-n)F(-n) \leq 0. \tag{6} \end{align} By condition $(2)$ and Lebesgue's dominated convergence theorem (DCT), $(6)$ implies that $u(n)(1 - F(n)) \to 0$ and $u(-n)F(-n) \to 0$ as $n \to \infty$. Similarly, $u(n)(1 - G(n)) \to 0$ and $u(-n)G(-n) \to 0$ as $n \to \infty$. Therefore, the right-hand side of $(5)$ always converges to $0$ as $n \to \infty$ for $u$ that is nondecreasing and integrable.

Now the result follows by passing $n \to \infty$ on both sides of $(5)$ (note that condition $(2)$ and DCT imply the left-hand side of $(5)$ converges to $E_F[u] - E_G[u]$).

Zhanxiong
  • Wow, I can't believe I missed a remarkably simple theorem on integration by parts with Stieltjes integrals (a very classic application of Fubini). I haven't checked the details of your answer, but it looks roughly correct and in line with how the proof in the absolutely continuous case goes. Can we assume that $u$ is right-continuous without any loss of generality (although I think this shouldn't really impede the existence of the Stieltjes integral with respect to $u$)? – Yashaswi Mohanty Apr 13 '23 at 20:34
  • @YashaswiMohanty Yes. Since $u$ is nondecreasing, it is natural (a convention in probability, in fact) to treat it as right-continuous: you are free to modify a monotone function at its countably many jump points to make it right-continuous. – Zhanxiong Apr 13 '23 at 21:19
  • I'm missing how the DCT implies that the LHS of $(5)$ goes to the expression we need. Don't we need $\lvert u \mathbb{1}_{(-n, n]} \rvert \leq |u|$ for that? – Yashaswi Mohanty Apr 13 '23 at 21:19
  • @YashaswiMohanty Exactly, and $|u \mathbb{1}_{(-n, n]}| \leq |u|$ is trivial :). – Zhanxiong Apr 13 '23 at 21:20