
Question

While reading a Wikipedia article on Markov's Inequality, I came across the statement

$$E(X) = P(X<a)E(X|X<a) + P(X \geq a)E(X|X\geq a)$$

In the context of Markov's inequality, we are assuming $X$ is a non-negative r.v., but I don't think that's necessary for the statement to be true.

How do I prove this? I think I should use iterated expectation $E(E(X\mid ??))$, but I'm not exactly sure how. I'm used to things like $E(E(X|Y=y))$, not inequalities.

If it's relevant, I do not know measure theory yet.

Solution

The answer below is totally sufficient, but for a beginner like me it was useful to dive into the details. In case those details help anybody else, here they are:

We want to prove

$$E(X) = P(X<a)E(X | X<a) + P(X\geq a)E(X|X\geq a)$$

An answer below shows that

\begin{align} E[X] &= \int_{-\infty}^\infty xf(x)dx \\ &= \int_{-\infty}^a xf(x)dx +\int_a^\infty xf(x)dx \\ &= \Pr(X<a)\int_{-\infty}^a \frac{xf(x)}{\Pr(X<a)}dx +\Pr(X\ge a)\int_a^\infty \frac{xf(x)}{\Pr(X\ge a)}dx \\ &= \Pr(X<a)\int_{-\infty}^\infty xf(x\mid X<a)dx +\Pr(X\ge a)\int_{-\infty}^\infty xf(x\mid X\ge a)dx \\ &= \Pr(X<a)E[X\mid X<a] +\Pr(X\ge a)E[X\mid X\ge a] \end{align}

where $f(x)/\Pr(X<a)$ is the conditional pdf $f(x \mid X<a)$. The purpose of this extended answer is to investigate how this conditional density works.
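As a sanity check, the decomposition can be verified by simulation. Here is a minimal sketch using a standard normal $X$ (which takes negative values, supporting the point that non-negativity of $X$ is not needed) and an arbitrary choice $a=0.5$; both are illustrative values, not from the original question:

```python
import numpy as np

# Monte Carlo check of E[X] = P(X<a) E[X|X<a] + P(X>=a) E[X|X>=a]
# using a standard normal X (which can be negative, so non-negativity
# of X is indeed not required for the identity).
rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
a = 0.5

below = x < a
p_below = below.mean()        # estimates P(X < a)
p_above = 1 - p_below         # estimates P(X >= a)
e_below = x[below].mean()     # estimates E[X | X < a]
e_above = x[~below].mean()    # estimates E[X | X >= a]

lhs = x.mean()
rhs = p_below * e_below + p_above * e_above
print(lhs, rhs)  # the two agree up to floating-point error
```

On the sample itself the identity is exact, since splitting the sum over $x_i < a$ and $x_i \geq a$ is just the second line of the derivation above in discrete form.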

Typically we denote the conditional pdf for random variables $X,Y$ as

$$f_{X|Y}(x|y) = f_{X,Y}(x,y)/f_Y(y)$$

But what is the random variable $Y$ in this case? For convenience, denote $A^-=(-\infty,a)$ and $A^+=[a,\infty)$, then we define $L(X) = I_{A^-}(X)$ where $I_{A^-}$ is the indicator function on set $A^-$.

\begin{equation} L(x) = I_{A^-}(x) = \begin{cases} 1 & \text{if $x\in (-\infty,a)$}\\ 0 & \text{if $x\in [a,\infty)$} \end{cases} \end{equation} Then

$$f_{X|L}(x|l) = \frac{f_{X,L}(x,l)}{f_L(l)}$$

where

\begin{equation} f_L(l) = P(L(X)=l) = \begin{cases} P(X\in A^-) & \text{if $l=1$}\\ P(X\in A^+) & \text{if $l=0$}\\ 0 & \text{otherwise} \end{cases} \end{equation}
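In other words, $L$ is a Bernoulli random variable with success probability $P(X<a)$. A quick empirical check of its pmf, again with the illustrative choices $X \sim N(0,1)$ and $a=0.5$:

```python
import numpy as np
from scipy import stats

# L = I_{A^-}(X) is a Bernoulli random variable: its pmf is
# f_L(1) = P(X in A^-) = P(X < a) and f_L(0) = P(X >= a).
rng = np.random.default_rng(1)
x = rng.standard_normal(500_000)
a = 0.5

L = (x < a).astype(int)   # realizations of the indicator r.v.
print(L.mean())           # empirical f_L(1)
print(stats.norm.cdf(a))  # exact P(X < a) ~ 0.6915
```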

and $f_{X,L}(x,l)$ is defined such that

$$\Pr(X\in U,L\in V) = \sum_{l\in V}\int_{x\in U}f_{X,L}(x,l)dx $$

Note we must have

$$f_{X,L}(x,1) = 0 ~\text{ for }~ x\in A^+$$

since by definition of the indicator function

$$P(X\in U, L=1) = 0 \quad\text{for any } U\subseteq A^+$$

Also, by definition of the marginal pdf $f_L(l)$, we know that

$$\int_{x\in\mathbf{R}}f_{X,L}(x,1)dx = f_L(1) = P(X\in A^-) = P(X<a) = \int_{x<a}f(x)dx$$

Since the same identity holds with $\mathbf{R}$ replaced by any $U\subseteq A^-$ (there, $\Pr(X\in U, L=1)=\Pr(X\in U)=\int_U f(x)dx$), this forces $f_{X,L}(x,1)=f(x)$ for $x<a$.

Similarly

$$f_{X,L}(x,0) = 0 ~\text{ for }~ x\in A^-$$ since by definition

$$P(X\in U, L=0) = 0 \quad\text{for any } U\subseteq A^-$$

Again, by definition of the marginal pdf, we know

$$\int_{x\in\mathbf{R}}f_{X,L}(x,0)dx = f_L(0) = P(X\in A^+) = P(X\geq a) = \int_{x\geq a}f(x)dx$$

which implies $f_{X,L}(x,0) = f(x)$ for $x\geq a$.

Combining all this information, we see that

\begin{equation} f_{X,L}(x,l) = \begin{cases} f(x)I_{(-\infty,a)}(x) & \text{if $l=1$}\\ f(x)I_{[a,\infty)}(x) & \text{if $l=0$}\\ 0 & \text{otherwise} \end{cases} \end{equation}
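We can confirm numerically that this piecewise formula really behaves like a joint pmf/pdf: integrating over $x$ for each $l$ recovers $f_L(l)$, and the two pieces together carry total mass 1. A sketch with the same illustrative choices $X \sim N(0,1)$, $a=0.5$:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# f_{X,L}(x, 1) = f(x) on (-inf, a) and zero elsewhere; f_{X,L}(x, 0)
# = f(x) on [a, inf) and zero elsewhere.  Check the marginals of L.
a = 0.5
f = stats.norm.pdf

mass_l1, _ = quad(f, -np.inf, a)  # should equal f_L(1) = P(X < a)
mass_l0, _ = quad(f, a, np.inf)   # should equal f_L(0) = P(X >= a)

print(mass_l1, stats.norm.cdf(a))  # both ~ 0.6915
print(mass_l0 + mass_l1)           # ~ 1.0
```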

Now we can give an explicit expression for $f_{X|L}(x|l)$:

\begin{equation} f_{X|L}(x|l) = \frac{f_{X,L}(x,l)}{f_L(l)} = \begin{cases} & \frac{f(x)I_{(-\infty,a)}(x)}{P(X<a)} ~~\text{if $l=1$}\\ & \frac{f(x)I_{[a,\infty)}(x)}{P(X\geq a)} ~~\text{if $l=0$} \end{cases} \end{equation}
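This conditional pdf can be checked directly: it should integrate to 1 over its support, and integrating $x$ against it should give $E[X \mid X<a]$. For a standard normal and $a=0.5$ (illustrative values), the conditional mean has the known closed form $-\varphi(a)/\Phi(a)$, so we can compare against that:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# f_{X|L}(x | l=1) = f(x) / P(X < a) on (-inf, a), zero elsewhere.
a = 0.5
f = stats.norm.pdf
p_below = stats.norm.cdf(a)         # P(X < a)

cond_pdf = lambda x: f(x) / p_below  # the conditional pdf on (-inf, a)

total, _ = quad(cond_pdf, -np.inf, a)
e_cond, _ = quad(lambda x: x * cond_pdf(x), -np.inf, a)

print(total)   # ~ 1.0: the conditional pdf is normalized
print(e_cond)  # ~ -phi(a)/Phi(a), the mean of the truncated normal
```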

Essentially, the condition $X<a$ is represented with an indicator function $(X<a)(x) = I_{(-\infty,a)}(x)$ and the new, conditioned random variable $X|X<a$ is given by $(X|X<a) = X I_{(-\infty,a)}(X)$. Using this notation, we could write

$$X|L \sim f_{X|L}(x|l) = f_{X|X<a}(x | (X<a)(x)) \sim (X|X<a)$$

Now we can make sense of the proof given at the beginning.

\begin{align} \int_{-\infty}^a xf(x)dx &= \Pr(X<a)\int_{-\infty}^a x\frac{f(x)}{\Pr(X<a)}dx\\ &= \Pr(X<a)\int_{-\infty}^a xI_{A^-}(x)\frac{f(x)I_{A^-}(x)}{\Pr(X<a)}dx\\ &= \Pr(X<a)\int_{-\infty}^a (x|x<a) f_{X|X<a}(x|1)dx\\ &= \Pr(X<a)E(X|X<a) \end{align}

and similarly

\begin{align} \int_a^\infty xf(x)dx &= \Pr(X\geq a)\int_a^\infty x\frac{f(x)}{\Pr(X\geq a)}dx\\ &= \Pr(X\geq a)\int_a^\infty xI_{A^+}(x)\frac{f(x)I_{A^+}(x)}{\Pr(X\geq a)}dx\\ &= \Pr(X\geq a)\int_a^\infty (x | x\geq a)f_{X|X\geq a}(x|1)dx \\ &= \Pr(X\geq a)E(X|X\geq a) \end{align}
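Both tail identities can be confirmed numerically, comparing each raw integral $\int x f(x)\,dx$ against $\Pr(\text{tail}) \cdot E[X\mid\text{tail}]$, with the conditional means taken independently from scipy's truncated normal. A sketch with the illustrative choices $X \sim N(0,1)$, $a=0.5$:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

a = 0.5
f = stats.norm.pdf
p_lo, p_hi = stats.norm.cdf(a), stats.norm.sf(a)

# Left-hand sides: the raw tail integrals of x f(x).
lower, _ = quad(lambda x: x * f(x), -np.inf, a)
upper, _ = quad(lambda x: x * f(x), a, np.inf)

# Conditional means from scipy's truncated normal (computed
# independently of the integrals above).
e_lo = stats.truncnorm.mean(-np.inf, a)  # E[X | X < a]
e_hi = stats.truncnorm.mean(a, np.inf)   # E[X | X >= a]

print(lower, p_lo * e_lo)  # the lower-piece values agree
print(upper, p_hi * e_hi)  # the upper-piece values agree
```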

Answer


\begin{align} E[X] &= \int_{-\infty}^\infty xf(x)dx \\ &= \int_{-\infty}^a xf(x)dx +\int_a^\infty xf(x)dx \\ &= \Pr(X<a)\int_{-\infty}^a \frac{xf(x)}{\Pr(X<a)}dx +\Pr(X\ge a)\int_a^\infty \frac{xf(x)}{\Pr(X\ge a)}dx \\ &= \Pr(X<a)\int_{-\infty}^\infty xf(x\mid X<a)dx +\Pr(X\ge a)\int_{-\infty}^\infty xf(x\mid X\ge a)dx \\ &= \Pr(X<a)E[X\mid X<a] +\Pr(X\ge a)E[X\mid X\ge a] \end{align}

  • I'm used to writing $f_{X|Y}(x|y) = f_{X,Y}(x,y)/f_Y(y)$ where $Y$ is a r.v., but how do we interpret $\text{Pr}(X<a)$ as the pdf/pmf of a r.v.? Can we think of the indicator function $I_{(-\infty,a)}(x)$ as the discrete r.v. we are conditioning on? – EssentialAnonymity Nov 18 '20 at 21:49
  • So $f_{X|I_a}(x | I_a=1)=f_{X|X<a}(x|x<a) = f(x)/f_{I_a}(I_a(x)=1) = f(x)/\text{Pr}(X<a)$? This is my first time seeing something like this, I'm trying to make sure I understand it! – EssentialAnonymity Nov 18 '20 at 22:00
  • Yeah, that's the idea. Notice also that the conditioning changes the support; i.e., $f_{X\mid X<a}$ is zero for $X \ge a$. – abstrusiosity Nov 18 '20 at 22:24
  • As an aside, in a measure theory setting you could have a nicer proof starting from the fact $\Pr(A)=\Pr(A \cap (B \cup B^C)) = \Pr(A\cap B)+\Pr(A\cap B^C)=\Pr(A\mid B)\Pr(B)+\Pr(A\mid B^C)\Pr(B^C)$. – abstrusiosity Nov 18 '20 at 22:28
  • Hi again, I edited my answer to really delve into the details of conditioning the random variable $X$ on the condition or indicator function $I_{(-\infty,a)}(x)$. I'm sure you're busy -- but if by chance you get a moment to peruse my edit, I would love to hear your opinion on whether it makes any sense. Thank you again for the nice proof you provided – EssentialAnonymity Nov 20 '20 at 01:24
  • I looked it over quickly and I'd say you're working through the details correctly. One other aside is that a fundamental connection between probability and expectation is that for an indicator function $I_A$, $E[I_A]=\Pr(A)$. – abstrusiosity Nov 20 '20 at 15:44