
For a fixed argument $x$, the cumulative distribution function $F(x)=P(X\leq x)$ is a fixed probability. I wonder what its variance is if the argument is instead a random variable following the same distribution as $X$, i.e. what $Var(F(X))$ is.

For continuous $X$, this variance can be readily computed because density and CDF are related through $f(x)=F'(x)$, and because $F(-\infty)=0$ and $F(\infty)=1$: $$ Var(F(X)) = E(F(X)^2) - E(F(X))^2 = \int_{-\infty}^\infty F(x)^2f(x) dx - \left(\int_{-\infty}^\infty F(x) f(x) dx\right)^2 $$ $$= \frac{1}{3}F(x)^3\Big|_{-\infty}^{\infty} - \left( \frac{1}{2}F(x)^2\Big|_{-\infty}^{\infty} \right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$
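As a quick sanity check of the value $\frac{1}{12}$, here is a minimal Monte Carlo sketch using only the Python standard library. The helper name `variance_of_cdf_transform`, the sample size, the seed, and the two test distributions (exponential and standard normal) are illustrative choices, not part of the question.

```python
import math
import random


def variance_of_cdf_transform(sampler, cdf, n=100_000, seed=0):
    """Monte Carlo estimate of Var(F(X)) from n draws of X.

    sampler: function taking a random.Random instance and returning one draw of X
    cdf:     the CDF F of X
    """
    rng = random.Random(seed)
    u = [cdf(sampler(rng)) for _ in range(n)]
    mean = math.fsum(u) / n
    return math.fsum((v - mean) ** 2 for v in u) / n


# Exponential(1): F(x) = 1 - exp(-x)
est_exp = variance_of_cdf_transform(
    lambda rng: rng.expovariate(1.0),
    lambda x: 1.0 - math.exp(-x),
)

# Standard normal: F(x) = (1 + erf(x / sqrt(2))) / 2
est_norm = variance_of_cdf_transform(
    lambda rng: rng.gauss(0.0, 1.0),
    lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))),
)

print(est_exp, est_norm, 1 / 12)
```

Both estimates should land close to $1/12 \approx 0.0833$, whichever continuous distribution is plugged in.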

This does not hold, though, for discrete random variables $X$, and I wonder whether there is a similarly simple expression for the variance in that case. With the values of $X$ ordered increasingly as $x_1 < x_2 < \dots$, we have $$E(F(X)^2) = \sum_{x\in X(\Omega)} F(x)^2 P(X=x) = \sum_{i=1}^{|X(\Omega)|} F(x_i)^2 \Big(F(x_i)-F(x_{i-1})\Big)$$ where $x_0$ is set to $-\infty$, so that $F(x_0)=0$. This looks like a candidate for Abel's summation by parts, but I could not see a way to simplify the expression.
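To see how far the discrete case departs from $\frac{1}{12}$, the sums above can be evaluated exactly for small examples. The following is a minimal sketch with exact rational arithmetic. The function name `var_of_cdf` and the two example distributions (a fair coin and a fair die) are assumptions made for illustration.

```python
from fractions import Fraction


def var_of_cdf(pmf):
    """Exact Var(F(X)) for a finite pmf given as {value: probability}."""
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    cum = Fraction(0)  # running value of F(x_i)
    m1 = Fraction(0)   # accumulates E(F(X))
    m2 = Fraction(0)   # accumulates E(F(X)^2)
    for x in sorted(pmf):  # values in increasing order x_1 < x_2 < ...
        p = Fraction(pmf[x])
        cum += p
        m1 += cum * p
        m2 += cum ** 2 * p
    return m2 - m1 ** 2


bernoulli = {0: Fraction(1, 2), 1: Fraction(1, 2)}
die = {k: Fraction(1, 6) for k in range(1, 7)}

print(var_of_cdf(bernoulli))  # 1/16
print(var_of_cdf(die))        # 35/432
```

In both examples the variance is smaller than $\frac{1}{12}$, and the die, with more and smaller jumps, is closer to it than the coin.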

Does anyone know of other ways to compute $Var(F(X))$, or is there an interpretation of this variance that allows for a simple computation?

cdalitz
  • If $X$ is continuous then $F(X)$ has a standard uniform distribution with variance $\frac1{12}$ – kjetil b halvorsen Sep 01 '23 at 11:58
  • @kjetil-b-halvorsen Ok, this is another way to obtain the result for continuous $X$, and it also gives a nice meaning to $F(X)$. Is anything similar known about $F(X)$ for discrete $X$? – cdalitz Sep 01 '23 at 12:07
  • The correction terms can be expressed as sums of the $f(x_i)^2,$ $f(x_i)^2F(x_i),$ and $f(x_i)^3.$ (Notice these are all zero for continuous distributions.) – whuber Sep 01 '23 at 14:50
  • @jbowman I do not see how treating it as a Bernoulli distribution helps. If I understand you correctly, you suggest to exchange the sums $\sum_x$ and $\sum_{y\leq x}$ (this latter sum occurs in $F(x)$), which leads to $\sum_x P(X\leq x)^2 P(X=x)=\sum_x P(X=x)^2 P(X\geq x)$. Is this easier to compute? – cdalitz Sep 01 '23 at 14:59
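
One possible way to make the correction terms mentioned in the comments explicit is sketched below. It is not taken from the thread. It assumes the values of $X$ can be ordered increasingly as $x_1 < x_2 < \dots$, and the shorthand $p_i = P(X=x_i)$ and $F_i = F(x_i)$ is introduced here for brevity. Since $F_{i-1} = F_i - p_i$, $$F_i^2 - F_{i-1}^2 = p_i\left(2F_i - p_i\right), \qquad F_i^3 - F_{i-1}^3 = p_i\left(3F_i^2 - 3F_i p_i + p_i^2\right).$$ Summing over $i$, the left-hand sides telescope to $1$, which gives $$E(F(X)) = \frac{1}{2} + \frac{1}{2}\sum_i p_i^2, \qquad E(F(X)^2) = \frac{1}{3} + \sum_i p_i^2 F_i - \frac{1}{3}\sum_i p_i^3,$$ and therefore $$Var(F(X)) = \frac{1}{12} + \sum_i p_i^2 F_i - \frac{1}{3}\sum_i p_i^3 - \frac{1}{2}\sum_i p_i^2 - \frac{1}{4}\Big(\sum_i p_i^2\Big)^2.$$ For a fair coin this evaluates to $\frac{1}{12} + \frac{3}{8} - \frac{1}{12} - \frac{1}{4} - \frac{1}{16} = \frac{1}{16}$, which agrees with direct evaluation of the sums.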

0 Answers