If we have a sample $X_1, X_2, \ldots, X_n \sim F$, then $\sup_x|F_n(x) - F(x)|\xrightarrow{\text{a.s.}/p}0$.

Now, if I can come up with a theoretical cdf $F$ such that $\sup_x|F_n(x) - F(x)|\xrightarrow{\text{a.s.}/p}0$, am I allowed to claim that $X_1, X_2, \ldots, X_n \sim F$?

If not, could you give me a counterexample?

Werther

1 Answer

It depends on what you mean. If you mean that $X_1,\dots,X_n$ are iid from some distribution $G$, and it turns out that $\sup_x |F_n(x)-F(x)|\stackrel{p}{\to}0$, then yes, you can conclude that $F=G$.

Proof: The Glivenko-Cantelli theorem says $\sup_x |F_n(x)-G(x)|\to 0$ almost surely, and hence in probability, so by the triangle inequality $\sup_x |F(x)-G(x)| \le \sup_x |F_n(x)-F(x)| + \sup_x |F_n(x)-G(x)| \stackrel{p}{\to}0$. The left-hand side is a fixed number that doesn't depend on $n$, so it must be zero, and $F=G$ everywhere.
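
To see the iid case numerically, here is a minimal simulation sketch using only the Python standard library. The choices $G=N(0,1)$ for the true distribution and $F=N(0.5,1)$ for a wrong candidate are assumptions made for illustration, as are the helper names `normal_cdf` and `sup_distance`; none of them come from the question.

```python
import math
import random


def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of N(mean, sd^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - cdf(x)| for a continuous cdf.

    F_n is a step function, so it is enough to check the jump points:
    compare cdf(x_(i)) with F_n just after (i/n) and just before ((i-1)/n).
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


rng = random.Random(0)
results = {}
for n in (100, 5_000, 50_000):
    # iid draws from the assumed true distribution G = N(0, 1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    d_true = sup_distance(sample, normal_cdf)
    # assumed wrong candidate F = N(0.5, 1)
    d_wrong = sup_distance(sample, lambda x: normal_cdf(x, mean=0.5))
    results[n] = (d_true, d_wrong)
    print(f"n={n:>6}  sup|F_n-G|={d_true:.4f}  sup|F_n-F|={d_wrong:.4f}")
```

The first distance should shrink as $n$ grows, while the second should settle near $\sup_x|F(x)-G(x)| = 2\Phi(0.25)-1 \approx 0.197$, so a candidate $F\neq G$ cannot satisfy the convergence.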

On the other hand, it's quite possible to have a non-iid random mechanism that generates $X_i$ such that $\sup_x |F_n(x)-F(x)|\stackrel{p}{\to}0$.

For example, let $M$ be a non-degenerate real-valued random variable with $E[M]=0$, and let $X_i = M+\epsilon_i$, where the $\epsilon_i$ are iid mean-zero random variables independent of $M$. Conditional on $M=m$, the $X_i$ are iid with the same CDF as $\epsilon_i+m$, and the Glivenko-Cantelli theorem says that $F_n$ converges uniformly to the CDF of $\epsilon_i+m$. This isn't the CDF of any individual $X_i$ (which would be that of $\epsilon_i+M$). That's the set-up of de Finetti's theorem and the Hewitt-Savage 0-1 law: an exchangeable sequence of random variables looks iid, but the limit is random rather than fixed.
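
A small simulation sketch of this exchangeable construction, assuming (purely for illustration) $M\sim N(0,1)$ and $\epsilon_i\sim N(0,1)$, so that the marginal distribution of each $X_i$ is $N(0,2)$. The helpers `normal_cdf` and `sup_distance` are hypothetical names introduced here.

```python
import math
import random


def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of N(mean, sd^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - cdf(x)| for a continuous cdf, checked at the jumps of F_n."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


rng = random.Random(1)
n = 20_000
distances = []
for rep in range(5):
    m = rng.gauss(0.0, 1.0)  # one shared shift M per replication (assumed N(0, 1))
    sample = [m + rng.gauss(0.0, 1.0) for _ in range(n)]
    # distance to the conditional CDF of eps + m, i.e. N(m, 1)
    d_cond = sup_distance(sample, lambda x: normal_cdf(x, mean=m))
    # distance to the marginal CDF of X_i = M + eps, i.e. N(0, 2)
    d_marg = sup_distance(sample, lambda x: normal_cdf(x, sd=math.sqrt(2.0)))
    distances.append((m, d_cond, d_marg))
    print(f"m={m:+.3f}  sup|F_n - cond|={d_cond:.4f}  sup|F_n - marg|={d_marg:.4f}")
```

Each replication draws a new $m$, so $F_n$ settles on a different limit every time: close to the $N(m,1)$ CDF, and never close to the marginal $N(0,2)$ CDF.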

There are other ways this can go wrong. Suppose the $X_i$ are identically distributed with CDF $F$ but not independent. It's possible (under suitable conditions limiting long-range dependence) that the empirical CDF $F_n$ converges uniformly to $F$. It would still not be true that the $X_i$ are iid from $F$.
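
As one concrete instance, a stationary Gaussian AR(1) process has $N(0,1)$ marginals but dependent terms. The sketch below assumes that model with coefficient `rho = 0.5` (my choice for illustration, not something from the question) and checks both the distance to the marginal CDF and the lag-1 sample correlation.

```python
import math
import random


def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of N(mean, sd^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - cdf(x)| for a continuous cdf, checked at the jumps of F_n."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


rng = random.Random(2)
n = 50_000
rho = 0.5  # assumed AR(1) coefficient
innov_sd = math.sqrt(1.0 - rho**2)  # keeps the marginal variance equal to 1

# Start in the stationary distribution so every X_i is exactly N(0, 1).
x = rng.gauss(0.0, 1.0)
sample = [x]
for _ in range(n - 1):
    x = rho * x + innov_sd * rng.gauss(0.0, 1.0)
    sample.append(x)

d_marginal = sup_distance(sample, normal_cdf)

# Lag-1 sample autocorrelation, to confirm the draws are not independent.
mean = sum(sample) / n
num = sum((a - mean) * (b - mean) for a, b in zip(sample, sample[1:]))
den = sum((a - mean) ** 2 for a in sample)
lag1_corr = num / den

print(f"sup|F_n - Phi| = {d_marginal:.4f}, lag-1 correlation = {lag1_corr:.3f}")
```

The empirical CDF ends up close to $\Phi$ even though consecutive terms are clearly correlated, so uniform convergence of $F_n$ to $F$ says nothing about independence.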

Or, the distribution of $X_i$ could vary periodically. As $n\to\infty$, the empirical CDF $F_n$ would still converge uniformly to the CDF of the mixture of the $X_i$ over a period. For example, if the $X_i$ are independent with $X_i\sim N(1,1)$ for even $i$ and $N(0,1)$ for odd $i$, $F_n$ will converge to the CDF of a 50:50 mixture of $N(0,1)$ and $N(1,1)$.
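
The alternating example is easy to check by simulation. This is a rough sketch that assumes the draws are independent; `normal_cdf`, `sup_distance` and `mixture_cdf` are illustrative helper names.

```python
import math
import random


def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of N(mean, sd^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - cdf(x)| for a continuous cdf, checked at the jumps of F_n."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


def mixture_cdf(x):
    """CDF of the 50:50 mixture of N(0, 1) and N(1, 1)."""
    return 0.5 * normal_cdf(x, mean=0.0) + 0.5 * normal_cdf(x, mean=1.0)


rng = random.Random(3)
n = 50_000
# Index i runs from 1 to n: N(1, 1) for even i, N(0, 1) for odd i.
sample = [rng.gauss(1.0 if i % 2 == 0 else 0.0, 1.0) for i in range(1, n + 1)]

d_mixture = sup_distance(sample, mixture_cdf)
d_odd = sup_distance(sample, lambda x: normal_cdf(x, mean=0.0))
d_even = sup_distance(sample, lambda x: normal_cdf(x, mean=1.0))

print(f"sup|F_n - mixture| = {d_mixture:.4f}")
print(f"sup|F_n - N(0,1)|  = {d_odd:.4f}")
print(f"sup|F_n - N(1,1)|  = {d_even:.4f}")
```

The distance to the mixture CDF should be small, while the distances to either component stay near $\tfrac12\,(2\Phi(0.5)-1)\approx 0.19$: $F_n$ converges, but to the CDF of neither kind of $X_i$.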

Thomas Lumley
  • Thanks for the answer! Would it be too much to ask you to explain how the triangle inequality allows us to claim that $\sup_x|F(x) - G(x)| \xrightarrow{p} 0$? – Werther Aug 30 '20 at 01:02
  • The supremum norm is a norm, so $\sup_x|F_n-F|+\sup_x|F_n-G|\ge\sup_x |F-G|$, and the two terms on the left go to zero (the first by assumption, the second by Glivenko-Cantelli) – Thomas Lumley Aug 30 '20 at 02:23