For an iid sample $X_1, \dots, X_n$, is the rank statistic distribution-free? That is, for each $k$ such that $1 \leq k \leq n$, is the distribution of the rank of $X_k$ independent of the common distribution of the $X_i$?

I think yes. Suppose $X_i$ has a cdf $F$ which admits an inverse $F^{-1}$; then $F(X_1), \dots, F(X_n)$ will still be iid, but with the uniform distribution over $[0,1]$. Since $F$ is invertible, both $F$ and $F^{-1}$ are strictly increasing, so the rank of $X_k$ and the rank of $F(X_k)$ will be the same. So we can assume the distribution of $X_i$ to be the uniform distribution over $[0,1]$, and thus the distribution of the rank of $X_k$ doesn't depend on the distribution of $X_i$.
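A quick way to sanity-check this argument is a small simulation. The sketch below is only an illustration, not a proof: the helper names `rank_of` and `rank_frequencies`, the sample size `n=5`, and the choice of exponential and normal distributions are all assumptions made for the example. If the claim holds, the rank of $X_k$ should be roughly uniform on $\{1, \dots, n\}$ for both distributions, and applying the strictly increasing cdf of the exponential distribution should leave every rank unchanged.

```python
import math
import random
from collections import Counter

def rank_of(sample, k):
    """Rank (1 = smallest) of sample[k] within sample, assuming no ties."""
    return 1 + sum(x < sample[k] for x in sample)

def rank_frequencies(draw, n=5, k=2, trials=20000, seed=0):
    """Empirical distribution of the rank of X_k over many iid samples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        counts[rank_of(sample, k)] += 1
    return {r: counts[r] / trials for r in range(1, n + 1)}

# Two different continuous distributions (hypothetical choices).
expo = rank_frequencies(lambda rng: rng.expovariate(1.0))
norm = rank_frequencies(lambda rng: rng.gauss(0.0, 1.0))
print(expo)  # each frequency should be close to 1/5
print(norm)  # likewise

# A strictly increasing map (here the Exp(1) cdf) preserves ranks.
rng = random.Random(1)
xs = [rng.expovariate(1.0) for _ in range(5)]
us = [1.0 - math.exp(-x) for x in xs]
same_ranks = ([rank_of(xs, i) for i in range(5)]
              == [rank_of(us, i) for i in range(5)])
print(same_ranks)
```

Both frequency tables should sit near $1/n$ in every cell, which is what the symmetry of an iid sample without ties suggests.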

If $F$ is not invertible, the ties might change when going from the original sample to the transformed one. In that case, can we no longer say that the distribution of the rank statistic doesn't depend on the distribution of $X_i$?
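To see concretely how ties can break the property, here is a minimal sketch that computes the exact rank distribution for Bernoulli($p$) data by enumeration. The tie-breaking rule (rank = 1 + number of strictly smaller values), the helper names, and the choice $n = 3$ are assumptions for illustration; other tie conventions would give different numbers but the same qualitative picture.

```python
from itertools import product

def min_rank(sample, k):
    """Rank of sample[k] where tied values share the smallest rank."""
    return 1 + sum(x < sample[k] for x in sample)

def rank_pmf_bernoulli(p, n=3, k=0):
    """Exact pmf of the rank of X_k for n iid Bernoulli(p) draws."""
    pmf = {}
    for sample in product((0, 1), repeat=n):
        prob = 1.0
        for x in sample:
            prob *= p if x == 1 else 1.0 - p
        r = min_rank(sample, k)
        pmf[r] = pmf.get(r, 0.0) + prob
    return pmf

pmf_half = rank_pmf_bernoulli(0.5)
pmf_high = rank_pmf_bernoulli(0.9)
print(pmf_half)
print(pmf_high)
```

Under this convention $P(\text{rank} = 1) = (1 - p) + p^3$, so the rank distribution changes with $p$, which is consistent with the need for a continuity caveat.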

Thanks and regards!

Tim
  • This fact, and more, is a theorem of Renyi. Note that you do need some caveat regarding continuity of the distribution, but you don't really need to do the reduction to the uniform distribution that you've done. It follows from symmetry considerations. :-) – cardinal Mar 09 '13 at 19:42
  • @cardinal: Thanks! Which theorem of Renyi? (References are also appreciated!) Why is the condition about continuity of the cdf instead of invertibility of the cdf? Is it correct that continuity implies invertibility? – Tim Mar 09 '13 at 19:56
  • @cardinal: I searched among some books on the internet, but couldn't find a theorem by Renyi regarding the distribution-free property of rank statistics. Could you point me somewhere? Thanks! – Tim Mar 10 '13 at 02:26
  • This may be useful: http://www.math.montana.edu/jobo/thainp/rankstat.pdf – Martin Van der Linden May 20 '19 at 19:02

1 Answer

I've only ever heard the term distribution-free used to describe inference, test statistics, and asymptotic properties obtained from limit theorems. For continuous distributions, the empirical distribution function provides a one-to-one map between the observed values and their sample percentiles. It's important to recognize that the rank statistic does not use the unobserved distribution function $F$, but the empirical DF, denoted $\hat{\mathbb{F}}$. This estimated function depends on the observed sample, so perturbing the value of $X_k$ will influence the empirical DF $\hat{\mathbb{F}}$ and hence the normalized rank $U_i = \hat{\mathbb{F}}(X_i) = \hat{\mathbb{F}}_{X_1, \ldots, X_n}(X_i)$.
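As a small illustration of the link between the empirical DF and ranks, the sketch below assumes the usual definition $\hat{\mathbb{F}}(x) = \frac{1}{n} \#\{j : X_j \leq x\}$ and a sample without ties; the function name `ecdf` and the data values are made up for the example. Evaluating $\hat{\mathbb{F}}$ at each observation and multiplying by $n$ should then recover the ranks.

```python
def ecdf(sample):
    """Return the empirical distribution function of the sample."""
    n = len(sample)
    def F_hat(x):
        # Fraction of observations less than or equal to x.
        return sum(v <= x for v in sample) / n
    return F_hat

sample = [2.7, 0.4, 1.9, 3.3]  # hypothetical data, no ties
F_hat = ecdf(sample)

# n * F_hat(X_i) is the rank of X_i when there are no ties.
ranks = [round(len(sample) * F_hat(x)) for x in sample]
print(ranks)  # [3, 1, 2, 4]
```

Changing any single observation changes `F_hat` itself, which is the sense in which the ranks depend on the whole observed sample rather than on the unknown $F$.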

AdamO