6

I recently came across a mention of first- and second-order U-type statistics in an article, with no further detail.

Does anyone know what U-type statistics are?
References will be highly appreciated.

gui11aume
  • 5
    Do you mean in this sense? – whuber Jul 25 '12 at 19:42
  • whuber has found the Wikipedia description of U-statistics, which come up in nonparametric statistics and were originally introduced by Hoeffding. That is what I assumed you meant too when I saw the question. I don't think the term "U-type" is common, though. – Michael R. Chernick Jul 25 '12 at 20:26
  • It looks like it. Any hint at what first and second order could be? – gui11aume Jul 25 '12 at 20:30
  • 2
    Probably averages over functions taking one argument vs. averages over all pairs for functions taking two arguments. A. van der Vaart's Asymptotic Statistics has a chapter that provides a lovely introduction to this topic. – cardinal Jul 25 '12 at 20:43
  • @cardinal yes that would make sense in the context. I will look it up. Thanks! – gui11aume Jul 25 '12 at 20:46

2 Answers

3

From the comments and the other answer, I gather that "U-type statistics" is jargon for "U-statistics".

Here are a few elements taken from the reference provided by @cardinal and from the other answer. A U-statistic of degree (or order) $r$ is based on a permutation-symmetric kernel function $h$ of arity $r$,

$$ h(x_1, ..., x_r): \mathbb{X}^r \rightarrow \mathbb{R}, $$

and is the average of that function taken over all subsets of $r$ observations from the sample. More formally,

$$ U = \frac{1}{\binom{n}{r}} \sum_{\pi \in \Pi_r(n)} h(x_{\pi_1}, \dots, x_{\pi_r}), $$

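where the sum is taken over $\Pi_r(n)$, the set of all unordered subsets of size $r$ chosen from $\{1, \dots, n\}$. The interest of U-statistics is that, for i.i.d. observations, they are asymptotically Gaussian provided $E \{ h^2(X_1, \dots, X_r) \} < \infty$: with $\theta = E \{ h(X_1, \dots, X_r) \}$, the quantity $\sqrt{n}(U - \theta)$ converges in distribution to a centered normal (whose variance may be zero in degenerate cases).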

Example 1: The sample mean is a first-order U-statistic with $h(x) = x$.
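The definition can be checked numerically with a brute-force implementation. The following is only a minimal sketch: the function name `u_statistic` and the toy data are made up for illustration, and enumerating all subsets is only practical for small samples. Besides the sample mean, it uses the kernel $h(x_1, x_2) = (x_1 - x_2)^2 / 2$, which is not in the text above but is a standard example that yields the unbiased sample variance.

```python
from itertools import combinations
from math import comb
from statistics import mean, variance


def u_statistic(sample, kernel, r):
    """Average a symmetric kernel of arity r over all size-r subsets of the sample."""
    n = len(sample)
    if not 1 <= r <= n:
        raise ValueError("need 1 <= r <= len(sample)")
    total = sum(kernel(*subset) for subset in combinations(sample, r))
    return total / comb(n, r)


# Toy data, chosen arbitrarily for illustration.
data = [2.0, -1.0, 3.5, 0.5, 4.0]

# Order 1: h(x) = x gives the sample mean.
u_mean = u_statistic(data, lambda x: x, 1)

# Order 2: h(x1, x2) = (x1 - x2)^2 / 2 gives the unbiased sample variance.
u_var = u_statistic(data, lambda a, b: (a - b) ** 2 / 2, 2)

print(u_mean, mean(data))
print(u_var, variance(data))
```

Each printed pair should agree up to floating-point rounding.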

Example 2: The signed rank statistic is a second-order U-statistic with $h(x_1, x_2) = 1_{\mathbb{R}^+}(x_1+x_2)$ (the function that is equal to $1$ if $x_1 + x_2 > 0$, and $0$ otherwise).

$$ U = \frac{1}{\binom{n}{2}} \sum_{i=1}^{n-1} \sum_{j=i+1}^n 1_{\mathbb{R}^+}(x_i+x_j) $$

is the proportion of pairs $(x_i, x_j)$ from the sample with positive sum $x_i+x_j > 0$, and can be used as a test statistic for investigating whether the distribution of the observations is centered at 0.
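As a quick illustration of Example 2, the statistic can be computed by looping over all pairs $i < j$. This is a sketch only: the function name `positive_pair_fraction` and the data are hypothetical. For a continuous distribution symmetric about 0, the expected value of this statistic is $1/2$, so values far from $1/2$ point away from that hypothesis.

```python
from itertools import combinations
from math import comb


def positive_pair_fraction(sample):
    """Fraction of pairs (x_i, x_j), i < j, whose sum is strictly positive."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    hits = sum(1 for a, b in combinations(sample, 2) if a + b > 0)
    return hits / comb(n, 2)


# Toy data, chosen arbitrarily for illustration.
data = [1.2, -0.4, 0.9, -2.0, 0.3]
u = positive_pair_fraction(data)
print(u)  # 5 of the 10 pairs have a positive sum, so this prints 0.5
```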

Example 3: The domain $\mathbb{X}$ of $h$ need not be the real line. Kendall's $\tau$ statistic is a second-order U-statistic with $h((x_1, y_1), (x_2, y_2)) = 2 \cdot 1_{\mathbb{R}^+}((y_2-y_1)(x_2-x_1)) - 1$.

$$ \tau = \frac{2}{\binom{n}{2}} \sum_{i=1}^{n-1} \sum_{j=i+1}^n 1_{\mathbb{R}^+}((y_j-y_i)(x_j-x_i)) - 1 $$

is a measure of dependence between $X$ and $Y$. It is based on the proportion of concordant pairs $(x_i, y_i)$ and $(x_j, y_j)$ in the observations, rescaled to lie in $[-1, 1]$.
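The displayed formula for $\tau$ translates directly into code: count the concordant pairs, divide by the number of pairs, then rescale. The sketch below assumes there are no ties in either coordinate (with ties, library implementations typically apply a correction, so results can differ). The function name `kendall_tau_u` and the data are made up for illustration.

```python
from itertools import combinations
from math import comb


def kendall_tau_u(xs, ys):
    """Kendall's tau as 2 * (fraction of concordant pairs) - 1, assuming no ties."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    points = list(zip(xs, ys))
    n = len(points)
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (y2 - y1) * (x2 - x1) > 0
    )
    return 2 * concordant / comb(n, 2) - 1


# Toy data: 5 of the 6 pairs are concordant, so tau = 2 * 5/6 - 1 = 2/3.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau = kendall_tau_u(xs, ys)
print(tau)
```

A perfectly increasing relationship gives $\tau = 1$ and a perfectly decreasing one gives $\tau = -1$.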

gui11aume
2

We have established that U-statistics are what the OP is looking for. I will address his second question, about the order of U-statistics. The theory of U-statistics can be found in many books on nonparametrics, and I am sure also in the various statistical encyclopedias. Here is a nice article by Tom Ferguson that summarizes the theory. I think it is actually a class tutorial on it. Here is what he says about order; the rest you can find in the paper.

5. Degeneracy. When using U-statistics for testing hypotheses, it occasionally happens that at the null hypothesis, the asymptotic distribution has variance zero. This is a degenerate case, and we cannot use Theorem 2 to find approximate cutoff points. The general definition of degeneracy for a U-statistic of order $m$ and variances, $\sigma_1^2 \leq \sigma_2^2 \leq \dots \leq \sigma_m^2$ given by (19) is as follows. Definition 3. We say that a U-statistic has a degeneracy of order $k$ if $\sigma_1^2 = \cdots = \sigma_k^2 = 0$ and $\sigma^2_{k+1} > 0$.

http://www.math.ucla.edu/~tom/Stat200C/Ustat.pdf
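To make the definition concrete, here is a small worked example that is not taken from the notes. It assumes the usual notation $h_c(x_1, \dots, x_c) = E \{ h(x_1, \dots, x_c, X_{c+1}, \dots, X_m) \}$ and $\sigma_c^2 = \mathrm{Var} \{ h_c(X_1, \dots, X_c) \}$, which is presumably what (19) refers to. Take the kernel $h(x_1, x_2) = x_1 x_2$ of order $m = 2$, with i.i.d. observations such that $E \{ X \} = \mu$ and $\mathrm{Var} \{ X \} = \sigma^2 \in (0, \infty)$. Then

$$ h_1(x) = E \{ h(x, X_2) \} = \mu x, \qquad \sigma_1^2 = \mu^2 \sigma^2, \qquad \sigma_2^2 = \mathrm{Var} \{ X_1 X_2 \} = (\sigma^2 + \mu^2)^2 - \mu^4 = \sigma^4 + 2 \mu^2 \sigma^2. $$

Under the null hypothesis $\mu = 0$ this gives $\sigma_1^2 = 0$ and $\sigma_2^2 = \sigma^4 > 0$, so the U-statistic has a degeneracy of order 1 in the sense of Definition 3, and the usual normal approximation at rate $\sqrt{n}$ has variance zero.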

  • @gui11aume Thanks for the nice editing job. I just fixed one thing (an extra 1 at the end after k+1). – Michael R. Chernick Jul 25 '12 at 21:19
  • 1
    @gui11aume Do you know Tom Ferguson? He was a UCLA professor back in the late 1970s, when I was a graduate student. At Stanford we used his book in our graduate math stat course. It was a really good text and I think he writes very well. – Michael R. Chernick Jul 26 '12 at 18:14