Population version of Kendall's tau

Question

I am reading about concordance and Kendall's tau. The empirical formula for Kendall's tau is given by $$ t = \frac{c-d}{c+d},$$ where $c$ and $d$ are the numbers of concordant pairs and discordant pairs in the sample.

The population version of the equation is given in Nelsen's "Introduction to Copulas" as follows: $$ \tau_{X,Y} = P[(X_1-X_2)(Y_1-Y_2)>0] - P[(X_1-X_2)(Y_1-Y_2)<0], $$ where $(X_1,Y_1)$ and $(X_2,Y_2)$ are i.i.d. random vectors each with joint distribution $H$.

This makes sense. Essentially, if we'd like to measure Kendall's tau for a bivariate distribution, we would generate two bivariate random vectors which are samples from that distribution $H$ and then calculate the probabilities as specified.
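To make this reading concrete, here is a minimal Monte Carlo sketch. The bivariate normal choice for $H$, the correlation value $0.5$, and the function names are assumptions made for illustration only; none of them come from Nelsen's text. For a bivariate normal with correlation $\rho$, the population value is known to be $\tau = \frac{2}{\pi}\arcsin\rho$, which gives something to compare against.

```python
import numpy as np

def sample_bivariate_normal(rho, n, rng):
    """Draw n points from a standard bivariate normal with correlation rho
    (an illustrative stand-in for the joint distribution H)."""
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)

def population_tau_mc(rho, n=200_000, seed=0):
    """Estimate P[concordant] - P[discordant] from n independent pairs
    ((X1, Y1), (X2, Y2)), both members drawn from the same H."""
    rng = np.random.default_rng(seed)
    first = sample_bivariate_normal(rho, n, rng)   # copies of (X1, Y1)
    second = sample_bivariate_normal(rho, n, rng)  # copies of (X2, Y2)
    prod = (first[:, 0] - second[:, 0]) * (first[:, 1] - second[:, 1])
    return np.mean(prod > 0) - np.mean(prod < 0)

tau_hat = population_tau_mc(0.5)
tau_exact = 2 / np.pi * np.arcsin(0.5)  # closed form for the bivariate normal
print(tau_hat, tau_exact)
```

The two printed numbers should agree up to Monte Carlo error.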

Nelsen then defines a "concordance function" $Q$. It is defined as the difference in probabilities of concordance and discordance between two vectors $(X_1,Y_1)$ and $(X_2,Y_2)$ of continuous random variables with (possibly) different joint distributions $H_1$ and $H_2$, but with common margins $F$ and $G$.

$$ Q = P[(X_1-X_2)(Y_1-Y_2)>0] - P[(X_1-X_2)(Y_1-Y_2)<0],$$ where $(X_1,Y_1)$ and $(X_2,Y_2)$ are independent random vectors with joint distributions $H_1(x,y)$ and $H_2(x,y)$, respectively.

What is the intuitive meaning of the concordance function if $H_1$ is not equal to $H_2$? Are we measuring the concordance between two different distribution functions?

score 1 · Answer 1 · answered Mar 09 '18 at 17:10

Yes, we are measuring the difference between the probabilities of concordance and discordance for two observations coming from different distributions. I write $ Q(C_1,C_2) $ because it turns out that $ Q $ depends on $ H_1 $ and $ H_2 $ only through their copulas. I think $ Q(C_1,C_2) $ does not give you much information unless one of the copulas serves as a "reference" copula.
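For reference, the dependence on the copulas can be written explicitly. With $ C_1 $ and $ C_2 $ the copulas of $ (X_1,Y_1) $ and $ (X_2,Y_2) $, Nelsen's book gives $$ Q(C_1,C_2) = 4\iint_{[0,1]^2} C_2(u,v)\,dC_1(u,v) - 1, $$ and $ Q $ is symmetric in its arguments. Taking $ C_1 = C_2 = C $ recovers Kendall's tau: $ \tau = Q(C,C) $.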

For example, if you have a pair $ (X,Y) $ with copula $ C $, you may have $ Q(C,M) $ near $1$, where $ M $ is the comonotonicity copula. This means the probability that an observation of your pair $ (X,Y) $ is concordant with an observation of a pair $ (X',Y') $ with the same marginals but comonotone components (each an increasing function of the other) is much higher than the probability of discordance, hence $ Y $ tends to be higher when $ X $ is higher.
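A small simulation can illustrate $ Q(C,M) $. This is only a sketch under assumptions of mine: $ C $ is taken to be a Gaussian copula with parameter `rho`, the margins are standard normal (which is harmless, since $ Q $ depends only on the copulas), and the function names are made up for this example. Two known reference values from Nelsen are $ Q(\Pi,M) = 1/3 $ and $ Q(M,M) = 1 $, so the estimate should start near $1/3$ at `rho = 0` and climb towards $1$ as `rho` approaches $1$.

```python
import numpy as np

def concordance_q(pairs_1, pairs_2):
    """Monte Carlo estimate of Q: row i of pairs_1 is compared with row i of
    pairs_2, and the two arrays must be sampled independently of each other."""
    prod = (pairs_1[:, 0] - pairs_2[:, 0]) * (pairs_1[:, 1] - pairs_2[:, 1])
    return np.mean(prod > 0) - np.mean(prod < 0)

def gaussian_pairs(rho, n, rng):
    """Observations of (X, Y): standard normal margins, Gaussian copula (assumed)."""
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)

def comonotone_pairs(n, rng):
    """Observations of (X', Y') with copula M: same margins, and Y' = X'."""
    z = rng.standard_normal(n)
    return np.column_stack([z, z])

rng = np.random.default_rng(1)
n = 200_000
q_by_rho = {
    rho: concordance_q(gaussian_pairs(rho, n, rng), comonotone_pairs(n, rng))
    for rho in (0.0, 0.5, 0.9)
}
for rho, q in q_by_rho.items():
    print(f"rho = {rho}: Q(C, M) is approximately {q:.3f}")
```

The estimates increase with `rho`, which matches the interpretation above: the closer $ C $ is to $ M $, the closer $ Q(C,M) $ gets to $1$.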

Another example: $ Q(C,\Pi)>0 $, where $ \Pi $ is the product copula. Then the probability that an observation of your pair is concordant with (lies in the first or third quadrant relative to) a random observation of the pair $ (X',Y') $, where $ X' $ and $ Y' $ are distributed as $X$ and $Y$ but independent, is greater than the probability of discordance. So $ X $ and $ Y $ have to be somehow positively dependent. This is the idea behind Spearman's rho rank correlation coefficient.
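The link to Spearman's rho is exact: Nelsen shows that $ \rho_S = 3\,Q(C,\Pi) $. The sketch below checks this numerically under assumptions of mine (a Gaussian copula with parameter $0.6$, standard normal margins, illustrative function names). It estimates $ Q(C,\Pi) $ by pairing each observation of $ (X,Y) $ with an independent observation of $ (X',Y') $ that has independent components, then compares $ 3Q $ with the ordinary rank-based sample Spearman coefficient.

```python
import numpy as np

def concordance_q(pairs_1, pairs_2):
    """Monte Carlo estimate of Q from two independently sampled arrays of pairs."""
    prod = (pairs_1[:, 0] - pairs_2[:, 0]) * (pairs_1[:, 1] - pairs_2[:, 1])
    return np.mean(prod > 0) - np.mean(prod < 0)

def spearman_rho(x, y):
    """Sample Spearman's rho as the Pearson correlation of the ranks
    (ties have probability zero for continuous data)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]

rng = np.random.default_rng(2)
n = 200_000
rho = 0.6  # Gaussian copula parameter, chosen only for illustration

# (X, Y) with copula C; (X', Y') with the same margins but independent components.
xy = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
xy_indep = rng.standard_normal((n, 2))

q_c_pi = concordance_q(xy, xy_indep)
rho_s_from_q = 3 * q_c_pi
rho_s_sample = spearman_rho(xy[:, 0], xy[:, 1])
print(rho_s_from_q, rho_s_sample)
```

Both numbers should be positive and close to each other, up to Monte Carlo error.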

Of course, now that you have some information about $ C $ you can use that copula as a "reference" as well.
