How do I measure the performance of a model when only partial ordering matters?

Question

I have a number of input data points $\{ x_0, x_1, \ldots, x_n \}$ and their corresponding outputs $\{ y_0, y_1, \ldots, y_n \}$.

I also have two models $f, g : X \rightarrow Y$ that seek to estimate a partial ordering of the true outputs.

That is to say that the model $f$ is perfect if $x_i, x_j \in X, x_i < x_j \implies f(x_i) < f(x_j)$. (Apologies if the mathematical definition is off -- I mean that if two inputs have an ordering, that ordering is maintained after being transformed by the function $f$.)

I'm looking for an approach that assigns a score from $0$ to $1$ to each model, where $0$ means no orderings are preserved and $1$ means all orderings are preserved.

My questions:

Does such an approach exist and is it useful?
Is this a silly thing to measure? (ie. am I missing something obvious)

NB: I can add more information if necessary. I've tried to abstract away the cruft but if more details help I can definitely provide them. Thanks for your time.

Note that the rank correlations proposed by Arya are a great start, but a value of -1 means that you have perfect anticorrelation: $x_i<x_j \Longleftrightarrow f(x_i)>f(x_j)$. (From such an anticorrelation you can get to perfect correlation by considering $-f$.) You may want to think about this in the context of your requirement of $0$ as a value for a "no orderings" model. — Stephan Kolassa, Apr 10 '21 at 13:47
Thank you @StephanKolassa, I appreciate the note. In this case it actually works well for me as such an anticorrelation is the worst possible outcome; I was mistaken to say that no orderings would be the bottom limit. — sdasdadas, Apr 10 '21 at 18:18

Arya McCarthy · Accepted Answer · 2021-04-10T13:51:38.260

If you're okay with the values being in $[-1, 1]$ instead of $[0, 1]$, then there are several rank correlation measures out there. (It's trivial to rescale this to $[0, 1]$ if that's a crucial desideratum of your measure. $-1$ means perfect anticorrelation.) Goodman and Kruskal's $\gamma$ is a good fit; an example work using it is here.

Many rank correlation measures are based on pair-counting: how many $(i, j)$ pairs are concordant ($N_s$, agreeing in relative rank) and how many are discordant ($N_d$, disagreeing).

Unlike Kendall's $\tau$, Goodman and Kruskal's $\gamma$ divides by the number of concordant and discordant pairs. The denominator excludes pairs where $x_i$ and $x_j$ have the same rank, which happens when there's a partial order. It means that the maximum will truly be 1 when your partial order is exactly matched.

$$G = \frac{N_s - N_d}{N_s + N_d}$$

For Kendall's $\tau$, 1 would be unachievable in the presence of a partial order.

There is one limitation, though, to pair-counting measures like these. If $x_i$ and $x_j$ have the same rank, we will effectively ignore the rank applied to $f(x_i)$ and $f(x_j)$. For both $\gamma$ and $\tau$, given a list $\vec{x}$, it's possible that $f$ produces a total order that achieves the maximum possible correlation.

You never did anything with $g$, so I'm assuming it's just there as a model to compete against $f$.

This is wonderful and exactly what I was looking for, thank you! — sdasdadas, Apr 09 '21 at 22:54

How do I measure the performance of a model when only partial ordering matters?

1 Answers1