Introduction
Let's consider a less direct approach that focuses on order itself rather than trying to average the ranks in some sense.
Many aggregation functions, although not all, are non-monotonic. Some, such as summation or scalar multiplication, are permutation invariant. See Just Plugging in Ranks for a couple of examples. But aggregating this way may throw away information about order and about the preferences of individual judges. So let's consider another approach.
I alluded to an approach similar to the one below in Beyond Studying Order With Ranks.
Starter Kit for a Bayesian Model
Suppose $D=(V, E(V))$ is a random directed graph where each edge is a random indicator variable
$$E_{ik}=\mathbb{1}\left[ (i,k) \in E(V) \right].$$
The rankings in the data encode an ordering. Writing $r_{ij}$ for the rank that judge $j$ gives item $i$, whenever $r_{ij} \leq r_{kj}$ we can treat $(i,k)$ as an observed edge for that judge.
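To make the edge construction concrete, here is a minimal sketch. The function name `observed_edges` and the layout `ranks[j][i]` (judge `j`, item `i`) are my own illustrative choices, and I am assuming self-loops are left out even though $r_{ij} \leq r_{ij}$ holds trivially.

```python
from itertools import permutations


def observed_edges(ranks):
    """Return, for each judge, the set of edges (i, k) with r_ij <= r_kj.

    `ranks[j][i]` is the rank judge j gave item i (hypothetical layout).
    """
    edges_by_judge = []
    for judge_ranks in ranks:
        n_items = len(judge_ranks)
        edges = {
            (i, k)
            for i, k in permutations(range(n_items), 2)  # assumption: no self-loops
            if judge_ranks[i] <= judge_ranks[k]
        }
        edges_by_judge.append(edges)
    return edges_by_judge


# Two judges ranking three items labelled 0, 1, 2.
ranks = [
    [1, 2, 3],  # judge 0
    [2, 1, 3],  # judge 1
]
edges = observed_edges(ranks)
```

Note that tied ranks produce edges in both directions, which is one way a cycle can show up even within a single judge's data.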
Developing a Bayesian model, we can assign each edge indicator a Bernoulli likelihood:
$$E_{ik} \sim \text{Bernoulli}(p_{ik})$$
where
$$\text{logit} ( p_{ik} ) = \sum_{j=1}^m \beta_{ikj}\mathbb{1}(\text{Judge}=j) + \gamma_{ik}$$
and
$$\beta_{ikj} \sim \mathcal{N}(0,1)$$
$$\gamma_{ik} \sim \mathcal{N}(0,1)$$
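Because the judge indicators always sum to one, they are collinear with the intercept $\gamma_{ik}$. You could potentially drop one of the judges, treating it as the reference level, if this multicollinearity is a problem.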
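Before fitting anything, it can help to simulate from the priors to see what kind of digraphs the model generates. The sketch below only draws from the prior using the standard library; it is not the posterior sampler, and the function name and arguments are hypothetical. I am also assuming no self-loops.

```python
import math
import random


def sample_prior_digraph(n_items, n_judges, judge, rng):
    """Draw one digraph for a given judge from the priors stated above."""
    edges = set()
    for i in range(n_items):
        for k in range(n_items):
            if i == k:
                continue  # assumption: no self-loops
            # beta_{ikj} ~ N(0, 1) for every judge, gamma_{ik} ~ N(0, 1)
            beta = [rng.gauss(0.0, 1.0) for _ in range(n_judges)]
            gamma = rng.gauss(0.0, 1.0)
            # The indicator 1(Judge = j) selects this judge's coefficient.
            logit_p = beta[judge] + gamma
            p = 1.0 / (1.0 + math.exp(-logit_p))  # inverse logit
            if rng.random() < p:  # E_ik ~ Bernoulli(p_ik)
                edges.add((i, k))
    return edges


rng = random.Random(0)
sample = sample_prior_digraph(n_items=4, n_judges=3, judge=1, rng=rng)
```

With standard normal priors on both terms, the logit has variance two, so prior edge probabilities are spread fairly widely around one half.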
From here you can go NUTS with sampling from the posterior and follow Bayesian workflow if you encounter challenges. Once you have posterior samples of this relation you'll be ready for the next step.
Post-Sample Processing
Getting Back to Order
The next step is to deal with the fact that your sampled digraphs may not have edge sets isomorphic to a partial order. This may be due to judges genuinely disagreeing with each other, leading to non-transitivity. But we can post-process our samples to obtain something of a "consensus in order" which I will make clear below.
Let's go from directed graphs to directed acyclic graphs in a way that tries to preserve the ordering of the arrows. This can be done with graph condensation, which contracts each strongly connected component into a single node and can be visualized like this picture
[Figure: a sampled digraph (blue nodes) and its condensation (yellow nodes)]
where the blue-node digraph is a sample from our posterior and the yellow is the condensation of that sample.
Next we find the reachability relation via the transitive closure, which is relevant because it preserves the order of the arcs. In effect we have taken a random relation and extracted the part of it that behaves like a random partial order.
Here are some implementations of these transformations:
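As one illustration, here is a minimal pure-Python sketch of both steps. The function names are my own, and I have used Kosaraju's algorithm to find the strongly connected components, which is just one of several options. Graph libraries such as NetworkX also provide `condensation` and `transitive_closure` functions that you would more likely reach for in practice.

```python
def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm; returns a dict mapping node -> component id."""
    succ = {v: [] for v in nodes}
    pred = {v: [] for v in nodes}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    # First pass: record nodes in order of depth-first finishing time.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(succ[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:  # all successors explored, so this node is finished
                order.append(node)
                stack.pop()

    # Second pass: sweep the reversed graph in decreasing finishing time.
    component, n_components = {}, 0
    for root in reversed(order):
        if root in component:
            continue
        component[root] = n_components
        todo = [root]
        while todo:
            node = todo.pop()
            for prev in pred[node]:
                if prev not in component:
                    component[prev] = n_components
                    todo.append(prev)
        n_components += 1
    return component


def condensation(nodes, edges):
    """Contract each strongly connected component into one node of a DAG."""
    component = strongly_connected_components(nodes, edges)
    dag_edges = {
        (component[u], component[v])
        for u, v in edges
        if component[u] != component[v]
    }
    return component, dag_edges


def transitive_closure(dag_nodes, dag_edges):
    """Return every pair (u, v) where v is reachable from u by a nonempty path."""
    succ = {c: set() for c in dag_nodes}
    for u, v in dag_edges:
        succ[u].add(v)
    closure = set()
    for source in succ:
        seen, todo = set(), list(succ[source])
        while todo:
            node = todo.pop()
            if node not in seen:
                seen.add(node)
                todo.extend(succ[node])
        closure |= {(source, target) for target in seen}
    return closure


# Items 0 and 1 point at each other, so they collapse into one node.
nodes = [0, 1, 2, 3]
edges = {(0, 1), (1, 0), (1, 2), (2, 3)}
component, dag_edges = condensation(nodes, edges)
order_relation = transitive_closure(set(component.values()), dag_edges)
```

The `component` mapping is worth keeping around, since it is what lets you map anything computed on the condensation back to the original items.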
Statistics on Partial Orders
These sampled partial orders can be graded, which provides a new set of ranks for groupings of items. Considering the preimage of the graph processing we've done, you can map grades back to the original items where items that shared a node in the condensation will have the same rank. These are in some sense "combined ranks" from the original ranks.
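Here is a small sketch of mapping grades back to items. I am assuming that the grade of a condensation node is the length of the longest chain ending at it, so minimal nodes get grade zero; other grading conventions are possible. The function name and the shape of its inputs (an item-to-node mapping plus the acyclic edge set) are hypothetical.

```python
def grade_items(component, dag_edges):
    """Give each item the grade of the condensation node it belongs to.

    Assumption: a node's grade is the length of the longest chain ending
    at it, so nodes with no predecessors have grade 0.
    """
    preds = {c: set() for c in component.values()}
    for u, v in dag_edges:
        preds[v].add(u)

    grade = {}

    def longest_chain(node):
        # Memoized recursion; fine for small graphs, and it terminates
        # because the condensation is acyclic.
        if node not in grade:
            grade[node] = 1 + max(
                (longest_chain(p) for p in preds[node]), default=-1
            )
        return grade[node]

    return {item: longest_chain(c) for item, c in component.items()}


# Items "a" and "b" shared a node in the condensation, so they share a grade.
component = {"a": 0, "b": 0, "c": 1, "d": 2}
dag_edges = {(0, 1), (1, 2)}
grades = grade_items(component, dag_edges)
```

Running this on the transitive closure instead of the condensation's own edges gives the same grades, because adding reachability edges does not change the longest chain.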
On each graded partial order you can compute the grade entropy. Section 3.5 of Seilis 2022 describes computing the grade entropy, which is just the entropy of the grades, as a way of quantifying how close the partial order is to being either a trivial order or a total order. Doing this for each graded partial order gives a posterior distribution of grade entropies representing the uncertainty in the totality of the order. If the judges tend to closely agree with each other in the ordering then the grade entropy will tend to be close to one (assuming it is normalized to lie between zero and one). If there's complete disagreement then the grade entropies will be concentrated near zero. And there are of course many possibilities in between.
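A minimal sketch of the entropy calculation follows. Dividing by the log of the number of items, so that a total order scores one, is my assumption about the normalization rather than something taken from the cited section, and the function name is hypothetical.

```python
import math
from collections import Counter


def grade_entropy(grades, normalize=True):
    """Shannon entropy of the empirical distribution of grades.

    Assumption: when `normalize` is true, divide by log(n) for n items so
    that a total order (all grades distinct) scores 1.
    """
    counts = Counter(grades)
    n = len(grades)
    entropy = sum(-(c / n) * math.log(c / n) for c in counts.values())
    if normalize and n > 1:
        entropy /= math.log(n)
    return entropy


total_order = grade_entropy([0, 1, 2, 3])    # every item has its own grade
trivial_order = grade_entropy([0, 0, 0, 0])  # all items share one grade
in_between = grade_entropy([0, 0, 1, 2])
```

Applying this to the grades from each posterior sample gives one entropy per sample, which is the posterior distribution of grade entropies described above.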
Summary
- By assuming that the ranks were reflective of some random ordering that depended on the judges, we can create a probabilistic model over random graphs.
- The sampled graphs can be processed to extract partial orders reflecting what is ordered in the original graphs.
- Various mathematical functions, such as the grade entropy, can tell you how strongly ordered the judges' preferences are.