When can I average the averages?

Question

Let me get straight to my case. I'm building a matchmaking system that produces groups of players. Each player has a weight derived from their skill and experience. A group is good if the differences among its members' weights are small (meaning their weights are close to one another). I calculate the weight differences between the players in the same group and average them, which gives a number that measures how close the weights of the group's members are. For example, suppose a matchmaking run produced these groups:

Group A

Player 1 weight: 25
Player 2 weight: 30
Player 3 weight: 10

Then the average weight difference among the players is: (abs(25-30) + abs(25-10) + abs(30-25) + abs(30-10) + abs(10-25) + abs(10-30)) / 6 = (5 + 15 + 5 + 20 + 15 + 20) / 6 ≈ 13.33.

Group B

Player 1 weight: 15
Player 2 weight: 40

Then the average weight difference among the players is: (abs(15-40) + abs(40-15)) / 2 = (25 + 25) / 2 = 25.

Please note that the groups produced by the matchmaking can have different numbers of members. That is up to the system.

Then, to evaluate all groups, I simply average those averages: (13.33 + 25) / 2 = 19.165. My question is: is that the correct way to evaluate all groups? Or should I average the un-averaged differences across all groups, like below?

(5 + 15 + 5 + 20 + 15 + 20 + 25 + 25) / 8 = 16.25

Of course the results are different, since the first way gives equal weight to every group and the second way gives equal weight to every difference.
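To make the two options concrete, here is a minimal Python sketch that reproduces both calculations from the example groups. The helper name `pairwise_abs_diffs` is made up for illustration, and it counts every ordered pair (so each difference appears twice), which is an assumption taken from the arithmetic in the question.

```python
from itertools import permutations

# Player weights taken from the example groups in the question.
groups = {
    "A": [25, 30, 10],
    "B": [15, 40],
}


def pairwise_abs_diffs(weights):
    """Absolute differences over all ordered pairs of players in one group."""
    return [abs(a - b) for a, b in permutations(weights, 2)]


def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


diffs = {name: pairwise_abs_diffs(w) for name, w in groups.items()}

# Option 1: average the per-group averages (every group counts equally).
group_means = {name: mean(d) for name, d in diffs.items()}
average_of_averages = mean(list(group_means.values()))

# Option 2: pool all differences (every difference counts equally).
all_diffs = [d for group_diffs in diffs.values() for d in group_diffs]
pooled_average = mean(all_diffs)

print(group_means)
print(average_of_averages)
print(pooled_average)
```

Without intermediate rounding, the first option comes out at about 19.17 (the 19.165 above comes from rounding 13.33 first), and the second at 16.25.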

I have a feeling that it's better to give equal weight to every group regardless of how many members it has, because the matchmaking system is what combines the members, and the number of members in each group is up to the system. Whether the groups end up with equal or different numbers of players is the algorithm's decision and responsibility. What do you think? Which way is correct, and why?

Sorry if anything in my explanation is unclear. Thank you.

@mkt To find out whether the matchmaking system is good or not. I designed the system to group players based on their skill and experience. Simply put, I want skilled and experienced players to meet people like them, and unskilled and inexperienced players to meet people like them too. The smaller the average, the better. — Sena, Aug 04 '23 at 07:49
@mkt For more context, I designed several matchmaking systems, and I want to find out which one is the best. — Sena, Aug 04 '23 at 07:51

score 2 · Accepted Answer · answered Aug 04 '23 at 07:53

You can do it when the groups have the same size. Say you have two groups of three values, then

$$\begin{align} \frac{1}{2}\Big(\frac{x_1 + x_2 + x_3}{3} + \frac{y_1 + y_2 + y_3}{3}\Big) = \frac{x_1}{6} + \frac{x_2}{6} + \frac{x_3}{6} + \frac{y_1}{6} + \frac{y_2}{6} + \frac{y_3}{6} \end{align}$$

vs

$$\begin{align} \frac{1}{2}\Big(\frac{x_1 + x_2 + x_3}{3} + \frac{y_1 + y_2 + y_3 + y_4}{4}\Big) = \frac{x_1}{6} + \frac{x_2}{6} + \frac{x_3}{6} + \frac{y_1}{8} + \frac{y_2}{8} + \frac{y_3}{8} + \frac{y_4}{8} \end{align}$$

As you can see, in the second case the values are not equally weighted, so the average of averages differs from the average of all the raw values taken together. To recover the latter you would need a weighted average, with weights proportional to the group sizes.
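As a quick numerical check of this, the following Python sketch compares the plain average of averages, the size-weighted average of averages, and the pooled mean for one group of three and one group of four. The values in `x` and `y` are made up purely for illustration.

```python
import math

# Hypothetical values, chosen only for illustration.
x = [1.0, 2.0, 3.0]            # group of size 3
y = [10.0, 20.0, 30.0, 40.0]   # group of size 4


def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Mean of all raw values taken together.
pooled_mean = mean(x + y)

# Unweighted average of the two group means: each group counts equally.
plain_average_of_means = (mean(x) + mean(y)) / 2

# Weighted average of the group means, weights proportional to group size.
n_total = len(x) + len(y)
weighted_average_of_means = (
    (len(x) / n_total) * mean(x) + (len(y) / n_total) * mean(y)
)

print(pooled_mean, plain_average_of_means, weighted_average_of_means)
print(math.isclose(pooled_mean, weighted_average_of_means))
```

With these numbers the size-weighted version matches the pooled mean, while the plain average of averages does not.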

So the average of averages is the average of the group averages, rather than of the observations. I can imagine cases where calculating it makes sense, for example when you do not want the size of the groups to affect the result, so that smaller groups have the same impact as larger ones. You just need to keep the difference in mind when interpreting the result.

Sorry, if I'd like to go with the weighted average approach, is it correct to use the average? (13.33 * 3 + 25 * 2) / (3 + 2) = 17.99800 like that? Or should I use each difference directly? (5 * 3 + 15 * 3 + 5 * 3 + 20 * 3 + 15 * 3 + 20 * 3 + 25 * 2 + 25 * 2) / ((6 * 3) + (2 * 2)) = 15.4545455. 3 and 2 are the weights. — Sena, Aug 04 '23 at 09:20
@Sena if you have two groups of size 3 and 4, the weights would be 3/7 and 4/7 respectively. — Tim, Aug 04 '23 at 09:41
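For the numbers in the question, here is a hedged sketch of what the choice of weights does. It assumes the quantities being averaged are the pairwise differences, so "group size" can mean either the number of differences (6 and 2) or the number of players (3 and 2). The helper `weighted_mean` is a made-up name.

```python
import math

# Per-group mean absolute differences from the question (unrounded).
mean_a = 80 / 6   # Group A: 6 ordered differences among 3 players
mean_b = 25.0     # Group B: 2 ordered differences among 2 players


def weighted_mean(values, weights):
    """Weighted arithmetic mean; weights need not sum to 1."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Weights proportional to the number of differences in each group.
by_differences = weighted_mean([mean_a, mean_b], [6, 2])

# Weights proportional to the number of players in each group.
by_players = weighted_mean([mean_a, mean_b], [3, 2])

print(by_differences)
print(by_players)
```

Weighting by the number of differences gives back the pooled value of 16.25, while weighting by the number of players gives about 18 (the 17.998 in the comment above comes from using the rounded 13.33).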

score 0 · Answer 2 · answered Aug 04 '23 at 07:52

I second mkt's comments. What are you trying to accomplish?

You're comparing different weight distributions. There are many ways of doing so.

You could compare group means, group variances, skewness, or entropy, or check the KL divergence.

When you take the absolute difference, you're essentially measuring the dispersion within the distribution.

If you first measure dispersion and then average across groups, you have a measure of the average dispersion of the groups.

If you average all the differences together, you lose any notion of groups and simply have a measure of the dispersion of your "population".

The reason your averages are different is not the 1/6 or 1/2 weighting, but the absolute value, which makes your measure nonlinear. Had you used a linear operation in the numerator (e.g. a difference, a sum, etc.), averaging the averages would yield the same result.
