9

I created a web page that pulls in live Olympic medal results from Thomson Reuters and worldwide population counts from the CIA.

The results are interesting to me - Hungary has a double digit lead in gold medals over the rest of the world. Also, the USA and China are near the bottom in just about every category.

My question is - am I presenting the data in a fair manner? I took the largest population and gave each country a factor equal to that population divided by its own. The relative medal count columns are based on that factor.
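
As a minimal sketch of the factor described above, the following Python uses the three populations quoted in the comment thread (China, USA, Hungary). The function names are hypothetical, and scaling a medal count by multiplying it with the factor is an assumption about how the relative columns are built, not the page's actual code.

```python
# Populations as quoted in the comment thread (China, USA, Hungary).
POPULATIONS = {
    "China": 1_367_485_388,
    "USA": 321_368_864,
    "Hungary": 9_897_541,
}


def population_factors(populations):
    """Return the largest population divided by each country's population."""
    largest = max(populations.values())
    return {country: largest / pop for country, pop in populations.items()}


def relative_medals(medals, factor):
    """Scale an absolute medal count by the factor (assumed to be a plain product)."""
    return medals * factor


factors = population_factors(POPULATIONS)
for country, factor in factors.items():
    print(f"{country}: factor {factor:.2f}")
```

Because the factor is a constant divided by population, ranking countries by this relative count gives the same order as ranking them by medals per capita; only the scale differs.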

What column(s) could I add? What other factors could I add to present the fairest view? The absolute view is easy - Reuters does that. How can I create a fair view?

https://rack.pub/rio

  • 3
    At the moment this question is quite unclear. What does "double digit lead in gold" mean? When you say "created a factor for each country based on that", how was the factor created? Is this exercise essentially just working out "medals per capita", possibly rescaled in some way? – Silverfish Aug 13 '16 at 00:09
  • It's the ratio of the medal count to the population. If you hover over the column name there's a tool tip. – Matthew Drury Aug 13 '16 at 00:13
  • 2
    That's the reaction I get from everyone I share the view with. Maybe I am not explaining it well. The populations are China 1,367,485,388, USA 321,368,864, Hungary 9,897,541 so the factor would be 1 for China, 4.26 for USA, and 138.16 for Hungary. The double digit lead means what it says - the relative gold medal count is double the next closest country. – Ronnie Royston Aug 13 '16 at 00:17
  • Ahhh, that's very useful. I can see how that would be difficult to express succinctly. I apologize for my misinformation. – Matthew Drury Aug 13 '16 at 00:19
  • It's been a long time since I took Statistics but I know that sample size matters. So, calculating relative medal counts can easily produce misleading results - thus the filter checkbox. It would be nice to show a "confidence factor" that accounts for strength of sample size - but I could use some help with that formula / math. – Ronnie Royston Aug 13 '16 at 00:21
  • 5
    I don't think assessing the medal count relative to a country's population makes much sense. Do you think China & India 'should' be winning >1/3 of all medals? At any rate, this seems like a question for subject matter experts; it doesn't seem like a statistical question. – gung - Reinstate Monica Aug 13 '16 at 00:40
  • In the USA, many high schools are categorized by enrollment into A, AA, AAA, AAAA, etc. That way football games, for example, between schools are made fair. A tiny school playing a huge school is not considered fair. Yes, if China had, for example, 1/2 the world's population, then they should win 2x more medals than anyone else. Isn't that fair? If not, explain, ...that's why I asked the question. – Ronnie Royston Aug 13 '16 at 00:46
  • 6
    @RonRoyston One reason to suspect it isn't fair is that Olympic contests limit the number of athletes from each country. The details differ between sports, but it would be mathematically impossible for a country with 90% of the global population to get 90% of the medals for that reason - on many podiums they would be limited to one or at most two medals. So strict proportionality can't hold. – Silverfish Aug 13 '16 at 00:54
  • 3
    Consider a medal contest where only one team or individual per country can be entered. Supposing talent and training were uniformly distributed, one might expect Chinese athletes to form one sixth of the places in the world's top 100 in that sport, but a much lower proportion of Olympic competitors! – Silverfish Aug 13 '16 at 00:58
  • How can I account for that? Get # of athletes per country? Also, notice India has not won a single medal despite being nearly the same population size as China. – Ronnie Royston Aug 13 '16 at 01:08
  • In general the limit is three per event but many sports (for instance the team events) have a lower limit of one. – mdewey Aug 13 '16 at 09:59
  • @mdewey Not true. Many events can have a very large number of medals in the event of ties or special rulings by the officials. – Mark L. Stone Aug 13 '16 at 15:40
  • 2
    These comments suggest this question raises interesting and important statistical issues. – whuber Aug 13 '16 at 17:42
  • @MarkL.Stone I meant three competitors per event, not three medals. – mdewey Aug 13 '16 at 21:04
  • Could you please edit your post (including the title) to reflect the clarifications in your comments (specifically in relation to the fact that it's medals per person not actual medals that you're comparing) – Glen_b Aug 13 '16 at 23:07
  • @Glen_b Done. If you subject matter experts in Statistics could provide me a good formula, I would really appreciate it, and would apply it to the view/chart. – Ronnie Royston Aug 14 '16 at 00:18
  • 1
    I presume formulas would have to be based on the rate at which we "expect" medals should relate to population. Suitable formulas are actually likely to differ from sport to sport. In some cases good approximations might be found, perhaps relating to relations between sets of extreme value distributions, but in general this would be analysis requiring data, in which case numerous other important predictors of medal tally would need to be accounted for (e.g. resources relating to sport in a country, home field advantage, whether effective drug testing and penalties are being applied). ... ctd – Glen_b Aug 14 '16 at 00:37
  • 1
    ctd ... in other words I expect that any truly reasonable kind of approach to an overall medal tally is likely to be very involved, and yet also likely to be open to a variety of criticisms. Generally with medals-per-head totals you find some tiny country with a single gold wins that each time (but which country wins changes regularly). For example, in this Olympics, (tiny) Fiji won its first gold. Its population is 881000. Any smaller nation with a gold medal will beat it on your measure. If you restrict it to at least 5 gold (say), then the smallest country to win 5 gold will generally win. – Glen_b Aug 14 '16 at 00:39
  • Thanks Glen. The Jamaican bobsled team, for example. That is a good point. However, the summer Olympics seem more fair as most countries' populations have access to gyms, tracks, pools, etc. Regarding Fiji, every country has qualifying events / tryouts. The larger the population, the higher quality the qualifiers - that seems like common sense. – Ronnie Royston Aug 14 '16 at 02:45
  • Don't trust your common sense too much. Have you considered how much investment and how many incentives come from the countries? From the federations? Some 'organisations' do not give a *****, others use this event to promote themselves. Some have advanced technology and medical support for training, others don't. Some athletes are pro, others amateurs, depending on federation rules but also on incentives from the home country to focus or not on the event. You really should incorporate some money factor. – mic Aug 21 '16 at 13:19
  • Another interesting possibility would be to compute the rate of medals (gain) compared to the number of athletes sent to the Olympics (expected gain) to get some sort of "efficiency" of each delegation. – meduz Sep 05 '16 at 07:29
  • @meduz That's a great idea. I'll do it! ..Check back shortly. I found the data too here http://www.mapsofworld.com/sports/olympics/summer-olympics/participating-nations.html – Ronnie Royston Sep 05 '16 at 19:57

2 Answers

2

Smaller countries can get an advantage in two ways.

  • Systematic advantage because the number of athletes per country is limited.

    Large countries like the USA and China, whose populations are roughly 30 and 140 times that of Hungary, do not send a proportionally larger number of athletes or teams. For instance, in many team sports only one team per country competes, and in individual sports the number of entries per country is limited.

  • Stochastic advantage because variations are larger for smaller countries.

    The coefficient of variation for the number of medals will be smaller when the expected value is larger.

    For example, suppose every athlete rolls a six-sided die. Then the countries with the largest (but also the smallest) average roll will often be countries with fewer athletes. See the simulation below, with one hundred countries of 50 athletes and one hundred countries of 200 athletes.

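    [Figure: simulated average dice rolls per country, for countries with 50 athletes and countries with 200 athletes]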

    Idea for the image from vondj's YouTube video: Kleine Schulen sind besser! Lügen mit der gefährlichsten Formel der Welt
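
The dice example above can be reproduced with a short simulation. This is only a sketch using Python's standard library: the function name, the fixed seed, and the choice to look at the top and bottom 10 countries are assumptions for illustration, not the code behind the original image.

```python
import random
import statistics


def simulate_country_means(n_countries, n_athletes, rng):
    """Average die roll per country when each athlete rolls one fair six-sided die."""
    return [
        statistics.mean(rng.randint(1, 6) for _ in range(n_athletes))
        for _ in range(n_countries)
    ]


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
small = simulate_country_means(100, 50, rng)    # 100 countries, 50 athletes each
large = simulate_country_means(100, 200, rng)   # 100 countries, 200 athletes each

spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

# Rank all 200 countries by average roll, then count the small ones at both extremes.
ranked = sorted([(m, "small") for m in small] + [(m, "large") for m in large])
bottom_small = sum(1 for _, size in ranked[:10] if size == "small")
top_small = sum(1 for _, size in ranked[-10:] if size == "small")

print(f"std of country averages: small={spread_small:.3f}, large={spread_large:.3f}")
print(f"small countries in top 10: {top_small}, in bottom 10: {bottom_small}")
```

The standard deviation of a country's average roll scales as 1/sqrt(number of athletes), so the 50-athlete countries should show about twice the spread of the 200-athlete countries and therefore tend to occupy both ends of the ranking.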


I imagine a graph of number of medals versus population might show some interesting insights. (There are several on the internet, but they often cover only a single year and are somewhat noisy; an average over several editions might give a good view of the relationship between medals and population.)

-1

You are trying to estimate any individual's chance of winning a medal, knowing that the "data" we have is just the number of medals by country. It's a great question, as a fair solution would be closer to the spirit of the Olympics.

Basically, this is a statistical problem that your method approximates well: the average number (frequency) of medals of each color relative to the population. But how reliable is this method? It is close to the problem of estimating a binomial proportion from different numbers of trials, which has applications such as comparing the quality of resellers on Amazon based on different numbers of feedback ratings (see this thorough explanation).

In this particular case, the population is always large enough to approximate the beta distribution with a normal distribution, so it is certainly possible to compare the significance of the estimate for each country.
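
One way to sketch this comparison is to treat a country's medal count k out of a population n as binomial, take a Beta(k + 1, n - k + 1) posterior under a uniform prior, and approximate it with a normal distribution. The uniform prior, the function name, the 95% level, and the medal count in the usage line are all assumptions made for illustration, not data from the question. The sketch also assumes that medals are won independently.

```python
import math


def medal_rate_interval(medals, population, z=1.96):
    """Normal approximation to a Beta(medals + 1, population - medals + 1) posterior.

    Returns (mean, lower, upper) for the per-person medal rate.
    The uniform prior and z = 1.96 (about 95%) are illustrative choices.
    """
    a = medals + 1
    b = population - medals + 1
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    half_width = z * math.sqrt(variance)
    return mean, max(0.0, mean - half_width), mean + half_width


# Hypothetical medal count paired with the Hungarian population quoted in the comments.
mean, lower, upper = medal_rate_interval(medals=8, population=9_897_541)
print(f"rate per million: {mean * 1e6:.2f} ({lower * 1e6:.2f} to {upper * 1e6:.2f})")
```

For rates this small, the relative width of the interval shrinks roughly as 1/sqrt(k + 1), so an estimate built on a handful of medals is far less certain than one built on dozens. With very few medals the normal approximation is itself rough.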

meduz
  • 577
  • 3
    The medal counts are not independent (as assumed by your model). The most profound effect is due to the accumulation of multiple medals by individuals. – whuber Aug 21 '16 at 14:51
  • Right, this would mean that it would be necessary to use rank statistics I guess. – meduz Sep 05 '16 at 07:27
  • Why would using rank statistics ameliorate the non-independence of individuals wrt medal counts? – Sycorax Jan 08 '23 at 22:13
  • @Sycorax thanks for your comment. I was suggesting rank statistics in my comment as it makes the comparison independent of the shape of the statistics, something that can be beneficial when comparing countries with different population numbers for instance. – meduz Jan 10 '23 at 09:30