Suppose we have the median incomes of 150 different countries, where each country's median income was calculated from a sample of people. We don't have the individual incomes of the people in these countries; we only have the median income for each country. However, we do have the total population of each country and the sample size used to calculate its median income.
In this situation, suppose I am interested in finding out the variability of the median incomes. The first thing that comes to mind is to use the Bootstrap Method. For example:
- Take a random sample (with replacement) of size=150 from the original data
- Calculate the median income from this random sample
- Repeat the above two steps many times - now you will have an empirical distribution of sample medians
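The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, assuming NumPy is available; the `median_incomes` array is synthetic (drawn from a lognormal distribution purely for demonstration, not real data), and the function name `bootstrap_medians` and the choice of 2000 resamples are my own assumptions.

```python
import numpy as np

def bootstrap_medians(values, n_boot=2000, seed=0):
    """Return n_boot medians, each computed on a with-replacement resample of `values`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = values.size
    # Each row of idx holds n indices drawn with replacement: one bootstrap resample
    idx = rng.integers(0, n, size=(n_boot, n))
    # Median of every resample, giving the empirical distribution of sample medians
    return np.median(values[idx], axis=1)

# Hypothetical data: 150 country-level median incomes (synthetic, for illustration only)
data_rng = np.random.default_rng(42)
median_incomes = data_rng.lognormal(mean=9.5, sigma=1.0, size=150)

boot = bootstrap_medians(median_incomes)
print(boot.shape)         # number of bootstrap medians
print(boot.std(ddof=1))   # bootstrap estimate of the standard error of the median
```

The standard deviation of `boot` is one common way to express the variability of the median across resamples.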
My Question: I am confused about what to do from this point. From here, you can take either the mean of all these medians or the median of all these medians. You can also use percentiles of the bootstrap distribution to place bounds on your estimate (e.g. a 95% confidence interval on the median of medians, or a 95% confidence interval on the mean of medians). Are both of these approaches mathematically valid?
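To make the two options concrete, here is a small sketch that computes both candidate summaries and a percentile interval from the same bootstrap distribution. It assumes NumPy and uses synthetic data; the function name `summarize_bootstrap`, the 2000 resamples, and the 95% level are illustrative assumptions, and the percentile interval is only one of several bootstrap interval constructions.

```python
import numpy as np

def summarize_bootstrap(values, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap the median of `values` and summarize the resulting distribution."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = values.size
    # n_boot resamples of size n, drawn with replacement
    idx = rng.integers(0, n, size=(n_boot, n))
    boot = np.median(values[idx], axis=1)
    # Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the bootstrap medians
    lower, upper = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return {
        "mean_of_medians": float(boot.mean()),
        "median_of_medians": float(np.median(boot)),
        "ci_lower": float(lower),
        "ci_upper": float(upper),
    }

# Synthetic stand-in for the 150 country-level median incomes (not real data)
data_rng = np.random.default_rng(42)
median_incomes = data_rng.lognormal(mean=9.5, sigma=1.0, size=150)

summary = summarize_bootstrap(median_incomes)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this sketch the percentile interval is read directly off the bootstrap distribution, so the bounds are the same whichever of the two centres is reported alongside them.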
As I understand it, the Bootstrap Method relies on the Law of Large Numbers, which shows that the average result of a finite number of experiments "converges in probability" to the average of the actual phenomenon that these experiments are studying. The Law of Large Numbers does not make any similar claim about the median.
So in this case, it appears that the above Bootstrap procedure I described will be calculating and placing bounds on the "Average Median" - and not on the Median of the Medians. I do not think it would be correct to take the median of all bootstrapped medians and then calculate the Confidence Interval around the "Median of the Medians". I think it would make more sense to take the mean of all bootstrapped medians and calculate the Confidence Interval around the "Mean of the Medians".
Is my understanding correct?
Note: In such a situation, we are limited in the conclusions that we can draw from this data. Using the bootstrap method, we cannot make any reliable conclusions about the median income of individual people in these countries, nor can we make conclusions about the individual countries (i.e. the ecological fallacy). Rather, we can only make conclusions about the overall median income across all countries.