Why can't we sum two interquartile ranges?

Question

The interquartile range is defined as the difference between the third and first quartiles:

$IQR = Q_3 - Q_1$

When studying intensive and extensive properties, only values of extensive properties can be summed, but the differences of values of intensive (or extensive) properties can usually be summed together. For example, if the outside air temperature rose by 2°C from day 1 to day 2, and by 3°C from day 2 to day 3, it rose by 5°C from day 1 to day 3.
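A sketch of why the temperature differences add, using hypothetical symbols $T_1, T_2, T_3$ for the temperatures on the three days (these names are an assumption for illustration only): the intermediate value cancels, so the sum telescopes.

$(T_2 - T_1) + (T_3 - T_2) = T_3 - T_1, \qquad 2 + 3 = 5$

The two differences share the endpoint $T_2$, which is what makes the sum collapse into a single difference.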

For quartiles (or quantiles in general), this is not the case: two interquartile ranges cannot be summed together. Obviously, quantiles are summary statistics on observed variables, not directly-observed variables. Nevertheless, what kind of algebraic proof can we give to justify the fact that two interquartile ranges cannot be summed?

My intuition is that the growth of a variable across space or time is structurally different from the deltas between quantiles across a population, but I would really like to find a more robust justification for this intuition, backed by standard equations.

And to be clear, I make a distinction between expressions that are arithmetically valid but might make no sense from a statistical analysis standpoint, and expressions that not only make no statistical sense, but are first and foremost arithmetically invalid.

For example, on one hand, computing the mean of two IQRs is arithmetically valid, but whether it makes sense or not from a statistical analysis standpoint is probably open to debate. On the other hand, computing the sum of two IQRs not only makes no statistical sense, but it's also probably arithmetically invalid (this is the part that I would like to firmly establish).

How can one know that something probably makes no sense? Simply by the fact that nobody ever came up with a valid example for which it would make sense.

Now, it is obvious that the IQR is not additive, in the sense that in most cases:

$IQR(X + Y) \neq IQR(X) + IQR(Y)$
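A quick numerical illustration of this inequality, offered as a sketch under assumptions that are not part of the question: $X$ and $Y$ are taken to be independent normal variables with arbitrarily chosen parameters, and the helper name `iqr` is hypothetical. It relies on Python's standard `statistics.NormalDist`, which supports adding two independent normal distributions.

```python
import math
from statistics import NormalDist


def iqr(dist: NormalDist) -> float:
    """Interquartile range Q3 - Q1 of a normal distribution."""
    return dist.inv_cdf(0.75) - dist.inv_cdf(0.25)


# Assumed example parameters; any positive sigmas would do.
x = NormalDist(mu=0.0, sigma=1.0)
y = NormalDist(mu=5.0, sigma=2.0)

# NormalDist addition models the sum of two *independent* normals.
z = x + y

sum_of_iqrs = iqr(x) + iqr(y)
iqr_of_sum = iqr(z)

print(f"IQR(X) + IQR(Y) = {sum_of_iqrs:.4f}")
print(f"IQR(X + Y)      = {iqr_of_sum:.4f}")
```

In this particular normal case the IQR is proportional to the standard deviation, so $IQR(X + Y) = \sqrt{IQR(X)^2 + IQR(Y)^2}$, which is strictly smaller than $IQR(X) + IQR(Y)$.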

But this does not establish that two IQRs should never be summed.

Summability does not imply additivity (while additivity implies summability of course).

Or does it? In other words, can we find a summary statistic $S$ for which:

$S(X + Y) \neq S(X) + S(Y)$

while $S(X) + S(Y)$ is nonetheless arithmetically correct and makes statistical sense?

If the answer to this last question is negative, then we have a proper empirical definition of summability for summary statistics. If the answer is positive, then we do not, and the summability (not the additivity) of each summary statistic will need to be studied individually.

You can absolutely sum two quantiles (or IQR), but I don't know why you would do that, what that would represent. It's no different than summing two means to find a... sum of means? For example taking the mean of many medians will tell you the average median, which can be done and is valid. Your example with temperatures doesn't really have much to do with quantiles or your question. — user2974951, Jan 21 '20 at 11:20
@user2974951: Isn't summing two IQRs more like summing two standard deviations? In my understanding, Ismael wants to know why the sum of the IQRs of two random variables is not the IQR of their sum. — Igor F., Jan 21 '20 at 11:24
@IgorF. Yes that is a nice example, summing two SD is a weird thing to do. But it can be done, just like taking the mean of two IQR to get an average IQR. Whether that makes sense is a different question. — user2974951, Jan 21 '20 at 11:29
Summing two quantiles of an extensive variable is perfectly valid. Summing two quantiles of an intensive variable is not. Summing two IQRs makes no sense whether the variable is intensive or extensive. I know it makes no sense, but why? How can I prove that it makes no sense, at an algebraic level. I can show that it makes no sense with empirical evidence, but I would like something more generic. — Ismael Ghalimi, Jan 21 '20 at 11:33
Summing two means makes no sense because of the divisions involved in their computation. The algebraic proof for that is trivial. One could also develop a geometric proof for the same. But what is the equivalent proof for the summation of two IQRs? — Ismael Ghalimi, Jan 21 '20 at 11:36
The example with temperatures was there to show that differences of intensive variables are summable. And yet differences of quantiles are not. I find that interesting... — Ismael Ghalimi, Jan 21 '20 at 11:38
As far as arithmetic is concerned, the mean of two IQRs is perfectly valid. Whether it makes sense or not is a different question. But summing two IQRs not only makes no sense, but it "feels" invalid from an arithmetic standpoint. This is what I would like to establish. — Ismael Ghalimi, Jan 21 '20 at 11:41

Igor F. · Answer 1 · 2020-07-29T18:53:02.277

5

I'm not sure I understand your question. I'll assume it can be reformulated as:

Why is the sum of the IQRs of two random variables not the IQR of their sum?

as I already wrote in my comment.

In that case, the answer is very simple: Why should it be?

It is easy to find a counterexample. Assume you have two independent standard uniform random variables, $X, Y \sim U(0, 1)$. The IQR of each of them is $0.5$. But the variable $Z = X+Y$ is not uniformly distributed. Its PDF is the convolution of the PDFs of $X$ and $Y$ and has a triangular form on $[0, 2]$, peaking at $z = 1$.

Using elementary geometry, it is easy to show that its IQR is $2 - \sqrt 2 \approx 0.586$, which is different from the sum of the IQRs of $X$ and $Y$ ($0.5 + 0.5 = 1$).
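The geometric argument can be checked numerically. The following is a minimal sketch, assuming $X$ and $Y$ are independent; the function name `triangular_sum_quantile`, the seed, and the sample size are illustrative choices, not part of the answer. It inverts the CDF of $Z$ (which is $z^2/2$ on $[0, 1]$ and $1 - (2 - z)^2/2$ on $[1, 2]$) and compares the result with a simulation.

```python
import math
import random
import statistics


def triangular_sum_quantile(p: float) -> float:
    """Quantile of Z = X + Y for independent X, Y ~ U(0, 1)."""
    if p <= 0.5:
        # Invert F(z) = z^2 / 2 on [0, 1].
        return math.sqrt(2.0 * p)
    # Invert F(z) = 1 - (2 - z)^2 / 2 on [1, 2].
    return 2.0 - math.sqrt(2.0 * (1.0 - p))


# Exact IQR from the closed-form quantile function.
exact_iqr = triangular_sum_quantile(0.75) - triangular_sum_quantile(0.25)

# Monte Carlo estimate with a fixed seed for reproducibility.
rng = random.Random(0)
sample = [rng.random() + rng.random() for _ in range(200_000)]
q1, _median, q3 = statistics.quantiles(sample, n=4)
simulated_iqr = q3 - q1

print(f"exact IQR(Z)     = {exact_iqr:.4f}")
print(f"simulated IQR(Z) = {simulated_iqr:.4f}")
print(f"IQR(X) + IQR(Y)  = {0.5 + 0.5:.4f}")
```

The exact value equals $2 - \sqrt 2$, and the simulated estimate should land close to it, well below the sum $1$ of the individual IQRs.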

edited Jul 29 '20 at 18:53

answered Jan 21 '20 at 12:02

No, I never assumed this. But this raises a very good question: can we find a summary statistic whose values can be summed without the sum of the statistics being equal to the statistic of the sum? If the answer is positive, my question is valid. If the answer is negative, your answer is great, because it provides an answer not only to my question, but to all similar questions. – Ismael Ghalimi Jan 21 '20 at 12:12
I am afraid 'can be summed' leaves too much room for interpretation. I would say that standard deviation is one such example. – Mickybo Yakari Jan 21 '20 at 12:15
@MickyboYakari How do you interpret the sum of two standard deviations? – Ismael Ghalimi Jan 21 '20 at 12:20
As I said, there is too much room for interpretation. I suppose your 'can't be summed' does not mean 'can't be summed', then. – Mickybo Yakari Jan 21 '20 at 12:48
