
The minimizer of the MSE $E[(a-X)^2]$ is $a=E(X)$, and the MSE can be decomposed as $E[(a-X)^2] = E[(a-E(X))^2] + Var(X)$.

I am wondering whether there exists a similar expression for the MAE $E[|a-X|]$ in terms of its own minimizer $med(X)$?

Is there a known standard relationship between $E[|a-X|]$, $E[|a-med(X)|]$, and perhaps something like $E[|med(X)-X|]$? Or between $med[|a-X|]$, $med[|a-med(X)|]$, and $med[|med(X)-X|]$? Or anything else that resembles $E[(a-X)^2] = E[(a-E(X))^2] + Var(X)$?

FZS
  • See here: https://stats.stackexchange.com/questions/7307/mean-and-median-properties (the arguments there are in terms of sample quantities, but essentially the same arguments carry across; there are likely half a dozen other posts on the site). Also see here: https://gregorygundersen.com/blog/2019/10/04/expectation-median-opt/ – Glen_b Dec 28 '22 at 03:16
  • See https://stats.stackexchange.com/questions/251600 for a generalization to arbitrary percentiles. There is no such decomposition, though, because it characterizes a Euclidean metric. – whuber Dec 28 '22 at 15:27
  • @whuber If you consider the decomposition for the median in my answer is "such a decomposition", then for the general quantiles the similar decomposition should hold as well (see my updated answer). Could you clarify the meaning of "it characterizes a Euclidean metric"? – Zhanxiong Dec 28 '22 at 16:03
  • @Zhanxiong It's the Pythagorean Theorem. – whuber Dec 28 '22 at 16:22
  • @whuber OK. My point is: even if a Pythagorean-like decomposition is impossible, a decomposition that reveals the relationship between $\rho_\tau(X - a)$ and $\rho_\tau(X - q_\tau)$ (which essentially utilizes the linearity of expectation) is still possible, right? – Zhanxiong Dec 28 '22 at 16:49
  • @Zhanxiong That doesn't look like a decomposition in the sense of the question, though. It really is giving a (relatively complex) formula for the difference between the two quantities – whuber Dec 28 '22 at 18:23
  • @whuber Well, I personally think that suffices to answer OP's question (if you check his last paragraph) in that it links $E[|X - a|]$ and $E[|X - m|]$. But I respect your viewpoint as well. – Zhanxiong Dec 28 '22 at 18:32

1 Answer


The short answer to your question is: an analogous decomposition does exist and can be used to show that the minimizer of $\Delta(a) := E[|X - a|]$ is the median (see remarks below for the latter claim).

Denote the median of $X$ by $m$. Using the definition of expectation, we have (note that the essence of the proof, which is shared by the $L^2$ decomposition, is "subtract-then-add"; the integrals are oriented, so $\int_a^m = -\int_m^a$ when $a > m$):
\begin{align}
E[|X - a|] &= \int_{-\infty}^a (a - x)dF(x) + \int_a^{\infty}(x - a)dF(x) \\
&= \int_{-\infty}^m(a - x)dF(x) + \int_m^a(a - x)dF(x) + \int_a^m(x - a)dF(x) + \int_m^\infty(x - a)dF(x) \\
&= \int_{-\infty}^m(m - x)dF(x) + \int_{-\infty}^m(a - m)dF(x) \\
&\quad + 2\int_a^m(x - a)dF(x) \\
&\quad + \int_m^\infty(x - m)dF(x) + \int_m^\infty(m - a)dF(x) \\
&= E[|X - m|] + (m - a)(P[X > m] - P[X \leq m]) + 2\int_a^m(x - a)dF(x) \\
&= E[|X - m|] + 2\int_a^m(x - a)dF(x).
\end{align}
In the last step, we used the condition $P[X \leq m] = P[X > m] = 0.5$ (which holds, for instance, when $F$ is continuous at $m$). Therefore, the decomposition is
\begin{align} E[|X - a|] = E[|X - m|] + 2\int_a^m(x - a)dF(x). \tag{1} \end{align}

Note that the second term on the right-hand side of $(1)$, which resembles the $(E[X] - a)^2$ term in the $L^2$ decomposition, is always non-negative. Under the assumption that $F(x)$ is strictly increasing (so that $m$ is uniquely determined), it is immediate that $\Delta(a)$ is minimized at $a = m$ (for otherwise the integral is strictly positive). When the theoretical median is not unique, the minimizer of $\Delta(a)$ is not unique either.
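For readers who want to see $(1)$ hold numerically, here is a minimal sketch. Everything specific in it is an assumption made for illustration and is not part of the derivation above: the distribution $X \sim \text{Exponential}(1)$ (median $\ln 2$), the truncation point `UPPER`, and the helper names `integrate`, `mean_abs_dev` and `excess`. The midpoint rule uses a signed step, so it reproduces the oriented integral in $(1)$ when $a > m$.

```python
import math


def integrate(f, lo, hi, n=20_000):
    """Composite midpoint rule on [lo, hi].

    The step is signed, so swapping the limits flips the sign,
    which matches the oriented integral in (1).
    """
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def pdf(x):
    # Assumed example: density of Exponential(1), valid for x >= 0.
    return math.exp(-x)


MEDIAN = math.log(2.0)  # median of Exponential(1)
UPPER = 50.0            # truncation point; the mass beyond it is about e^-50


def mean_abs_dev(a):
    """E|X - a| for a >= 0, split at the kink x = a."""
    below = integrate(lambda x: (a - x) * pdf(x), 0.0, a)
    above = integrate(lambda x: (x - a) * pdf(x), a, UPPER)
    return below + above


def excess(a):
    """Second term of (1): 2 * integral from a to m of (x - a) dF(x)."""
    return 2.0 * integrate(lambda x: (x - a) * pdf(x), a, MEDIAN)


if __name__ == "__main__":
    for a in (0.2, MEDIAN, 1.5):
        lhs = mean_abs_dev(a)
        rhs = mean_abs_dev(MEDIAN) + excess(a)
        print(f"a={a:.4f}  E|X-a|={lhs:.6f}  E|X-m|+excess={rhs:.6f}")
```

For this particular distribution the left-hand side also has the closed form $a - 1 + 2e^{-a}$ for $a \geq 0$, which gives an independent check on the quadrature. The two printed columns should agree up to the integration error, and `excess(a)` should be non-negative on both sides of the median.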


For a general quantile position $\tau \in (0, 1)$ and the check function $\rho_\tau(u) := u(\tau - I_{(-\infty, 0)}(u))$, a decomposition similar to $(1)$ also holds, with the median $m$ replaced by the $\tau$-quantile $q_\tau$ (the proof is almost identical to the one above):

\begin{align} E[\rho_\tau(X - a)] = E[\rho_\tau(X - q_\tau)] + \int_a^{q_\tau}(x - a)dF(x). \tag{2} \end{align}

Note that $(1)$ and $(2)$ differ by a scaling factor of $2$ because $|u| = 2\rho_{0.5}(u)$. Using $(2)$, it is also easy to conclude that $q_\tau = \operatorname{argmin}_{a \in \mathbb{R}^1}E[\rho_\tau(X - a)]$ under the same monotonicity condition on $F$.
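As a quick sanity check of $(2)$, here is a worked case under an assumed distribution that is not part of the argument above: $X \sim \text{Uniform}(0, 1)$, so that $dF(x) = dx$ on $[0, 1]$ and $q_\tau = \tau$, with $a \in [0, 1]$. Computing each piece directly:

\begin{align}
E[\rho_\tau(X - a)] &= (1 - \tau)\int_0^a (a - x)dx + \tau\int_a^1 (x - a)dx = \frac{(1 - \tau)a^2 + \tau(1 - a)^2}{2} = \frac{a^2 - 2\tau a + \tau}{2}, \\
E[\rho_\tau(X - q_\tau)] &= \frac{(1 - \tau)\tau^2 + \tau(1 - \tau)^2}{2} = \frac{\tau(1 - \tau)}{2}, \\
\int_a^{q_\tau}(x - a)dx &= \frac{(\tau - a)^2}{2}.
\end{align}

Adding the last two lines gives $\frac{\tau - \tau^2 + \tau^2 - 2\tau a + a^2}{2} = \frac{a^2 - 2\tau a + \tau}{2}$, which matches the first line, as $(2)$ predicts. In this particular case the excess term happens to be exactly quadratic in $a - q_\tau$, but that is a feature of the uniform density rather than of $(2)$ in general.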

Zhanxiong