I am familiarizing myself with quantile regression. I understand it is first and foremost an estimation method, like OLS. But I wonder about the probability distribution models for which quantile regression makes sense.

To use an analogy, the OLS estimator $$ \hat\beta^{OLS}:=(X^\top X)^{-1}X^\top y $$ is a minimum variance linear unbiased estimator when the true probability distribution model is \begin{align} y &= X\beta+\varepsilon, \\ \varepsilon &\sim d(0,\sigma^2) \end{align} or in other words, $y\mid X\sim d(X\beta,\sigma^2)$, where $d$ is some (unspecified) probability distribution.
Moreover, $\hat\beta^{OLS}$ is a maximum likelihood estimator when the probability model is \begin{align} y &= X\beta+\varepsilon, \\ \varepsilon &\sim N(0,\sigma^2) \end{align} or in other words, $y\mid X\sim N(X\beta,\sigma^2)$. So in some sense, the OLS estimator "naturally implies" the above probability distribution models.

Question 1: What could we say about the quantile estimator $$ \hat\beta^{QR}_{\tau}:= \arg\min_{\beta}\sum_{i=1}^{n}\rho_{\tau}(y_i-X_i\beta) $$ where $\rho_{\tau}$ is the quantile loss function (and the superscript $^{QR}$ stands for "quantile regression")? Does it "naturally imply" a probability distribution model for $y\mid X$?
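
For concreteness, here is a minimal sketch of the quantile loss, $\rho_{\tau}(u)=u\,(\tau-\mathbb{1}\{u<0\})$, and of the estimator in the simplest special case where $X_i=1$ (intercept only), so that $\hat\beta^{QR}_{\tau}$ reduces to a sample $\tau$-quantile. The function names `pinball_loss`, `total_loss` and `fit_constant_quantile` are hypothetical, and the simulated Gaussian data are an assumption made purely for illustration.

```python
import random


def pinball_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(q, ys, tau):
    """Sum of quantile losses of the residuals y_i - q."""
    return sum(pinball_loss(y - q, tau) for y in ys)


def fit_constant_quantile(ys, tau):
    """Intercept-only quantile 'regression' (assumes X_i = 1 for all i).

    The objective is convex and piecewise linear in q with kinks at the
    data points, so a minimizer is attained at one of the observations.
    """
    return min(sorted(ys), key=lambda q: total_loss(q, ys, tau))


random.seed(0)
ys = [random.gauss(0.0, 1.0) for _ in range(201)]  # illustrative data only
tau = 0.25

q_hat = fit_constant_quantile(ys, tau)
share_below = sum(y <= q_hat for y in ys) / len(ys)
print(q_hat, share_below)  # share_below should be close to tau
```

Nothing in this computation uses a distributional assumption on $y$: the minimizer is an order statistic of the sample, which is why the question of an "implied" model is not obvious.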

Question 2: If we assumed particular true quantile regression coefficients $$ \beta^{QR}_{\tau}:= \arg\min_{\beta}\mathbb{E}\left(\rho_{\tau}(y-X\beta)\right) $$ (where $\beta^{QR}_{\tau}$ may differ across different values of $\tau$) for a continuum of quantile levels between 0 and 1, we would get an implicit conditional probability model for $y\mid X$.
But could the model be expressed explicitly in a nice way? If so, I would welcome a simple example.

P.S. In principle, the question covers more than just linear quantile regressions, but for practical purposes an answer addressing just the linear case would suffice.

Richard Hardy
  • In books I've read, it's been treated as a nonparametric regression, but apparently the quantile estimator is the maximum likelihood estimator of an asymmetric double exponential distribution. See here: http://web1.sph.emory.edu/users/hwu30/teaching/statcomp/Notes/Lecture9_lp2.pdf – machazthegamer Dec 28 '18 at 19:50
  • @machazthegamer, interesting. For a single quantile level $\tau$, your comment (slightly expanded) could serve as an answer. For multiple quantiles $\tau_i$ (a finite number of them) with different corresponding true $\beta^{QR}_{\tau_i}$, the model is probably underdetermined. – Richard Hardy Dec 28 '18 at 21:55
  • @machazthegamer, I would like to accept your answer if you post your extended comment as one. – Richard Hardy Jan 15 '19 at 16:49
  • Could someone expound on what was said here? – Rylan Schaeffer Jun 23 '20 at 03:18
  • @RylanSchaeffer, where exactly? – Richard Hardy Jun 23 '20 at 05:44
  • What @machazthegamer said – Rylan Schaeffer Jun 23 '20 at 15:08
  • @RylanSchaeffer, slide 29 in his link contains the mention of the asymmetric double exponential distribution and its density function. – Richard Hardy Jun 23 '20 at 15:20
  • This article on deep quantile regression (giving you the desired nonlinearity) makes me think the answer is an asymmetric Laplace distribution. – Dave Jan 12 '22 at 21:34
  • @Dave, thank you! For a single quantile the answer indeed seems to be asymmetric Laplace. It is also given in the first comment to this post, just using a different term. My remaining question is about a setting when we have multiple quantiles and a linear model for each of them. I have tried to express this in my penultimate paragraph. (By the way, I found the linked article to be of low quality in terms of statistical rigor and use of terminology. But I guess it is fine for what it is.) – Richard Hardy Jan 13 '22 at 09:21

1 Answer

Question 1: In the case of a single quantile, the quantile estimator is the maximum likelihood estimator of the location parameter of an asymmetric double exponential (a.k.a. asymmetric Laplace) distribution. (The original post showed a plot of such a density here; the picture was borrowed from Abeywardana "Deep Quantile Regression" (2018).)
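
To spell out the connection, here is a short sketch. It assumes the common parameterization of the asymmetric Laplace density with location $\mu$, scale $\sigma>0$ and skewness $\tau\in(0,1)$, and writes the quantile loss as $\rho_{\tau}(u)=u\,(\tau-\mathbb{1}\{u<0\})$: $$ f(y\mid\mu,\sigma,\tau)=\frac{\tau(1-\tau)}{\sigma}\exp\left(-\rho_{\tau}\left(\frac{y-\mu}{\sigma}\right)\right). $$ Setting $\mu_i=X_i\beta$ and treating the observations as independent, the log-likelihood is \begin{align} \ell(\beta,\sigma) &= n\log\frac{\tau(1-\tau)}{\sigma}-\frac{1}{\sigma}\sum_{i=1}^{n}\rho_{\tau}(y_i-X_i\beta), \end{align} where the second term uses $\rho_{\tau}(u/\sigma)=\rho_{\tau}(u)/\sigma$ for $\sigma>0$. For any fixed $\sigma$, maximizing $\ell$ over $\beta$ is the same as minimizing $\sum_{i}\rho_{\tau}(y_i-X_i\beta)$, which is exactly the definition of $\hat\beta^{QR}_{\tau}$. Under this density $P(y\le\mu)=\tau$, so the location parameter $\mu$ is the $\tau$-quantile.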

Thanks to @machazthegamer and @Dave for helpful links in the comments.

Question 2: In the case of multiple quantiles, I doubt there can be a simple expression unless one imposes, for tractability, strong restrictions on how the slopes at the different quantiles relate to each other. (Answers with concrete examples of such restrictions and the resulting tractable distributions are still welcome.)
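
One restriction of this kind, offered as a sketch under stated assumptions rather than a general answer, is the linear location-scale model. The symbols $\gamma$ and $F$ are introduced here for illustration, with $F$ assumed continuous and strictly increasing: \begin{align} y &= X\beta+(X\gamma)\,\varepsilon, \\ \varepsilon &\sim F \text{ independently of } X, \quad X\gamma>0. \end{align} Because $X\gamma>0$, the conditional quantiles are $$ Q_{\tau}(y\mid X)=X\beta+(X\gamma)\,F^{-1}(\tau)=X\left(\beta+\gamma F^{-1}(\tau)\right), $$ so every quantile is linear in $X$ with $\beta^{QR}_{\tau}=\beta+\gamma F^{-1}(\tau)$, and the conditional distribution is explicit: $P(y\le t\mid X)=F\left((t-X\beta)/(X\gamma)\right)$. For example, with $F$ standard normal this gives $y\mid X\sim N\left(X\beta,(X\gamma)^2\right)$. If $X$ contains an intercept and $\gamma$ is zero except in the intercept position, only the intercept varies with $\tau$ and the model reduces to a pure location shift with parallel quantile lines.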

Richard Hardy