Lasso-like methods have become common in applied statistics, but the Dantzig selector remains little used despite its strong theoretical properties (minimax optimality). Why hasn't it become more popular?

  • I suspect this question can't be answered as posed. Perhaps you could refine it as laid out in the FAQ http://stats.stackexchange.com/faq ? For example, ask about advantages/disadvantages, applicability, or implementation. – Momo May 16 '13 at 21:09
  • I suspect because it is fairly new and, as far as I can tell from five minutes of googling, no standard implementation exists that applied statisticians could use on their own data. It seems like a cool method. If the authors wrote packages for R and Python implementing their method, I'd love to try it out. – Zach May 16 '13 at 21:17
  • @Zach, I think the flare package for R implements the Dantzig selector. I have no experience with it though. – COOLSerdash May 16 '13 at 21:32
  • You might find this paper of interest. Outside that, I don't have a good answer for your question other than that the lasso has been around a decade longer. – Glen_b May 16 '13 at 23:57
  • And this paper too. In short, it has a lower MSE than the lasso but a less natural loss function (the Chebyshev norm). It was designed to behave better when the regressors are orthogonal, but that is, by definition, never the situation of interest in multivariate regression. Frankly, I'm a bit surprised it has been studied so much in theoretical/computational statistics. – user603 May 17 '13 at 01:16
  • @user603, good point, but why not add an $L_2$ norm component, as in the elastic net? I'm really curious to test it out more extensively once I get some time. – Piotr Sokol May 17 '13 at 08:44
  • @PiotrSokol: I think the problems come from the $L_{\infty}$ norm in the fit term (see the formulation below). Adding an $L_2$ penalty on the coefficients would not address that. – user603 May 17 '13 at 08:46
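
For reference (an editor's addition, following the standard formulation of Candès and Tao, 2007), the Dantzig selector solves

$$\hat{\beta} = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p} \|\beta\|_1 \quad \text{subject to} \quad \|X^\top (y - X\beta)\|_\infty \le \lambda,$$

so the $L_\infty$ norm in the constraint is applied to the vector $X^\top(y - X\beta)$ of correlations between the predictors and the residual.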

1 Answer

The $\ell_\infty$ loss term is very sensitive to outliers.

Most (all?) of the theory for the Dantzig selector is developed under the assumption of normal (Gaussian) errors. Under that error distribution there is little difference between $\ell_2$ loss and $\ell_\infty$ loss. With real data, however, we would like to be less sensitive to outliers.
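
(Editor's sketch, not part of the original answer: one way to compute the Dantzig selector on a small simulated problem, assuming the Python package cvxpy is available. The data and the value of lam are purely illustrative.)

```python
import numpy as np
import cvxpy as cp

# Illustrative data: n observations, p predictors, sparse true coefficients.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Dantzig selector: minimize ||beta||_1 subject to ||X'(y - X beta)||_inf <= lam.
lam = 12.0  # illustrative only; in practice lambda is tuned, e.g. by cross-validation
beta = cp.Variable(p)
constraint = cp.norm_inf(X.T @ (y - X @ beta)) <= lam
problem = cp.Problem(cp.Minimize(cp.norm1(beta)), [constraint])
problem.solve()

print(np.round(beta.value, 3))  # most coefficients should be (near) zero
```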

karl
  • I don't think this is true. The $\ell_\infty$ loss in the Dantzig selector is not applied to individual errors but to the vector of correlations between the residual and each of the explanatory variables. I don't see how this would make it more sensitive to outliers than an $\ell_2$ error term on the residuals. In fact, note that in the extreme case where you force the $\ell_\infty$ bound in the Dantzig selector to be zero, you obtain exactly least-squares linear regression (a quick check of this is sketched below). – david Nov 03 '19 at 17:14
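
A small numerical check of the last point (an editor's sketch, again assuming cvxpy, with a full-column-rank design so the normal equations have a unique solution):

```python
import numpy as np
import cvxpy as cp

# Illustrative full-column-rank design.
rng = np.random.default_rng(1)
n, p = 50, 5
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)

# Dantzig selector with the l_inf bound forced to zero:
# the constraint collapses to the normal equations X'(y - X beta) = 0.
beta = cp.Variable(p)
problem = cp.Problem(cp.Minimize(cp.norm1(beta)),
                     [X.T @ (y - X @ beta) == 0])
problem.solve()

# Compare with ordinary least squares.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta.value, beta_ols, atol=1e-5))  # expected: True, up to solver tolerance
```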