
Say we are fitting a penalized model, such as a linear regression with lasso regularization. We expect to obtain a model with the most significant covariables.

The method starts with many covariables and ends up with just a few, as we usually do with backward stepwise methods, but the result is obtained in a single step.

Can this also be considered a form of multiple testing or multiple comparisons, so that we should apply some kind of correction to the p-values, such as Bonferroni?

skan
  • In your LASSO output, there should not be any p-values. If there are, for example because they were obtained by bootstrapping, they likely mean something different from what you may expect. Certainly, we do not expect LASSO to return "a model with the most significant covariables", with "significant" referring to the size of p-values of regular regression coefficients. You may find more information here: https://stats.stackexchange.com/questions/410173/lasso-regression-p-values-and-coefficients – LuckyPal Jun 26 '23 at 11:45
  • @LuckyPal One of the answers at that link suggests using the model (the selected covariates) generated by the regularized method to fit a new model without regularization and get the p-values from it. Or we can also use a bootstrap, as you suggested. Anyway, my question still stands: do we need to apply any kind of "multi-testing correction"? – skan Jun 26 '23 at 18:16
  • On the other hand, http://archived.stat.ufl.edu/casella/Papers/BL-Final.pdf suggests that there might not be a consensus on a statistically valid method of calculating standard errors for the lasso predictions. Tibshirani seems to agree (slide 43) that standard errors are still an unresolved issue: http://statweb.stanford.edu/~tibs/ftp/lassotalk.pdf. – skan Jun 26 '23 at 18:20
  • Exactly, p-values for LASSO are controversial and many people recommend not using them at all. To cite from the question that I linked to: "What's the actual null hypothesis against which you are testing the coefficient values?". A satisfactory answer to this question is yet to be found. Only after that may it be meaningful to talk about a suitable correction for multiple testing. If you have a specific family-wise error rate that you would like to control, then go ahead, but it might be very hard to come up with something meaningful. – LuckyPal Jun 27 '23 at 07:44
  • I've found this paper, where a modified FDR correction is proposed: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3769997/ – skan Jul 09 '23 at 16:06
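
For concreteness, here is a minimal sketch of what "applying a correction" to a set of p-values would look like mechanically. It shows the Bonferroni adjustment mentioned in the question and the plain Benjamini-Hochberg FDR adjustment (not the modified FDR procedure from the paper linked above). The p-values in `raw_p` are made up for illustration, the function names are my own, and the number of tests `m` is simply taken to be the number of p-values passed in, which is itself an assumption: whether `m` should count only the selected covariates or all candidates is part of the open question. None of this settles whether post-selection p-values are valid in the first place, as the comments point out.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)  # assumption: m is the number of p-values supplied
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (standard FDR control)."""
    m = len(pvalues)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping the
    # adjusted values monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five covariates kept by the lasso
# (illustrative numbers only, not from any real fit).
raw_p = [0.001, 0.008, 0.02, 0.03, 0.2]
alpha = 0.05

bonf = bonferroni(raw_p)
bh = benjamini_hochberg(raw_p)

# Which covariates would still be declared significant at level alpha.
keep_bonf = [p <= alpha for p in bonf]
keep_bh = [p <= alpha for p in bh]

print("Bonferroni:", bonf, keep_bonf)
print("Benjamini-Hochberg:", bh, keep_bh)
```

With these made-up numbers, Bonferroni retains fewer covariates than Benjamini-Hochberg, which is the usual trade-off between controlling the family-wise error rate and controlling the false discovery rate.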

0 Answers