I'm not quite sure when to use which type of lasso in Stata 15. I understand that the inferential lasso involves predictive models, but I have no idea what the plain "Lasso" option does. I even read the manual and I still have no idea.
1 Answer
Here's a high-level overview. Lasso can be used in three ways:
1. Prediction
2. Model selection
3. Inference
(1) entails predicting the value of an outcome conditional on a large set of potential regressors, both in and out of sample. (2) entails selecting a set of variables that predicts the outcome well, but not necessarily selecting variables in the "true" model or placing any scientific interpretation on the coefficients. It just means selecting variables that correlate well with the outcome in one dataset and testing whether those same variables predict the outcome well in other datasets.
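As a rough sketch of (1) and (2), assuming Stata 16 or later (the release that introduced the `lasso` commands) and simulated data with made-up variable names (`y`, `x1`-`x50`, `sample`, `yhat`), something like this fits a lasso on one half of the data and checks how well it predicts in the other half:

```stata
* Hypothetical simulated data: y depends on only 2 of 50 candidate regressors
clear
set seed 12345
set obs 500
forvalues j = 1/50 {
    generate x`j' = rnormal()
}
generate y = 1 + x1 - 0.5*x2 + rnormal()

* Split into a training half (sample == 1) and a holdout half (sample == 2)
splitsample, generate(sample) nsplit(2) rseed(12345)

* Fit the lasso on the training half; lambda is chosen by cross-validation
lasso linear y x1-x50 if sample == 1, selection(cv) rseed(12345)

lassocoef                  // (2) which variables were selected
predict yhat               // (1) predictions for all observations
lassogof, over(sample)     // compare fit in the training and holdout halves
```

The selected variables and coefficients here are tools for prediction only; nothing in this output should be read as a standard error or a causal effect.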
(3) Inference is concerned with estimating the effects of variables in the true model, along with their standard errors, confidence intervals, p-values, and the like. The standard lasso procedure used for (1) and (2) needs to be modified substantially to allow for (3), and the modification only works under the strong assumption that only a few variables truly matter (aka sparsity). There are three ways of doing that in Stata: double selection (`dsregress`), partialing out (`poregress`), and cross-fit partialing out (`xporegress`). This is definitely not just a matter of fitting a new model with the lasso-selected variables as controls.
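For the inferential case, a minimal sketch might look like the following. It again assumes Stata 16 or later and uses simulated data with hypothetical names: `d` is the variable whose effect you care about and `x1`-`x50` are the candidate controls, of which only a few matter.

```stata
* Hypothetical simulated data: sparse truth, d is the variable of interest
clear
set seed 12345
set obs 500
forvalues j = 1/50 {
    generate x`j' = rnormal()
}
generate d = 0.5*x1 + rnormal()
generate y = 1 + 0.8*d + x1 - 0.5*x2 + rnormal()

* Each command reports a coefficient, standard error, and CI for d only;
* the lassos choose which of x1-x50 to use as controls.
dsregress  y d, controls(x1-x50)                // double selection
poregress  y d, controls(x1-x50)                // partialing out
xporegress y d, controls(x1-x50) rseed(12345)   // cross-fit partialing out
```

You would normally pick one of the three rather than combine them. Cross-fit partialing out splits the sample at random, which is why it takes a seed and usually runs the longest.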
Thank you! That clears it up nicely with the 3 tab overview, I was wondering if these were exclusive methods or if they had to be combined somehow which you answered nicely! – Paze Jan 15 '20 at 07:30