If I have a sparse dataset with very few points, which regularization scheme should I use?
That is, I have a dataset with only 10 points. Are there regularizers that would help me in this situation?
If you have 10 points, I would say the only regularization "scheme" you'd even want to remotely consider is the fully Bayesian one. Look up Bayesian regression, Bayesian testing, or whatever you're trying to do. There are packages that will help you, but you'll want to be involved in choosing the priors. In many large-data situations you're not as worried about them (typical uses of L2 regularization effectively impose a prior mean of 0), but with your sample size they are going to matter. Set them carefully, by thinking about what you know and don't know. You'll still get "regularization," but it doesn't have to be the kind that just makes your coefficients smaller. It could instead "shrink" them toward a non-zero prior mean, for instance.
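As a minimal sketch of what "shrinking toward a non-zero prior mean" can look like, here is conjugate Bayesian linear regression with a Gaussian prior, assuming a known noise variance. The dataset, prior mean, and prior covariance below are all hypothetical, chosen just to illustrate the mechanics:

```python
import numpy as np

# Hypothetical tiny dataset: 10 points, one predictor.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=10)
y = 2.0 + 1.5 * x + rng.normal(0, 0.3, size=10)

X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
sigma2 = 0.3 ** 2                          # assumed known noise variance

# Prior: the mean need not be zero -- encode what you actually believe.
mu0 = np.array([0.0, 1.0])                 # e.g. prior belief that the slope is near 1
Sigma0 = np.diag([4.0, 1.0])               # prior variances: how sure you are

# Conjugate posterior for a Gaussian likelihood with a Gaussian prior:
#   Sigma_n = (Sigma0^-1 + X^T X / sigma^2)^-1
#   mu_n    = Sigma_n (Sigma0^-1 mu0 + X^T y / sigma^2)
Sigma_n = np.linalg.inv(np.linalg.inv(Sigma0) + X.T @ X / sigma2)
mu_n = Sigma_n @ (np.linalg.inv(Sigma0) @ mu0 + X.T @ y / sigma2)

print("posterior mean:", mu_n)  # pulled toward mu0, not toward zero
```

With an uninformative prior (large entries in `Sigma0`) this collapses to ordinary least squares; with `mu0 = 0` and a spherical prior it reproduces standard ridge/L2 regularization, which is why the prior mean is exactly the knob that distinguishes "shrink toward zero" from "shrink toward what I already believe."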
Which regularization scheme to use does not depend on how much data you have. For example, L1 and L2 regularization can be used on both big and small datasets.

On the other hand, how much you want to regularize does depend on your data size and the complexity of the model. Suppose we use L2 regularization for a polynomial fit on $10$ data points. If you want to use a $5$th-order model, it is better to set $\lambda$ larger than if you want to fit a $3$rd-order model.
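A small sketch of that point, using closed-form ridge (L2) regression on a polynomial basis; the dataset and the specific $\lambda$ values are hypothetical, chosen only to illustrate that a larger $\lambda$ shrinks the coefficients more:

```python
import numpy as np

# Hypothetical 10-point dataset.
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 10)
y = np.sin(np.pi * x) + rng.normal(0, 0.1, size=10)

def ridge_polyfit(x, y, degree, lam):
    """Closed-form ridge regression on a polynomial basis:
    w = (X^T X + lambda I)^-1 X^T y."""
    X = np.vander(x, degree + 1, increasing=True)
    I = np.eye(degree + 1)
    return np.linalg.solve(X.T @ X + lam * I, X.T @ y)

# A 5th-order model has more capacity to overfit 10 points than a
# 3rd-order one, so it typically calls for a larger lambda.
w3 = ridge_polyfit(x, y, degree=3, lam=0.01)
w5 = ridge_polyfit(x, y, degree=5, lam=0.1)
print("3rd-order coefficients:", w3)
print("5th-order coefficients:", w5)
```

In practice you would pick $\lambda$ by cross-validation rather than by hand, though with only 10 points even cross-validation estimates will be noisy.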