Lasso regression coefficients for categorical variable

Question

I am using lasso regression to identify the influential variables, from a list of about 65, that affect an individual's liquor consumption.

The independent variables are a combination of categorical and numeric variables such as State, Education, Sex, Age, income....

I used the glmnet package and chose lambda by cross-validation:

  fit = glmnet(x, y, alpha = 1, lambda = 0.072, thresh = 1e-12)

The lasso returned 25 variables with nonzero coefficients; all the rest are 0.

The beta values are as follows:

  fit$beta
  State           -0.350
  Education       -0.254
  Age              0.175
  Sex              .
  ...              ....

Education is a categorical variable with 5 levels: No school, High school, Graduate, Masters, Doctorate. Linear regression would give 4 beta estimates for it, one for each level other than the reference level, but the lasso gives only one beta for Education. I am not able to interpret this beta for a categorical (factor) variable.

How should I interpret these lasso coefficients and their signs?
For a numeric variable like Age, is the coefficient interpreted the same way as in linear regression?

I found some clues in "Categorical variables in LASSO regression", but I am not sure how to relate them to the betas I got here.

glmnet is not able to handle categorical variables directly; you need to convert them to dummy variables as described here — drmaettu, Oct 22 '21 at 08:30
Is there any other package/function that creates the dummies automatically, like the lm or glm functions do? — joy_1379, Oct 22 '21 at 10:26
hmm I've always used glmnet so I'm not aware of such a package. But converting to dummies is really easy, just write:
fit = glmnet(model.matrix( ~ . -1, x), y).

This creates a design matrix without the intercept column (hence the -1); glmnet takes care of the intercept itself. — drmaettu, Oct 22 '21 at 12:28
another option is to use makeX from the glmnet package: glmnet(makeX(x), y, ...) — schotti, Jan 05 '23 at 16:23
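
To make the dummy-coding suggestion above concrete, here is a minimal sketch on made-up data. The data frame, its column names, and the random response are purely illustrative and are not the asker's data. It also shows one possible reason for the single Education coefficient, on the assumption (a guess, since the question does not say how x was built) that the factors were converted to integer codes before being passed to glmnet. The glmnet call at the end runs only if the package is installed.

```r
# Toy data; all names and values here are illustrative assumptions.
set.seed(1)
edu_levels <- c("No school", "High school", "Graduate", "Masters", "Doctorate")
df <- data.frame(
  Education = factor(sample(edu_levels, 200, replace = TRUE), levels = edu_levels),
  Sex       = factor(sample(c("F", "M"), 200, replace = TRUE)),
  Age       = round(runif(200, 18, 80))
)
y <- rnorm(200)  # placeholder response, unrelated to the predictors

# Integer-coding the factors gives ONE column per variable, so a lasso fit on
# this matrix can only return one coefficient for Education.
x_codes <- data.matrix(df)
print(colnames(x_codes))

# Dummy coding: "- 1" drops the intercept column, which glmnet adds itself.
# Each factor is expanded into 0/1 indicator columns, one per retained level.
x_dummy <- model.matrix(~ . - 1, data = df)
print(colnames(x_dummy))

# Optional: fit the lasso on the dummy-coded matrix if glmnet is available.
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(x_dummy, y, alpha = 1)
  print(coef(fit, s = 0.072))  # lambda value taken from the question
}
```

With dummy coding, each Education column gets its own coefficient, and the lasso can shrink some levels to zero while keeping others, so the result is read level by level rather than as a single slope.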
