I'm going crazy with this and I hope someone can help me.
I want to perform LASSO regression on the "House Prices: Advanced Regression Techniques" dataset from Kaggle, using R. The dataset is about predicting house prices from a set of features.
The data is messy and has a lot of missing values. To deal with this, I looked at the meaning of the variables and noticed that almost all of the missing values actually indicate the absence of a feature of the house. For example, there is a categorical variable describing the type of alley access to the property; there, NA means "no alley access", so I added it as a proper level of the factor. I did the same with almost all of the affected variables.
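For reference, this is roughly what that recoding looks like for one of those columns (a minimal sketch, assuming the column is named Alley as in the Kaggle data description; the other affected factors are handled the same way):

# turn NA in a factor column into an explicit "None" level
house$Alley <- as.character(house$Alley)
house$Alley[is.na(house$Alley)] <- "None"
house$Alley <- factor(house$Alley)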
At this point I'm using the glmnet library to perform the lasso regression. The glmnet function works with a matrix, so I use model.matrix to turn the categorical variables into dummy variables, then combine the result with the numeric variables and pass everything to glmnet. cv.glmnet selects the best lambda, and it returns an absurd value (2540) that shrinks all coefficients to zero (I know why, due to the penalty factor). Here is my code:
library(glmnet)

# drop GarageYrBlt
house <- house[, -which(names(house) == "GarageYrBlt")]

# numeric predictors, without the response
numeric_house <- house[, sapply(house, is.numeric)]
numeric_house <- numeric_house[, -which(names(numeric_house) == "SalePrice")]

# dummy-code the factor variables (drop the intercept column), then combine with the numeric ones
X <- model.matrix(house$SalePrice ~ ., data = house[, sapply(house, is.factor)])[, -1]
x.lasso <- as.matrix(data.frame(numeric_house, X))
y <- house$SalePrice

# 5-fold cross-validation to select lambda
fit.lassoKCV <- cv.glmnet(x.lasso, y, alpha = 1, nfolds = 5)
(lambda.KCV <- fit.lassoKCV$lambda.min)
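This is how I check what the cross-validation selected (just glmnet's standard plot/coef accessors, shown for completeness):

plot(fit.lassoKCV)                    # CV error versus log(lambda)
coef(fit.lassoKCV, s = "lambda.min")  # coefficients at lambda.min: everything except the intercept is zero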
So I tried other values of lambda:
grid <- seq(1, 100, length = 1000)
fit.lasso <- glmnet(x.lasso, y, alpha = 1, lambda = grid, standardize = TRUE)
but all the coefficients are zero anyway.
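For what it's worth, this is how I inspect the coefficients along that grid (again just the standard glmnet methods):

plot(fit.lasso, xvar = "lambda", label = TRUE)  # coefficient paths against log(lambda)
coef(fit.lasso, s = 1)                          # coefficients at the smallest lambda in the grid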
My response variable (SalePrice) is not in the matrix x.lasso; I've checked that several times.
Please help me, I'm desperate. I don't know whether the problem is in my code or whether it's a theoretical one (multicollinearity?).