I know this question has been posted many times, but none of the answers fixed my problem: I still get different results each time I run cv.glmnet on my data. Here is my code:
set.seed(123)
library(caret)
library(tidyverse)
library(glmnet)
library(ROCR)
library(doParallel)
registerDoParallel(cores = 4)  # the original registerDoParallel(4, cores = 8) passed two conflicting core counts
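One thing I suspect: set.seed() only seeds the main R session, while the doParallel workers get their own RNG streams, so anything run with parallel = TRUE can still vary between runs. A sketch of seeding the workers explicitly (assuming an explicit cluster object; clusterSetRNGStream() is from the base parallel package):

```r
library(parallel)
library(doParallel)

# Create an explicit cluster so its workers can be seeded
cl <- makeCluster(4)
registerDoParallel(cl)

# Give every worker a reproducible L'Ecuyer RNG stream
clusterSetRNGStream(cl, iseed = 123)

# ... run cv.glmnet(..., parallel = TRUE) here ...

stopCluster(cl)
```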
df <- df %>% select(V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11, V12, V13)
training.samples <- df$V2 %>% createDataPartition(p = 0.8, list = FALSE)
train <- df[training.samples, ]
test <- df[-training.samples, ]
x.train <- data.matrix(train[, names(train) != "V2"])
y.train <- train$V2
x.test  <- data.matrix(test[, names(test) != "V2"])
y.test  <- test$V2
list.of.fits <- list()
for (i in 0:10) {
  fit.name <- paste0("alpha", i / 10)
  list.of.fits[[fit.name]] <- cv.glmnet(x.train, y.train, type.measure = "auc", alpha = i / 10, family = "binomial", parallel = TRUE)
}
# Note: fit.name keeps its last loop value here, so this inspects only the alpha = 1 fit
coefs <- coef(list.of.fits[[fit.name]], s = list.of.fits[[fit.name]]$lambda.1se)
coefs
I then revisited others' similar problems, like here, and tried fixing nfolds to 5 and generating the folds myself with foldid <- sample(rep(seq(5), length.out = nrow(train))), so the call ended up like this: list.of.fits[[fit.name]] <- cv.glmnet(x.train, y.train, type.measure = "auc", alpha = i/10, family = "binomial", nfolds = 5, foldid = foldid, parallel = TRUE). (When foldid is supplied, nfolds is ignored, so passing both is redundant.)
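For reference, here is the whole loop with one shared fold assignment, written out (a sketch of what I ran; the key point is that foldid is created once, outside the loop, so every alpha is cross-validated on identical folds):

```r
library(glmnet)

set.seed(123)
nfolds <- 5
# One shared fold assignment, reused for every alpha
foldid <- sample(rep(seq_len(nfolds), length.out = nrow(x.train)))

list.of.fits <- list()
for (i in 0:10) {
  fit.name <- paste0("alpha", i / 10)
  list.of.fits[[fit.name]] <- cv.glmnet(
    x.train, y.train,            # x.train / y.train as built above
    type.measure = "auc",
    alpha        = i / 10,
    family       = "binomial",
    foldid       = foldid,
    parallel     = TRUE
  )
}
```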
But I still get very different results when I re-run cv.glmnet on the exact same data. What am I doing wrong, given that I get different results every time, even after the 'fix'?
Comments:
- "… alpha prior to my model run, as in my script alpha = i/10. So will I always get different lambdas and, therefore, different coefs?" – Thomas, Jul 25 '20 at 09:25
- "lapply(alphas, function(i){ cv.glmnet(Xtrain, Ytrain, alpha = i, family = "binomial", measure = "AUC", foldid = foldid); as.matrix(coef(fit, ...)) })" – StupidWolf, Jul 25 '20 at 09:27
- "fits = lapply(1:10, function(i){ cv.glmnet(x.train, y.train, alpha = i/10, family = "binomial", measure = "auc", foldid = foldid, parallel = TRUE) }), but I still get different lambdas. How can that be? I think it has something to do with my non-defined alpha, right?" – Thomas, Jul 25 '20 at 14:19
- "… coefficients when retrieved from the run. I have no idea what I am doing wrong. But the foldid does not help to diminish the 'randomness' in my coefficient output, unfortunately." – Thomas, Jul 25 '20 at 15:51
- "nfolds = 10; foldid = 1 + (1:nrow(Xtrain) %% nfolds) apparently helped rather than the foldid I showed in my example. Thank you." – Thomas, Jul 27 '20 at 07:29
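The assignment from the last comment, written out (a deterministic round-robin fold assignment; unlike sample(), it uses no RNG at all, so it is identical on every run and in every session, though it depends on row order):

```r
nfolds <- 10
n <- 100  # e.g. nrow(x.train)

# Row i goes to fold (i mod nfolds) + 1: rows 1..9 -> folds 2..10, row 10 -> fold 1, ...
foldid <- 1 + (1:n %% nfolds)
```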