To start, I have only a rudimentary familiarity with the doParallel and parallel packages in R, so please refrain from suggesting these packages without example code.
I am currently working with LASSO regression models generated using the glmnet package. I am relying on the cv.glmnet function in this package to tell me what the ideal lambda is... all of this background is irrelevant to my actual question, but I hope the back story helps. The cv.glmnet function does what I want, but it takes too long. I want to parallelize it.
My issue is that the parallel R packages are designed to take a list and then apply an operation to each element of that list. So when I try to pass a self-contained function like cv.glmnet (even though it is internally iterative), I get a single core processing the single dataset I want cv.glmnet to process, rather than the work being distributed across all the cores on my server.
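Roughly, this is what my current attempt looks like (a sketch only; `x` and `y` stand in for my real design matrix and response, which I can't share). Because the one dataset is wrapped in a length-1 list, parLapply hands the entire job to a single worker, so only one core is ever busy:

```r
library(parallel)
library(glmnet)

cl <- makeCluster(detectCores())
clusterEvalQ(cl, library(glmnet))

# A length-1 list means exactly one task, so one worker does everything
fits <- parLapply(cl, list(list(x = x, y = y)),
                  function(d) cv.glmnet(d$x, d$y, family = "binomial"))

stopCluster(cl)
```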
Is it possible to distribute a single computation across multiple CPUs/cores in R (which packages, example code, etc.)? Or is it possible to make parallelizing packages, like parallel and doParallel, recognize the iterative structure of the cv.glmnet function and distribute it for me? I'm fishing for recommendations; any help or insight would be greatly appreciated.
Unfortunately, I do not have permission to share the data I'm working with. For a reproducible example, please see this post; the code from the answer is copy/paste quality to generate data, fit lasso regressions, and gives an example use of the cv.glmnet function: https://stats.stackexchange.com/questions/72251/an-example-lasso-regression-using-glmnet-for-binary-outcome
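In case it saves a click, here is a minimal simulated-data version along the lines of that post (my own paraphrase, not the exact code from the answer; the dimensions and coefficients are arbitrary). The cv.glmnet call at the end is the slow step I want spread across cores:

```r
library(glmnet)

set.seed(42)
n <- 1000
p <- 100
x <- matrix(rnorm(n * p), n, p)

# Binary outcome driven by the first five predictors
eta <- x[, 1:5] %*% rep(1, 5)
y <- rbinom(n, 1, plogis(eta))

# Cross-validated LASSO (alpha = 1); this is the expensive call
cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1)
cvfit$lambda.min
```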