[R-sig-hpc] Problems parallelizing glmnet

Simon Urbanek simon.urbanek at r-project.org
Thu Sep 6 19:41:06 CEST 2012


On Sep 6, 2012, at 11:50 AM, Patrik Waldmann <patrik.waldmann at boku.ac.at> wrote:

> I want to run the cv.glmnet function on the same data (y and x) with different values of the alpha parameter (as many values as there are cores), but the result is absurd. What is wrong in the code below?
> 

You're evaluating exactly the same expression on all nodes, which I don't think you intended: clusterEvalQ runs the identical call on every node, so each of them gets the whole alphasplit list as alpha, and that doesn't make sense. Isn't this closer to the intention:

alphas <- seq(0, 1, length.out = cores)
# x and y must already exist on the workers, e.g. via clusterExport(cl, c("x", "y"))
out <- clusterApply(cl, alphas, function(alpha) glmnet::cv.glmnet(x, y, alpha = alpha))
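For completeness, here is a minimal end-to-end sketch of that approach. It is not the code from the original post. It assumes glmnet is installed on every worker, uses a much smaller x and y so that it runs quickly, and fixes the number of workers and alpha values arbitrarily. The shared foldid vector is an addition of mine: the glmnet documentation recommends a pre-computed foldid when comparing cross-validation results across alpha values, so that every fit uses the same folds.

```r
library(parallel)

# Small synthetic data so the sketch runs quickly (sizes are illustrative only).
set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)

cl <- makeCluster(2)                          # 2 workers, chosen arbitrarily
clusterExport(cl, c("x", "y"))                # copy the data to every worker
invisible(clusterEvalQ(cl, library(glmnet)))  # load glmnet on every worker

alphas <- seq(0, 1, length.out = 5)

# One fold assignment shared by all fits, so CV errors are comparable across alpha.
foldid <- sample(rep(seq_len(10), length.out = n))

# Each task receives a single alpha value; foldid is passed along as an extra argument.
fits <- clusterApply(cl, alphas, function(alpha, foldid) {
  cv.glmnet(x, y, alpha = alpha, foldid = foldid)
}, foldid = foldid)

stopCluster(cl)

# Smallest mean cross-validated error reached along the lambda path for each alpha.
cv_min <- vapply(fits, function(fit) min(fit$cvm), numeric(1))
best_alpha <- alphas[which.min(cv_min)]
```

Each element of fits is an ordinary cv.glmnet result, so fits[[i]]$lambda.min gives the selected lambda for alphas[i]. Note that do.call(cbind, ...) on such objects is unlikely to give anything useful; extracting the fields you need, as done for cv_min, is probably what you want.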

Cheers,
Simon



> Patrik Waldmann
> 
> x <- matrix(rnorm(2000*10000),ncol=10000)
> y <- matrix(rnorm(2000),ncol=1)
> 
> library(parallel)
> cvglmnet <- function(...) {
> library(glmnet)
> cv.glmnet(x,y,alpha=alphasplit)
> }
> system.time(cores<-detectCores())
> system.time(cl <- makeCluster(cores, methods=FALSE))
> alpha<-seq(0, 1,by=1/(cores-1))
> alphasplit<-clusterSplit(cl,alpha)
> system.time(clusterExport(cl, c("x","y","cvglmnet","alphasplit")))
> system.time(outbrlist<-clusterEvalQ(cl, cvglmnet(x,y,alphasplit)))
> system.time(totoutbr<-do.call(cbind,outbrlist))
> stopCluster(cl)