[R-sig-hpc] Problems parallelizing glmnet

Stephen Weston stephen.b.weston at gmail.com
Thu Sep 6 22:59:31 CEST 2012


In this example, you're comparing doParallel using a
PSOCK cluster against mclapply, which forks its workers.
A fairer comparison would have doParallel use mclapply
as well, by registering it with:

registerDoParallel(cores=8)

That will give performance much closer to mclapply,
although still not as good, because the tasks are
extremely short and foreach's per-task overhead is
magnified.
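
As a rough sketch of that fairer comparison, here is the
quoted foreach loop run with the cores-based registration
instead of a cluster object.  The problem size (200 columns
rather than 10000) and the core count (2 rather than 8) are
assumptions chosen only to keep the run short; they are not
the original settings, and no timings are claimed here.

```r
library(doParallel)  # also attaches foreach and parallel

set.seed(1)
n <- 1000
p <- 200                      # reduced from 10000 columns (assumption, for speed)
y <- rnorm(n)
x <- matrix(rnorm(n * p), ncol = p)

# No makeCluster() call: on a non-Windows machine this makes
# %dopar% run its tasks through mclapply (forked workers).
registerDoParallel(cores = 2)

print(system.time(
  pval <- foreach(i = seq_len(p), .combine = c) %dopar% {
    mod <- lm(y ~ x[, i])
    summary(mod)$coefficients[2, 4]   # p-value of the slope
  }
))

# Shuts down the implicit cluster that doParallel creates on
# Windows; does nothing when no implicit cluster exists.
stopImplicitCluster()
```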

For a single (non-Windows) machine, there's no need to
create a cluster object at all, either for doParallel or for
mclapply.  You need a cluster object only if you're
running on multiple nodes or on a Windows machine, and
in those cases you can't use mclapply anyway.
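
For illustration, a minimal mclapply version with no cluster
object might look like the sketch below.  One thing worth
noting about the quoted code: its mclapply call does not pass
mc.cores, so it uses the default getOption("mc.cores", 2L)
rather than 8 workers, and the cluster created just before it
is never used.  The reduced problem size and the core count
here are assumptions for a quick run.

```r
library(parallel)

set.seed(1)
n <- 1000
p <- 200                      # reduced from 10000 columns (assumption, for speed)
y <- rnorm(n)
x <- matrix(rnorm(n * p), ncol = p)

# mclapply relies on forking, so on Windows it only accepts mc.cores = 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# No makeCluster()/stopCluster() needed: the workers are forked per call.
print(system.time(
  pval <- unlist(mclapply(
    seq_len(p),
    function(i) summary(lm(y ~ x[, i]))$coefficients[2, 4],
    mc.cores = cores
  ))
))
```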

- Steve


On Thu, Sep 6, 2012 at 4:16 PM, Patrik Waldmann
<patrik.waldmann at boku.ac.at> wrote:
> y<-rnorm(1000)
> x<-matrix(rnorm(1000*10000),ncol=10000)
> dimx<-dim(x)
>
> library(doParallel)
> library(foreach)
> cl <- makeCluster(8, methods=FALSE)
> registerDoParallel(cl)
> print(system.time(
> pval <- foreach (i =1:dimx[2], .combine=c) %dopar% {
> mod <- lm(y ~ x[,i])
> summary(mod)$coefficients[2,4]
> }
> ))
>
>   user  system elapsed
>  12.28    2.75  231.93
>
> stopCluster(cl)
>
> library(parallel)
> cl <- makeCluster(8, methods=FALSE)
> print(system.time(
> pval <- unlist(mclapply(1:dimx[2], function(i) summary(lm(y ~ x[,i]))$coefficients[2,4]))
> ))
>
>   user  system elapsed
>  21.80    1.33   25.78
>
> stopCluster(cl)
>
>>>> "Brian G. Peterson" <brian at braverock.com> 09/06/12 20:03 PM >>>
> On 09/06/2012 12:15 PM, Patrik Waldmann wrote:
>> I would like to avoid foreach since we showed earlier that it is VERY slow.
>
> A search of the list archives doesn't provide any backup that you have
> 'showed' anything.
>
> The only other post by you that I can see on the list shows the total
> elapsed time decreasing with the use of foreach, even though it was,
> indeed, a trivial function evaluation.
>
> --
> Brian
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc


