[R] parallel computation with plyr 1.2.1

Hadley Wickham hadley at rice.edu
Thu Sep 16 20:09:49 CEST 2010


Yes, this was a little bug that will be fixed in the next release.
Hadley

On Thu, Sep 16, 2010 at 1:11 PM, Dylan Beaudette
<debeaudette at ucdavis.edu> wrote:
> Hi,
>
> I have been trying to use the new .parallel argument in the most recent
> version of plyr to speed up some tasks. I can run the example in the NEWS
> file [1], and it seems to work correctly. However, R only uses a single
> core when I apply the same approach with ddply().
>
> 1. http://cran.r-project.org/web/packages/plyr/NEWS
>
> Watching my CPUs, I see that only a single core is used whether .parallel
> is FALSE or TRUE in the example below, and both runs take about the same
> amount of time. Is there a limitation in how ddply() dispatches parallel
> jobs, or is this task not suitable for parallel computing?
>
> Cheers,
> Dylan
>
>
> Here is an example:
>
> library(plyr)
> library(doMC)
> registerDoMC(cores=2)
>
> # example data: 4 groups of 500 rows (2000 values, so y is not recycled)
> d <- data.frame(y=rnorm(2000), id=rep(letters[1:4], each=500))
>
> # function that wastes some time
> f <- function(x) {
>   # preallocate a numeric vector for the resampled means
>   m <- numeric(10000)
>   for(i in 1:10000) {
>     m[i] <- mean(sample(x$y, 100))
>   }
>   mean(m)
> }
>
> system.time(ddply(d, .(id), .fun=f, .parallel=FALSE))
> #  user  system elapsed
> #  2.740   0.016   2.766
>
> system.time(ddply(d, .(id), .fun=f, .parallel=TRUE))
> #  user  system elapsed
> #  2.720   0.000   2.726
>
> --
> Dylan Beaudette
> Soil Resource Laboratory
> http://casoilresource.lawr.ucdavis.edu/
> University of California at Davis
> 530.754.7341
>
>
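
One way to tell a plyr problem apart from a backend problem is to run the same per-group work through foreach directly. The following is only a sketch, not part of the original exchange: it assumes the foreach and doMC packages are installed on a Unix-like system, it reuses the names d and f from the quoted example (with a shorter loop), and the output column name V1 is chosen here purely for illustration. If both cores are busy while this runs, the registered backend is working and the single-core behaviour comes from elsewhere.

```r
library(foreach)
library(doMC)

# register the multicore backend, as in the quoted example
registerDoMC(cores = 2)

# confirm that a parallel backend is registered and how many workers it has
getDoParRegistered()
getDoParWorkers()

# example data: 4 groups of 500 rows
d <- data.frame(y = rnorm(2000), id = rep(letters[1:4], each = 500))

# function that wastes some time (fewer iterations than the original)
f <- function(x) {
  m <- numeric(1000)
  for (i in seq_along(m)) {
    m[i] <- mean(sample(x$y, 100))
  }
  mean(m)
}

# split by group and process each piece with %dopar%
pieces <- split(d, d$id)
res <- foreach(p = pieces, .combine = rbind) %dopar% {
  data.frame(id = p$id[1], V1 = f(p))
}
res
```

Wrapping the foreach() call in system.time() and comparing it with the same loop written with %do% gives a rough check of whether the second core is actually contributing.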



-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


