[R-sig-hpc] doMC and reproducible parallel numbers under plyr

Stephen Weston stephen.b.weston at gmail.com
Wed Feb 26 20:08:21 CET 2014


Since the doMC package now uses the "parallel" package, you can use
the same techniques as when using mclapply directly. The documentation
(actually for mcparallel) says:

     The behaviour with 'mc.set.seed = TRUE' is different only if
     'RNGkind("L'Ecuyer-CMRG")' has been selected.  Then each time a
     child is forked it is given the next stream (see 'nextRNGStream').
     So if you select that generator, set a seed and call
     'mc.reset.stream' just before the first use of 'mcparallel' the
     results of simulations will be reproducible provided the same
     tasks are given to the first, second, ...  forked process.

I haven't tried this with plyr, but it's worth a try.
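
Here is a minimal sketch of that recipe using mclapply directly. It
assumes a platform that supports forking (Linux or macOS, not Windows).
The helper name run_once, the seed 123, and the toy task are only
illustrative. With plyr + doMC the same three setup calls would
presumably go just before the llply(..., .parallel = TRUE) call, but
as noted above that part is untested.

```r
library(parallel)

# Hypothetical helper: runs the same small simulation from a fixed seed.
run_once <- function(seed = 123, cores = 2) {
  RNGkind("L'Ecuyer-CMRG")  # select the generator that supports streams
  set.seed(seed)            # seed the master process
  mc.reset.stream()         # reset the stream handed to forked children
  # Each task draws one uniform random number in a forked child.
  unlist(mclapply(1:4, function(i) runif(1),
                  mc.cores = cores, mc.set.seed = TRUE))
}

a <- run_once()
b <- run_once()
print(identical(a, b))  # checks whether the two runs agree
```

Reproducibility here depends on the same tasks going to the same
forked processes, so changing the number of cores or the scheduling
may change the numbers.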

- Steve

On Wed, Feb 26, 2014 at 1:06 PM, Aaron King <kingaa at umich.edu> wrote:
> I've run into a little bit of frustration trying to combine plyr, foreach,
> and doMC to get reproducible results.  It's straightforward to achieve
> reproducibility when using one of the other foreach backends (doSNOW,
> doMPI, for instance), but there are times when you want to take advantage
> of the shared-memory capacity of multicore machines.  doRNG is nice in that
> it makes it easy to get fully reproducible results when using foreach
> directly, but when you use foreach only via plyr + .parallel=TRUE, you don't
> have the option of using doRNG.
>
> Has anyone figured out how to get fully reproducible results when using
> plyr + doMC?
>
> Aaron
>
> --
> Aaron A. King, Ph.D.
> Ecology & Evolutionary Biology
> Mathematics
> Center for the Study of Complex Systems
> University of Michigan
> GPG Public Key: 0x15780975
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc


