[R-sig-hpc] Using rstream for reproducible parallel computations

Dirk Eddelbuettel edd at debian.org
Tue Feb 8 15:02:29 CET 2011


On 8 February 2011 at 15:28, Renaud Gaujoux wrote:
| Hi,
| 
| I want to use the package rstream in order to ensure reproducibility of 
| parallel independent computations.
| 
| To implement the computation I use a foreach loop, which should work
| with both parallel backends, doMC and doMPI. To ensure reproducibility
| and flexibility, my idea is to create all the required random streams
| before starting the loop and pass them to the workers, one by one, as
| they complete their tasks. I have seen pieces of code around the web
| that do this using the package rlecuyer. I like the unified approach of
| rstream though, so I would like to use this package (which also provides
| L'Ecuyer's RNGstream generator). Parallel code can get tricky, with
| hidden side effects, which is why I'd like some comments on the
| following code.

It's tricky and important to get this stuff right: The foreach and do*
families do not support it out of the box.

You could try to make your life easier by switching to the snow package. With
the doSNOW plugin and one of the backends snow supports (sockets on a single
multicore machine, or MPI) you would get snow's integration with rlecuyer and
rsprng for free.
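
A minimal sketch of that route, assuming snow, doSNOW and rlecuyer are
installed and that a two-worker socket cluster is acceptable. run_once is a
hypothetical helper name, and the seed is just an example value. Note that
clusterSetupRNG() gives one stream per worker, not per task, so results are
only repeatable when the task-to-worker mapping is fixed; the sketch uses
exactly one task per worker for that reason.

```r
library(snow)      # clusterSetupRNG() additionally needs rlecuyer installed
library(doSNOW)    # foreach backend on top of a snow cluster
library(foreach)

# Hypothetical helper: start a cluster, seed it, run one task per worker.
run_once <- function(seed = rep(12345, 6), workers = 2) {
  cl <- makeCluster(workers, type = "SOCK")
  on.exit(stopCluster(cl))
  # one L'Ecuyer RNG stream per worker, all derived from 'seed'
  clusterSetupRNG(cl, type = "RNGstream", seed = seed)
  registerDoSNOW(cl)
  # one task per worker, so each task always sees the same stream
  foreach(i = seq_len(workers)) %dopar% runif(3)
}

first  <- run_once()
second <- run_once()
identical(first, second)   # the two runs are meant to agree
```

With more tasks than workers the scheduling decides which stream a task
draws from, so this alone does not give per-task reproducibility.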

That said, it would be awfully nice if someone could work similar logic into
do* as a general solution.
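
To make the idea concrete, here is a rough backend-independent sketch of
"one stream per task, created up front". It is not rstream: it swaps in the
L'Ecuyer-CMRG generator and nextRNGStream() from the parallel package that
ships with R (>= 2.14.0). The names n_tasks, streams and run_task are
hypothetical, and the tasks run sequentially here only to show that the
result does not depend on execution order.

```r
library(parallel)  # ships with R (>= 2.14.0); provides nextRNGStream()

RNGkind("L'Ecuyer-CMRG")
set.seed(123)      # example seed

# Create one independent stream seed per task before any work starts.
n_tasks <- 4
streams <- vector("list", n_tasks)
streams[[1]] <- .Random.seed
for (i in seq_len(n_tasks - 1))
  streams[[i + 1]] <- nextRNGStream(streams[[i]])

# Hypothetical task: install its own stream, then draw random numbers.
run_task <- function(i, stream) {
  assign(".Random.seed", stream, envir = globalenv())
  runif(2)
}

# Same tasks in forward and in reverse order; results are put back in
# task order so the two runs can be compared.
res_forward  <- Map(run_task, seq_len(n_tasks), streams)
res_backward <- rev(Map(run_task, rev(seq_len(n_tasks)), rev(streams)))
identical(res_forward, res_backward)
```

The same list of streams could be handed to a foreach loop as a second
iteration variable, in the spirit of the code quoted below, as long as each
task installs its stream before drawing any random numbers.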

Dirk

| Thank you.
| 
| Renaud
| 
| # Create n streams, starting from a given seed
| # (analogous to rlecuyer::.lec.CreateStream).
| createStream <- function(n, seed){
| 
|      # check parameters
|      if( n <= 0 )
|          stop("NMF::createStream - invalid value for 'n' [positive value expected]")
| 
|      # first stream: created from the given seed
|      s <- new('rstream.mrg32k3a', seed=seed, force.seed=TRUE)
|      rstream.packed(s) <- TRUE
|      s <- list(s)
|      # remaining streams: created without an explicit seed;
|      # simplify=FALSE makes replicate() always return a plain list
|      if( n > 1 )
|          s <- c(s, replicate(n-1, {
|              x <- new('rstream.mrg32k3a'); rstream.packed(x) <- TRUE; x
|          }, simplify=FALSE))
| 
|      invisible(s)
| }
| 
| library(rstream)
| library(doMC)
| 
| registerDoMC()
| dummy <- foreach(i=1:10, s=createStream(10, 1:6)) %dopar% {
|      print(s)
|      # do stuff
| }

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com


