[R-sig-hpc] Using rstream for reproducible parallel computations

Renaud Gaujoux renaud at mancala.cbio.uct.ac.za
Tue Feb 8 14:28:24 CET 2011


Hi,

I want to use the package rstream to ensure reproducibility of
parallel independent computations.

To implement the computation I use a foreach loop, which should work
with both parallel backends doMC and doMPI. The idea I follow to ensure
reproducibility and flexibility is to create all the required random
streams before starting the loop and pass them to the workers, one by
one as the workers complete their tasks. I saw pieces of code around the
web that do this using the package rlecuyer. I like the unified approach
of rstream though, so I would like to use this package (which also
provides L'Ecuyer's RNGstream generator). Parallel code can get tricky
with hidden side effects, which is why I'd like to have some comments on
the following code.
Thank you.

Renaud

# this function creates n streams, starting with a given seed
# ( <=> rlecuyer::.lec.CreateStream)
createStream <- function(n, seed){

     # check parameters
     if( n <= 0 )
         stop("NMF::createStream - invalid value for 'n' [positive value expected]")

     s <- new('rstream.mrg32k3a', seed=seed, force.seed=TRUE)
     rstream.packed(s) <- TRUE
     s <- list(s)
     if( n > 1 )
         # simplify=FALSE makes replicate() always return a plain list
         s <- c(s, replicate(n-1, { s <- new('rstream.mrg32k3a');
             rstream.packed(s) <- TRUE; s }, simplify=FALSE))

     invisible(s)
}

library(rstream)
library(doMC)

registerDoMC()
dummy <- foreach(i=1:10, s=createStream(10, 1:6)) %dopar% {
     print(s)
     # do stuff
}
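
A minimal sketch of how a worker might consume one of these packed
streams, under a few assumptions: the helper names makeStreams and
drawFrom are hypothetical (they are not part of rstream), and a plain
lapply stands in for the foreach loop so that the sketch runs without a
parallel backend. A packed stream has to be unpacked before it can be
sampled, which is what the worker body does first.

```r
library(rstream)

# Hypothetical helper: build n packed MRG32k3a streams from one seed.
makeStreams <- function(n, seed) {
    # force.seed=TRUE re-seeds the generator even if it was seeded before
    first <- new("rstream.mrg32k3a", seed = seed, force.seed = TRUE)
    rstream.packed(first) <- TRUE
    # the remaining streams are created without an explicit seed
    rest <- replicate(n - 1, {
        s <- new("rstream.mrg32k3a")
        rstream.packed(s) <- TRUE
        s
    }, simplify = FALSE)
    c(list(first), rest)
}

# Hypothetical worker body: unpack the stream, then draw k uniforms from it.
drawFrom <- function(s, k = 3) {
    rstream.packed(s) <- FALSE
    rstream.sample(s, k)
}

# Two runs from the same seed; lapply stands in for foreach here.
run1 <- lapply(makeStreams(4, 1:6), drawFrom)
run2 <- lapply(makeStreams(4, 1:6), drawFrom)

# Reproducibility check: both runs should give the same draws per task.
print(identical(run1, run2))
```

If the seeding behaves as expected, each task sees the same numbers in
both runs no matter which worker picks it up, because the stream travels
with the task rather than with the worker.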



 
