[R-sig-hpc] Using rstream for reproducible parallel computations
Renaud Gaujoux
renaud at mancala.cbio.uct.ac.za
Tue Feb 8 14:28:24 CET 2011
Hi,
I want to use the package rstream in order to ensure reproducibility of
parallel independent computations.
To implement the computation I use a foreach loop, which should work
with both parallel backends, doMC and doMPI. To ensure reproducibility
and flexibility, my idea is to create all the required random streams
before starting the loop and pass them to the workers, one by one, as
these complete their tasks. I have seen pieces of code around the web that
do this using the package rlecuyer. I like the unified approach of
rstream though, so I would like to use this package (which also provides
L'Ecuyer's RNGstream generator). Parallel code can get tricky, with
hidden side effects, which is why I would like some comments on the
following code.
Thank you.
Renaud
# this function creates n streams, starting with a given seed
# ( <=> rlecuyer::.lec.CreateStream)
createStream <- function(n, seed){
    # check parameters
    if( n <= 0 )
        stop("NMF::createStream - invalid value for 'n' [positive value expected]")

    # the first stream (re)sets the package seed; pack it so that it can be
    # serialized and sent to the workers
    s <- new('rstream.mrg32k3a', seed=seed, force.seed=TRUE)
    rstream.packed(s) <- TRUE
    s <- list(s)
    # the remaining n-1 streams are the next ones after that seed
    if( n > 1 )
        s <- c(s, replicate(n-1, {
            s <- new('rstream.mrg32k3a')
            rstream.packed(s) <- TRUE
            s
        }, simplify=FALSE))
    invisible(s)
}
library(rstream)
library(doMC)
registerDoMC()
dummy <- foreach(i=1:10, s=createStream(10, 1:6)) %dopar% {
    print(s)
    # do stuff
}
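
A minimal sketch of what a worker might do with the stream it receives, and of a quick reproducibility check. It assumes that a packed stream has to be unpacked again before it can be sampled from. The helpers makeStreams and drawFrom are hypothetical names introduced only for this illustration, and a plain lapply stands in for the %dopar% loop so that the example does not depend on a parallel backend.

```r
library(rstream)

# Hypothetical helper: same idea as createStream() above, i.e. n packed
# mrg32k3a streams created from a fixed package seed.
makeStreams <- function(n, seed) {
    first <- new("rstream.mrg32k3a", seed = seed, force.seed = TRUE)
    rest <- replicate(n - 1, new("rstream.mrg32k3a"), simplify = FALSE)
    lapply(c(list(first), rest), function(s) {
        rstream.packed(s) <- TRUE
        s
    })
}

# Hypothetical worker body: unpack the stream, then draw k uniform numbers
# from it. In the real loop this would sit inside the %dopar% block.
drawFrom <- function(s, k) {
    rstream.packed(s) <- FALSE
    rstream.sample(s, k)
}

# Two independent runs built from the same seed.
run1 <- lapply(makeStreams(3, 1:6), drawFrom, k = 5)
run2 <- lapply(makeStreams(3, 1:6), drawFrom, k = 5)

# Compare the two runs; identical draws indicate the setup is reproducible.
print(identical(run1, run2))
```

If the streams are recreated deterministically from the seed, the two runs should give the same draws whichever worker ends up processing which task, since each task only ever reads from its own stream.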