[R-sig-hpc] parallel library's random numbers

George Ostrouchov georgeost at gmail.com
Fri Dec 20 04:24:26 CET 2013


Ross,

Consider writing this in SPMD style and running it with Rscript. Here is a 
short example:

library(pbdMPI, quietly=TRUE)
init()
comm.set.seed(1234567, diff=TRUE)
r <- runif(5)
comm.print(r, all.rank=TRUE)
finalize()

Put the above 6 lines of R code into a file named rand.r and then run
mpirun -np 2 Rscript rand.r
Experiment with diff=FALSE and with more than 2 processes (the -np value). 
Then start reading the vignette for the pbdDEMO package.

With diff=TRUE, each rank draws from its own independent stream; with 
diff=FALSE, all ranks use the same stream. This uses the rlecuyer package.
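
For your per-task requirement (the same stream for task t no matter how 
many nodes the job uses), one possible approach is to derive one 
L'Ecuyer-CMRG stream per task with nextRNGStream from the parallel package, 
rather than one per rank. This is only a sketch and does not use pbdMPI: 
the helper names task_seeds and run_task, the master seed, and the 
runif(5) task body are all made up for illustration.

```r
library(parallel)

# Hypothetical helper: build one L'Ecuyer-CMRG seed per task, in task order.
# The seed for task t depends only on master_seed and t, not on the
# number of nodes or on which node runs the task.
task_seeds <- function(n_tasks, master_seed = 1234567) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(master_seed)
  seeds <- vector("list", n_tasks)
  s <- .Random.seed
  for (t in seq_len(n_tasks)) {
    seeds[[t]] <- s
    s <- nextRNGStream(s)  # advance to the start of the next stream
  }
  seeds
}

# Hypothetical task body: install the task's seed, then draw from it.
run_task <- function(t, seeds) {
  assign(".Random.seed", seeds[[t]], envir = globalenv())
  runif(5)
}

seeds <- task_seeds(2000)
a <- run_task(250, seeds)  # task 250, e.g. in a job doing tasks 200 to 400
b <- run_task(250, seeds)  # the same task rerun, possibly on another node
stopifnot(identical(a, b))
```

Every process can call task_seeds() with the same master seed and pick out 
only the seeds for the tasks it has been given, so nothing needs to be 
communicated between nodes.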

Good luck!

George

P.S. I assume your 50 nodes are in a cluster, probably managed with PBS. 
In that case, issue the mpirun after getting an interactive allocation 
with "qsub -I ...".

On 12/19/13 5:08 PM, Ross Boylan wrote:
> On Thu, 2013-12-19 at 13:28 -0800, Ross Boylan wrote:
>> Conceptually, I have 2,000 tasks and I want each task to have access to
>> an independent, repeatable random number stream.  Many of the tasks will
>> run on the same node.  For example, a typical job would be to do tasks
>> 200 to 400.
>>
>> I might do one run with 10 nodes and a later one with 50, but I want the
>> same streams for each task.
>>
>> The facilities in parallel seem designed to achieve repeatable
>> randomization for a fixed number of nodes.  Is there a way to get them
>> to do what I want?
> I'm thinking maybe I should just ignore the parallel random number stuff
> and do set.seed(t) before task t.  It's easy to imagine that is not an
> entirely safe method, however.
>
> Ross
>> The documentation does not explain what nextRNGStream and
>> nextRNGSubStream do nor how that relates to the initialization done by
>> clusterSetRNGStream.  For example, if I do nextRNGStream on node 1, do I
>> get the same stream as is being used on node 2?
>>
>> The documentation does not say that calling nextRNGStream actually
>> resets the seed, though that seems to be implicit in the example (and is
>> explicit in one book I found on the net).  I'm also unsure if calling
>> mc.reset.stream() is necessary after calling clusterSetRNGStream.
>>
>> Ross Boylan
>>
>> P.S. Using R 3.0.1
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc