[R-sig-hpc] creating many separate streams (more streams than nodes)

Paul Johnson pauljohn32 at gmail.com
Thu Apr 21 00:49:58 CEST 2011

>> Paul
> Parallel random number generators are supposed to behave well in exactly
> this scenario.
> Ross

The rlecuyer package has this theory behind it.  Suppose the random
stream is one very long sequence of numbers, divided into consecutive
chunks:

  |---- job 1 ----|---- job 2 ----|---- job 3 ----|---- job 4 ----|---- job 5 ----| ...

That sequence is long enough that you can divide it into pieces and use
them separately for separate jobs.  There are not really 8000 separate
generators; there are 8000 chunks cut out of the one long set of numbers.

So if you believe the 1 long stream is good, each individual piece is OK.
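The chunking idea can be sketched outside of R.  This is a minimal
Python illustration using numpy's PCG64 generator, whose jumped()
method skips ahead by a huge fixed distance in the one underlying
sequence; rlecuyer does the analogous thing with L'Ecuyer's RngStreams,
and the numpy API here is only an analogy, not rlecuyer's own interface.

```python
import numpy as np

# One long underlying sequence, identified by one seed.
base = np.random.PCG64(seed=12345)

# Chunk i starts 2**127 * i draws into that sequence, so the chunks
# are disjoint pieces of the same stream, not separate generators.
streams = [np.random.Generator(base.jumped(i)) for i in range(5)]

# Each "job" draws only from its own chunk.
draws = [rng.random(3) for rng in streams]
for i, d in enumerate(draws):
    print("job", i, d)
```

Because every chunk is just a window onto the same sequence, the
quality argument for the whole stream carries over to each piece.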

This is published in the L'Ecuyer paper I mentioned in the first post,
and so far as I know nobody has torn it apart.

In the snowFT code that Hana pointed me toward, the way they do this
is clever; I would have struggled to work it out myself.  On each node,
initialize the same 8000 streams, and then when you run a job, just have
the worker function grab the stream that belongs to that job.
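That trick can be sketched in a few lines.  This is a hypothetical
Python version (the names make_streams and run_job are mine, and numpy's
PCG64 stands in for the L'Ecuyer streams): every node builds the same
table of streams from a shared seed, and job k always draws from stream
k, so the result is reproducible no matter which node runs the job.

```python
import numpy as np

SHARED_SEED = 99      # every node uses the same seed ...
N_STREAMS = 8000      # ... and builds the same 8000 streams

def make_streams():
    # Deterministic: the same table comes out on every node.
    base = np.random.PCG64(seed=SHARED_SEED)
    return [np.random.Generator(base.jumped(k)) for k in range(N_STREAMS)]

def run_job(job_id, streams):
    # The worker function grabs the stream that belongs to this job.
    return streams[job_id].random(4)

# Two "nodes" each build their own copy of the table;
# job 7 produces identical draws on both.
node_a, node_b = make_streams(), make_streams()
print(run_job(7, node_a))
print(run_job(7, node_b))
```

The point is that which physical node executes a job no longer matters;
the random numbers are tied to the job number, not the node.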

As far as publications verifying the SPRNG approach go, there are some,
but I can't say whether they are credible.  That approach spawns the
generators with slightly different parameters.  In theory, I find it
more appealing, but the folks who know the details are more dubious
about it.  Here's the one definite cite I have:

        @article{srinivasan2003testing,
                title = {Testing parallel random number generators},
                volume = {29},
                number = {1},
                journal = {Parallel Computing},
                author = {Ashok Srinivasan and Michael Mascagni and David Ceperley},
                year = {2003},
                pages = {69--94}
        }

Since rsprng is in bad shape, I don't know that a person really ought
to pursue that at the moment.
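For what it's worth, the parameterization idea can be sketched without
SPRNG or rsprng at all.  Here is a hypothetical Python illustration
using numpy's Philox generator, where the key parameter selects a
member of a parameterized family; SPRNG's own generators and parameters
are different, so this only shows the concept, not SPRNG's method.

```python
import numpy as np

def job_generator(job_id):
    # Instead of cutting one stream into chunks, each job gets its own
    # generator from the family, selected by a per-job parameter (key).
    return np.random.Generator(np.random.Philox(key=job_id))

print(job_generator(0).random(3))
print(job_generator(1).random(3))
```

The contrast with the rlecuyer scheme is that here the jobs run
genuinely different generators, which is why the quality argument for
one long stream does not carry over automatically and the approach
needs its own verification.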

Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
