[R-sig-hpc] parallel library's random numbers
Ross Boylan
ross at biostat.ucsf.edu
Thu Dec 19 23:08:43 CET 2013
On Thu, 2013-12-19 at 13:28 -0800, Ross Boylan wrote:
> Conceptually, I have 2,000 tasks and I want each task to have access to
> an independent, repeatable random number stream. Many of the tasks will
> run on the same node. For example, a typical job would be to do tasks
> 200 to 400.
>
> I might do one run with 10 nodes and a later one with 50, but I want the
> same streams for each task.
>
> The facilities in parallel seem designed to achieve repeatable
> randomization for a fixed number of nodes. Is there a way to get them
> to do what I want?
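I'm thinking maybe I should just ignore the parallel package's random
number facilities and call set.seed(t) before task t. It's easy to imagine
that this is not an entirely safe method, however, since nothing obviously
guarantees that streams seeded with consecutive integers are independent.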
Ross
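
One alternative to set.seed(t) that stays within the parallel package is to
derive one L'Ecuyer-CMRG seed per task, rather than per node, by applying
nextRNGStream repeatedly to a single base seed. The following is only a
sketch under that assumption: make_task_seeds, run_task and the base seed
12345 are hypothetical names and values chosen for illustration, and
runif(3) stands in for the real work of a task.

```r
library(parallel)

# Hypothetical helper: build one L'Ecuyer-CMRG seed per task from one base seed.
# Note: this switches the calling session's RNG kind to "L'Ecuyer-CMRG".
make_task_seeds <- function(n_tasks, base_seed = 12345L) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(base_seed)
  seeds <- vector("list", n_tasks)
  seeds[[1L]] <- get(".Random.seed", envir = globalenv())
  # Each call returns the seed of the next stream; it does not draw numbers.
  for (i in seq_len(n_tasks - 1L)) {
    seeds[[i + 1L]] <- nextRNGStream(seeds[[i]])
  }
  seeds
}

# Hypothetical task runner: install the seed for task t, then do the work.
run_task <- function(t, seeds) {
  assign(".Random.seed", seeds[[t]], envir = globalenv())
  runif(3)  # placeholder for the real computation of task t
}

seeds <- make_task_seeds(2000L)

# The same task gives the same draws no matter when or where it is run.
first_run  <- run_task(250L, seeds)
second_run <- run_task(250L, seeds)
other_task <- run_task(251L, seeds)
```

Because the seed for task t depends only on the base seed and on t, a job
covering tasks 200 to 400 would see the same streams whether it runs on 10
nodes or 50, provided each worker receives the seeds for its tasks and
installs them as run_task does.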
>
> The documentation does not explain what nextRNGStream and
> nextRNGSubStream do nor how that relates to the initialization done by
> clusterSetRNGStream. For example, if I do nextRNGStream on node 1, do I
> get the same stream as is being used on node 2?
>
> The documentation does not say that calling nextRNGStream actually
> resets the seed, though that seems to be implicit in the example (and is
> explicit in one book I found on the net). I'm also unsure if calling
> mc.reset.stream() is necessary after calling clusterSetRNGStream.
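
On the question of whether nextRNGStream resets the seed: as far as I can
tell it only returns a seed vector that can be assigned to .Random.seed, and
the caller has to make that assignment. A small check along these lines,
run in a single session with no cluster involved (the seed value 1 is
arbitrary), seems to bear that out:

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")
set.seed(1)
before <- get(".Random.seed", envir = globalenv())

# Ask for the seed of the next stream.
next_stream <- nextRNGStream(before)

# The global RNG state is the same as before the call.
unchanged <- identical(before, get(".Random.seed", envir = globalenv()))

# Switching streams takes an explicit assignment.
assign(".Random.seed", next_stream, envir = globalenv())
switched <- identical(next_stream, get(".Random.seed", envir = globalenv()))

# nextRNGSubStream likewise returns a seed, here for the next
# substream within the stream identified by next_stream.
next_sub <- nextRNGSubStream(next_stream)
differs <- !identical(next_sub, next_stream)
```

This says nothing about mc.reset.stream(), which I have not tested here.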
>
> Ross Boylan
>
> P.S. Using R 3.0.1
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc