[Bioc-devel] Using SerialParam() as the registered back-end for all platforms
rct @ending from thomp@oncl@n@org
Tue Jan 8 16:13:08 CET 2019
On Mon, Jan 7, 2019 at 3:26 PM Henrik Bengtsson <henrik.bengtsson using gmail.com>
wrote:
> 1. To achieve fully numerically reproducible RNGs in a way that is
> *invariant to the number of workers* (amount of chunking), I think the
> only solution is to pregenerate RNG seeds (using
> parallel::nextRNGStream()) for each individual iteration (element).
> In other words, if a worker will process K elements, then the main R
> process needs to generate K RNG seeds and pass those along to the
> worker. I use this approach for future.apply::future_lapply(...,
> future.seed = TRUE/<initial_seed>), which then produces identical RNG
> results regardless of backend and amount of chunking. In the past, I
> think I've seen Martin suggesting something similar as a manual
> approach to some users.
> 2. The above approach is obviously expensive, especially when there
> are a large number of elements to iterate over. Because of this I'm
> thinking of providing an option to use only one RNG seed per worker
> (which is the common approach used elsewhere)
> [https://github.com/HenrikBengtsson/future.apply/issues/20]. This
> won't be invariant to the number of workers, but it "should" still be
> statistically sound. This approach will give reproducible RNG results
> given the same initial seed and the same amount of chunking.
> 3. For algorithms which do not rely on RNG, we can ignore both of the
> above. The problem is that it's not always known to the
> user/developer which methods depend on RNG or not. The above 'RNG
> tracker' helps to identify some, but things might also change over
> time. I believe there's room for automating this in one way or the
> other. For instance, having a way to declare a function being
> dependent on RNG or not could help. Static code inspection could also
> do it, e.g. when an R package is built and it could be part of the R
> CMD checks to validate.
> 4. Are there other approaches?
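
To make point 1 concrete, here is a minimal sketch of per-element seeding
with parallel::nextRNGStream(). The helper names (make_seeds, run_one,
run_chunked) are made up for illustration and are not part of BiocParallel
or future.apply; the chunks are processed sequentially here, where a real
back-end would send each chunk to a worker.

```r
library(parallel)

# Pre-generate one L'Ecuyer-CMRG stream seed per element on the main process.
make_seeds <- function(n, seed = 42L) {
  old_kind <- RNGkind("L'Ecuyer-CMRG")
  on.exit(RNGkind(old_kind[1]), add = TRUE)  # restore the previous generator
  set.seed(seed)
  seeds <- vector("list", n)
  s <- .Random.seed
  for (i in seq_len(n)) {
    seeds[[i]] <- s
    s <- nextRNGStream(s)  # jump to the next independent stream
  }
  seeds
}

# Process one element using its own pre-generated seed.
# Note: this overwrites the global RNG state (acceptable for a sketch).
run_one <- function(x, seed) {
  assign(".Random.seed", seed, envir = globalenv())
  x + rnorm(1)
}

# Split the elements into n_chunks chunks and process each chunk in turn.
run_chunked <- function(xs, seeds, n_chunks) {
  chunks <- splitIndices(length(xs), n_chunks)
  res <- lapply(chunks, function(idx) {
    vapply(idx, function(i) run_one(xs[[i]], seeds[[i]]), numeric(1))
  })
  unlist(res)
}

xs <- 1:10
seeds <- make_seeds(length(xs))
r1 <- run_chunked(xs, seeds, 1)  # everything in one chunk
r4 <- run_chunked(xs, seeds, 4)  # same work split into four chunks
identical(r1, r4)
```

Because each element's result depends only on its own seed, the chunking
does not matter. Seeding once per chunk instead (point 2) would be cheaper,
but the results would then depend on n_chunks.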
I don't suppose it's possible to quickly determine via static analysis
whether a piece of code uses the RNG?
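
As a rough first pass, something like the sketch below might work, using
codetools::findGlobals() to list the functions a closure calls and checking
them against a hand-picked set of RNG functions. The rng_funs list and the
uses_rng_directly() helper are my own assumptions, not an existing API. It
only sees direct calls by bare name, so it would miss RNG use hidden inside
other functions, calls built at run time (e.g. via do.call() with a string),
compiled code, and possibly namespace-qualified calls.

```r
library(codetools)

# Hand-picked, deliberately incomplete list of functions that draw from the RNG.
rng_funs <- c("runif", "rnorm", "rbinom", "rpois", "rexp", "rgamma",
              "rbeta", "sample", "sample.int", "set.seed")

# TRUE if the function body calls any of the listed RNG functions by name.
uses_rng_directly <- function(fun) {
  called <- findGlobals(fun, merge = FALSE)$functions
  any(called %in% rng_funs)
}

f <- function(x) x + rnorm(length(x))  # calls rnorm() directly
g <- function(x) sort(x)               # no direct RNG call

uses_rng_directly(f)
uses_rng_directly(g)
```

A FALSE here does not prove the code is RNG-free; it only says no listed
function is called directly.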