[Bioc-devel] how to achieve reproducibility with BiocParallel regardless of number of threads and OS (set.seed is disallowed)
Martin Morgan
mtmorg@n@bioc @ending from gm@il@com
Mon Dec 31 19:12:39 CET 2018
The major BiocParallel objects (SnowParam(), MulticoreParam()) and use of bplapply() allow fully repeatable randomizations, e.g.,
> library(BiocParallel)
> unlist(bplapply(1:4, rnorm, BPPARAM=MulticoreParam(RNGseed=123)))
[1] -0.96859273 -0.40944544 0.89096942 -0.48906078 0.43304237 -0.03195349
[7] -1.03886641 1.57451249 0.74708204 0.67187201
> unlist(bplapply(1:4, rnorm, BPPARAM=MulticoreParam(RNGseed=123)))
[1] -0.96859273 -0.40944544 0.89096942 -0.48906078 0.43304237 -0.03195349
[7] -1.03886641 1.57451249 0.74708204 0.67187201
> unlist(bplapply(1:4, rnorm, BPPARAM=SnowParam(RNGseed=123)))
[1] -0.96859273 -0.40944544 0.89096942 -0.48906078 0.43304237 -0.03195349
[7] -1.03886641 1.57451249 0.74708204 0.67187201
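A minimal sketch of how one might check this programmatically rather than by eye. SnowParam() is used here because it runs on every OS; the helper name draw() and the two-worker setting are assumptions for illustration only.

```r
library(BiocParallel)

# Hypothetical helper: draw 1, 2, 3 and 4 normal deviates in parallel,
# seeding the back-end's random number streams with `seed`.
draw <- function(seed) {
    param <- SnowParam(workers = 2, RNGseed = seed)
    unlist(bplapply(1:4, rnorm, BPPARAM = param))
}

first  <- draw(123)
second <- draw(123)

# Same seed and same configuration: the draws should match exactly.
identical(first, second)
length(first)  # 1 + 2 + 3 + 4 = 10 values
```

Whether the values also match across different numbers of workers may depend on the BiocParallel version, so it is worth running a check like this against the version you depend on.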
The idea then would be to tell the user to register() such a param, or to write your function to accept an argument rngSeed along the lines of
f = function(..., rngSeed = NULL) {
    if (!is.null(rngSeed)) {
        param = bpparam()    # user's preferred back-end
        oseed = bpRNGseed(param)
        on.exit(bpRNGseed(param) <- oseed)
        bpRNGseed(param) = rngSeed
    }
    bplapply(1:4, rnorm)
}
(actually, this exercise illustrates a problem with bpRNGseed<-() when the original seed is NULL; this will be fixed in the next day or so...)
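To make the two options concrete, here is a rough sketch. The first part is what the user (not the package) would run; the seed value and worker count are arbitrary. The second part is a simpler variant of f(), called f2() here (a hypothetical name), that builds its own SnowParam() instead of modifying the registered one. That sidesteps the NULL-seed issue, at the cost of ignoring the user's preferred back-end.

```r
library(BiocParallel)

# Option 1: the user registers a seeded param once; later calls to
# bplapply() without BPPARAM= pick it up via bpparam().
register(SnowParam(workers = 2, RNGseed = 123))
res1 <- bplapply(1:4, rnorm)

# Option 2 (hypothetical variant of f()): construct a fresh param from the
# user-supplied seed. RNGseed = NULL is the SnowParam() default, meaning
# "no fixed seed". The `n` and `workers` arguments are illustrative.
f2 <- function(n = 4, rngSeed = NULL, workers = 2) {
    param <- SnowParam(workers = workers, RNGseed = rngSeed)
    bplapply(seq_len(n), rnorm, BPPARAM = param)
}

res2 <- f2(rngSeed = 123)
res3 <- f2(rngSeed = 123)
identical(res2, res3)  # same seed, same configuration
```

If honouring the registered back-end matters more than simplicity, the on.exit() approach above is the one to use.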
Is that sufficient for your use case?
On 12/31/18, 11:24 AM, "Bioc-devel on behalf of Lulu Chen" <bioc-devel-bounces using r-project.org on behalf of luluchen using vt.edu> wrote:
Dear all,
I posted the question on the Bioconductor support site (
https://support.bioconductor.org/p/116381/) and was advised to direct
future correspondence there.
I plan to generate a vector of seeds (provided by users through an argument of
my R function) and use them with set.seed() in each parallel computation.
However, set.seed() causes a warning in BiocCheck().
Someone suggested rewriting the code in C++, which is a good idea. But it
would take me much more time to rewrite some functions from other
packages, e.g. eBayes() in limma.
Hope to get more suggestions from you. Thanks a lot!
Best,
Lulu
_______________________________________________
Bioc-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel