[Bioc-devel] how to achieve reproducibility with BiocParallel regardless of number of threads and OS (set.seed is disallowed)

Martin Morgan mtmorg@n@bioc @ending from gm@il@com
Mon Dec 31 19:12:39 CET 2018


The major BiocParallel back-ends (SnowParam(), MulticoreParam()) accept an RNGseed argument, so bplapply() can produce fully repeatable random numbers, e.g.,

> library(BiocParallel)
> unlist(bplapply(1:4, rnorm, BPPARAM=MulticoreParam(RNGseed=123)))
 [1] -0.96859273 -0.40944544  0.89096942 -0.48906078  0.43304237 -0.03195349
 [7] -1.03886641  1.57451249  0.74708204  0.67187201
> unlist(bplapply(1:4, rnorm, BPPARAM=MulticoreParam(RNGseed=123)))
 [1] -0.96859273 -0.40944544  0.89096942 -0.48906078  0.43304237 -0.03195349
 [7] -1.03886641  1.57451249  0.74708204  0.67187201
> unlist(bplapply(1:4, rnorm, BPPARAM=SnowParam(RNGseed=123)))
 [1] -0.96859273 -0.40944544  0.89096942 -0.48906078  0.43304237 -0.03195349
 [7] -1.03886641  1.57451249  0.74708204  0.67187201
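
The same check can be done programmatically rather than by eye. This is only a minimal sketch: workers = 2 and the seed value 123 are arbitrary choices, and SnowParam() is used because it is available on all platforms. It compares two runs that each use a freshly constructed param with the same seed, as in the transcript above.

```r
library(BiocParallel)

# Two independent runs, each with a freshly constructed, identically seeded param.
# workers = 2 and RNGseed = 123 are illustrative values, not recommendations.
a <- unlist(bplapply(1:4, rnorm, BPPARAM = SnowParam(workers = 2, RNGseed = 123)))
b <- unlist(bplapply(1:4, rnorm, BPPARAM = SnowParam(workers = 2, RNGseed = 123)))

# rnorm(1), rnorm(2), rnorm(3), rnorm(4) give 10 values in total.
length(a)

# Compare the two runs element by element.
identical(a, b)
```

Note that this only compares runs with the same back-end and the same number of workers; it does not by itself show anything about runs with different worker counts.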

The idea then would be to tell the user to register() such a param, or to write your function to accept an argument rngSeed along the lines of

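f = function(..., rngSeed = NULL) {
    param = bpparam()  # user's preferred (registered) back-end
    if (!is.null(rngSeed)) {
        oseed = bpRNGseed(param)            # remember the current seed
        on.exit(bpRNGseed(param) <- oseed)  # restore it when f() returns
        bpRNGseed(param) = rngSeed          # use the caller's seed for this call
    }
    bplapply(1:4, rnorm, BPPARAM = param)   # pass the back-end explicitly
}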

(actually, this exercise illustrates a problem with bpRNGseed<-() when the original seed is NULL; this will be fixed in the next day or so...)
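
For the first option, where the user registers a seeded param, a minimal sketch could look like the following. The back-end (SnowParam), workers = 2 and the seed 123 are assumptions made for illustration; the user would pick their own.

```r
library(BiocParallel)

# Register a seeded back-end; bplapply() calls that omit BPPARAM will use it.
# SnowParam, workers = 2 and RNGseed = 123 are illustrative choices only.
register(SnowParam(workers = 2, RNGseed = 123))

# bpparam() returns the registered back-end, so the seed can be inspected.
seed <- bpRNGseed(bpparam())

# No BPPARAM argument here: the registered, seeded param is picked up.
res <- unlist(bplapply(1:4, rnorm))
```

With this approach the package function itself needs no seed argument; the choice of seed stays entirely with the user.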

Is that sufficient for your use case?

On 12/31/18, 11:24 AM, "Bioc-devel on behalf of Lulu Chen" <bioc-devel-bounces using r-project.org on behalf of luluchen using vt.edu> wrote:

    Dear all,
    
    I posted this question on the Bioconductor support site (
    https://support.bioconductor.org/p/116381/) and was advised to direct
    future correspondence there.
    
    I plan to generate a vector of seeds (provided by users through an argument
    of my R function) and use them with set.seed() in each parallel computation.
    However, set.seed() causes a warning in BiocCheck().
    
    Someone suggested rewriting the code in C++, which is a good idea, but it
    would take me much more time because I would have to rewrite some functions
    from other packages, e.g. eBayes() in limma.
    
    Hope to get more suggestions from you. Thanks a lot!
    
    Best,
    Lulu
    
    

