[Bioc-devel] loading database package changes random number

Steffi Grote steffi_grote using eva.mpg.de
Wed May 22 15:30:37 CEST 2019


Hi all,

I tried to circumvent the problem by adding an optional seed parameter, like this:

my_fun = function(..., seed = NULL){

    # code that might change the RNG

    if (!is.null(seed)){
        set.seed(seed)
    }

    # code that runs permutations
}

This solves the reproducibility issue, but it triggers a warning in BiocCheck:
    * WARNING: Remove set.seed usage in R code
      Found in R/ directory functions:
        my_fun()

What is the best way to deal with this?

Thanks in advance,
Steffi
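
A minimal sketch of one way to keep the optional seed from leaking out of the function: the seed is set only for the permutation step, and the caller's RNG state is put back afterwards. The helper name with_local_seed() and the body of my_fun() are hypothetical stand-ins, not code from GOfuncR, and the sketch only limits the side effect on the caller's RNG; it does not by itself address the BiocCheck warning.

```r
# Hypothetical helper (not part of any package): evaluate `expr` under a
# fixed seed, then restore the caller's RNG state.
with_local_seed <- function(seed, expr) {
    genv <- globalenv()
    had_seed <- exists(".Random.seed", envir = genv, inherits = FALSE)
    if (had_seed) old_seed <- get(".Random.seed", envir = genv)
    on.exit({
        if (had_seed) {
            assign(".Random.seed", old_seed, envir = genv)
        } else {
            rm(list = ".Random.seed", envir = genv)
        }
    })
    set.seed(seed)
    force(expr)  # the promise is evaluated only now, after set.seed()
}

# Stand-in for my_fun(); sample() plays the role of the permutation code.
my_fun <- function(n, seed = NULL) {
    runif(1)  # stands in for code that might change the RNG
    if (is.null(seed)) sample(n) else with_local_seed(seed, sample(n))
}

identical(my_fun(10, seed = 42), my_fun(10, seed = 42))  # TRUE
```

Because the seed is set immediately before the permutations, anything that touched the RNG earlier in the function no longer affects the result.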


> On April 12, 2019 at 1:10 AM Martin Morgan <mtmorgan.bioc using gmail.com> wrote:
> 
> 
> That easy strategy wouldn't work: for instance, two successive calls to MulticoreParam() would get the same port assigned, which breaks the contract of a 'random' port in a specific range. The port can be assigned with the manager.port= argument if the user wants to avoid random assignment. I could maintain a separate random number stream in BiocParallel for what amounts to a pretty trivial and probably dubious strategy [choosing random ports in hopes that one is not in use], but that starts to sound like a more substantial feature.
> 
> Martin
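
The "separate random number stream" Martin mentions could look roughly like the sketch below. It is an illustration only, not BiocParallel code: the names .port_rng and random_port() and the port range are assumptions. The private stream's state is stored in an environment and swapped in and out of .Random.seed, so successive calls advance the private stream while the user's stream is left untouched.

```r
# Hypothetical private RNG stream, kept in an environment that stands in
# for a package namespace.
.port_rng <- new.env()

random_port <- function(min = 11000L, max = 11999L) {
    genv <- globalenv()
    had_seed <- exists(".Random.seed", envir = genv, inherits = FALSE)
    if (had_seed) user_state <- get(".Random.seed", envir = genv)
    on.exit({
        # remember where the private stream stopped ...
        .port_rng$state <- get(".Random.seed", envir = genv)
        # ... and hand the user's stream back unchanged
        if (had_seed) {
            assign(".Random.seed", user_state, envir = genv)
        } else {
            rm(list = ".Random.seed", envir = genv)
        }
    })
    if (is.null(.port_rng$state)) {
        # first call: seed the private stream from the clock and process id
        set.seed(as.integer((as.numeric(Sys.time()) * 1000 + Sys.getpid()) %% 1e9))
    } else {
        assign(".Random.seed", .port_rng$state, envir = genv)
    }
    as.integer(floor(runif(1, min, max + 1)))
}

set.seed(123)
random_port()
random_port()  # drawn from the advanced private stream
runif(1)       # same value as set.seed(123); runif(1) on its own
```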
> 
> On 4/11/19, 7:06 PM, "Pages, Herve" <hpages using fredhutch.org> wrote:
> 
>     Hi Steffi,
>     
>     Any code that gets called between your calls to set.seed() and runif() 
>     could potentially use the random number generator. So the sequence 
>     set.seed(123); runif(1) is only guaranteed to be deterministic if no 
>     other code is called in between, or if the code called in between does 
>     not use the random number generator (but if that code is not under your 
>     control it could do anything).
>     
>     @Martin: I'll look at your suggestion for DelayedArray. An easy 
>     workaround would be to avoid changing the RNG state in BiocParallel by 
>     having .snowPort() make a copy of .Random.seed (if it exists) before 
>     calling runif() and restoring it on exit.
>     
>     H.
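
The workaround Hervé describes (copy .Random.seed if it exists, draw, restore on exit) can be sketched as follows. pick_port() is a hypothetical stand-in, not BiocParallel's actual .snowPort(), and the port range is a placeholder.

```r
# Hypothetical port picker that leaves the global RNG state as it found it.
pick_port <- function(min = 11000L, max = 11999L) {
    genv <- globalenv()
    if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
        old_seed <- get(".Random.seed", envir = genv)
        on.exit(assign(".Random.seed", old_seed, envir = genv))
    } else {
        # no state yet: runif() creates one, so remove it again on exit
        on.exit(rm(list = ".Random.seed", envir = genv))
    }
    as.integer(floor(runif(1, min, max + 1)))
}

set.seed(123)
pick_port()
runif(1)  # same value as set.seed(123); runif(1) without the pick_port() call
```

Because the state is restored on every call, repeated calls made from the same RNG state return the same port.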
>     
>     On 4/11/19 15:25, Martin Morgan wrote:
>     > This is actually from a dependency DelayedArray which, on load, calls DelayedArray::setAutoBPPARAM, which calls BiocParallel::MulticoreParam(), which uses the random number generator to select a random port for connection.
>     >
>     > A different approach would be for DelayedArray to respect the user's configuration and use bpparam(). Alternatively, it could look at the class of bpparam() and tell the user to call, e.g., BiocParallel::register(SerialParam()) if that's appropriate, or it could use registered("MulticoreParam") or registered("SerialParam") if available (they are by default) rather than creating an ad-hoc instance.
>     >
>     > Martin
>     >
>     > On 4/11/19, 10:17 AM, "Bioc-devel on behalf of Steffi Grote" <bioc-devel-bounces using r-project.org on behalf of steffi_grote using eva.mpg.de> wrote:
>     >
>     >      Hi all,
>     >      I found out that example code for my package GOfuncR yields a different result the first time it's executed, despite setting a seed. All the following executions are identical.
>     >      It turned out that loading the database package 'Homo.sapiens' changed the random numbers:
>     >      
>     >      set.seed(123)
>     >      runif(1)
>     >      # [1] 0.2875775
>     >      
>     >      set.seed(123)
>     >      suppressWarnings(suppressMessages(require(Homo.sapiens)))
>     >      runif(1)
>     >      # [1] 0.7883051
>     >      
>     >      set.seed(123)
>     >      runif(1)
>     >      # [1] 0.2875775
>     >      
>     >      Is that known or expected behaviour?
>     >      Should I not load a package inside a function that later uses random numbers?
>     >      
>     >      Thanks in advance,
>     >      Steffi
>     >      
>     >      _______________________________________________
>     >      Bioc-devel using r-project.org mailing list
>     >      https://stat.ethz.ch/mailman/listinfo/bioc-devel
>     >      
>     
>     -- 
>     Hervé Pagès
>     
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M1-B514
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>     
>     E-mail: hpages using fredhutch.org
>     Phone:  (206) 667-5791
>     Fax:    (206) 667-1319
>     
>


