[R] Limiting the scope of RNGkind/set.seed

Paul Gilbert pg||bert902 @end|ng |rom gm@||@com
Wed Apr 17 19:38:17 CEST 2019


Elizabeth

There is a package (of mine) setRNG on CRAN that may be a helpful 
example (code/tests/examples/vignette). Most of the package is testing 
designed to fail if the RNG in R is changed in a way that will affect my 
other package testing. Martin's function in the previous reply has most 
of the important parts and adds warning suppression, so you might want 
to consider small adjustments on a combination of the two. Just to 
summarize the issues, from memory:

0/ Using a preset default seed in a function's argument makes the 
function not random by default. If you are doing that then maybe you 
need to consider carefully whether the function should be using random 
number generation.

1/ It is good practice to use on.exit() in your function to reset 
things, so the state remains unaltered if your function fails.

2/ Saving the old seed does not work when it is unset, as it is by 
default in a new session, so you need to do something that ensures it is 
set.

3/ You may need to save and reset not only the seed but also the RNG 
kind and the normal.kind, and possibly the kind for some other 
distributions. (setRNG does not handle other distributions.) It looks 
like you now also need to save and reset sample.kind.

4/ You should add the capability to pass all settings to your functions 
so that you can reproduce things when you want.

5/ I have found it useful to always pass back the settings in objects 
returned by functions like simulations. That way you always have a 
record when you discover something you want to reproduce.

6/ If parallel computing is considered then for reproducibility you need 
to save the number of nodes in the cluster. (I think this point is not 
as widely known as it should be.)
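
Points 0 to 5 can be sketched roughly as follows. This is only an 
illustration under stated assumptions, not code from setRNG: the name 
sim_with_rng() and its arguments are hypothetical, R >= 3.6.0 is assumed 
(so that RNGkind() reports three settings), and it relies on 
.Random.seed carrying the kind settings, as Martin's reply describes.

```r
## Hypothetical helper for illustration; not part of setRNG or base R.
## Assumes R >= 3.6.0, where RNGkind() returns kind, normal.kind and
## sample.kind (in that order).
sim_with_rng <- function(n, seed = NULL, kind = NULL) {
    ## Point 2: .Random.seed does not exist in a fresh session;
    ## a single draw creates it.
    if (!exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE))
        runif(1)
    ## Point 0: no preset seed. If none is supplied, draw one from the
    ## session's stream so that repeated calls still differ.
    if (is.null(seed))
        seed <- sample.int(.Machine$integer.max, 1L)
    ## Points 3 and 4: all settings can be passed in; default to the
    ## settings currently in use.
    if (is.null(kind))
        kind <- RNGkind()
    ## Point 1: restore the caller's RNG state on exit, including when
    ## the function fails or is interrupted.
    old.seed <- get(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
    on.exit(assign(".Random.seed", old.seed, envir = .GlobalEnv))
    ## sample.kind = "Rounding" warns in R >= 3.6.0, hence the suppression.
    suppressWarnings(RNGkind(kind[1], kind[2], kind[3]))
    set.seed(seed)
    ## Point 5: return the settings together with the result.
    list(value = rnorm(n), rng = list(seed = seed, kind = kind))
}

## Usage: rerun an earlier result from the settings it recorded.
r1 <- sim_with_rng(3)
r2 <- sim_with_rng(3, seed = r1$rng$seed, kind = r1$rng$kind)
identical(r1$value, r2$value)
```

When no seed is supplied, the state is saved after the seed has been 
drawn, so the session's stream still advances by that one draw and the 
next unseeded call gives a different result.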

No doubt I have forgotten a few things.

Paul Gilbert

On 4/17/19 6:00 AM, r-help-request using r-project.org wrote:
 > Date: Tue, 16 Apr 2019 19:22:34 +0200
 > From: Martin Maechler<maechler using stat.math.ethz.ch>
 > To: Elizabeth Purdom<epurdom using stat.berkeley.edu>
 > Cc: Bert Gunter<bgunter.4567 using gmail.com>, R-help
 >     <r-help using r-project.org>
 > Subject: Re: [R] Limiting the scope of RNGkind/set.seed
 > Message-ID:<23734.3930.10744.126501 using stat.math.ethz.ch>
 > Content-Type: text/plain; charset="utf-8"
 >
 >>>>>> Elizabeth Purdom
 >>>>>>      on Tue, 16 Apr 2019 09:45:45 -0700 writes:
 >      > Hi Bert, Thanks for your response. What you suggest is
 >      > more or less the fix I suggested in my email (my second
 >      > version of .rcolors). I am writing more because I was
 >      > wondering if there was a better way to work with RNG that
 >      > would avoid doing that. It doesn’t feel very friendly for
 >      > my package to be making changes to the user’s global
 >      > environment, even though I am setting them back (and if it
 >      > weren’t for the fact that setting the new R 3.6 argument
 >      > `sample.kind="Rounding"` creates a warning, I wouldn’t
 >      > have even realized I was affecting the user’s settings, so
 >      > it seems potentially hazardous that packages could be
 >      > changing users' settings without them being aware of
 >      > it). So I was wondering if there was a way to more fully
 >      > isolate the command.  Thanks, Elizabeth
 >
 > Hi Elizabeth,
 >
 > there's actually something better -- I think -- that you can do:
 >
 > You store .Random.seed  before doing an RNGkind() & set.seed()
 > setting, do all that, and make sure that .Random.seed is
 > restored when leaving your function.
 >
 > This works because the (typically quite long) .Random.seed
 > stores the full state of the RNG, i.e., all RNGkind() settings
 > *and*  the result of set.seed() , calling r<foo>(n, ..)  etc.
 >
 > If you additionally use  on.exit()  instead of manually resetting
 > things, you have the additional advantage that things are also
 > reset when your function ends because the user interrupts its
 > computations, or an error happens, etc.
 >
 > So, your function would more elegantly (and robustly!)  look like
 >
 > .rcolors <- function(seed = 23589) {
 >      if(!exists(".Random.seed", envir = .GlobalEnv)) {
 >          message("calling runif(1)"); runif(1) }
 >      old.R.s <- .Random.seed
 >      ## will reset everything on exiting this function:
 >      on.exit(assign(".Random.seed", old.R.s, envir=.GlobalEnv))
 >      ## set seed for sample() "back compatibly":
 >      suppressWarnings(RNGversion("3.5.0"))
 >      set.seed(seed)
 >      ## return random permutation of "my colors"
 >      sample(colors()[-c(152:361)])
 > }
 >
 > BTW, you can look at  simulate() methods in standard R, e.g.,
 >
 >    stats:::simulate.lm
 >
 > to see the same method used [optionally, with slightly more
 > sophistication]
 >
 >
 > Best,
 > Martin
 >
 > Martin Mächler
 > ETH Zurich, Switzerland


