[R-pkg-devel] Save and restoring random number seed in a package function

Noah Greifer no@h@gre||er @end|ng |rom gm@||@com
Wed Sep 14 22:31:24 CEST 2022


Hi Henrik,

Thank you so much, this is extremely helpful! Glad to know my approach is
not unreasonable. Hopefully CRAN will be okay with these modifications.

Noah

On Wed, Sep 14, 2022 at 2:16 PM Henrik Bengtsson <henrik.bengtsson using gmail.com>
wrote:

> Great to see you're handling the corner case when .Random.seed is not
> set.  Note that your current implementation of boot.array() still
> risks leaving .Random.seed modified in case an error or an interrupt
> occurs while evaluating boot.array().  If that happens, one certainly
> can argue that it modifies the .GlobalEnv environment.  A more robust
> solution is:
>
> boot.array <- function(boot.out) {
>   genv <- globalenv()
>
>   # Make sure to leave '.Random.seed' as-is on exit
>   old_seed <- genv$.Random.seed
>   on.exit(suspendInterrupts({
>     if (is.null(old_seed)) {
>       rm(".Random.seed", envir = genv, inherits = FALSE)
>     } else {
>       assign(".Random.seed", value = old_seed, envir = genv, inherits =
> FALSE)
>     }
>   }))
>
>   # Assign saved seed from boot.out
>   assign(".Random.seed", value = boot.out$seed, envir = genv)
>
>   # Generate same random numbers from boot() call
> }
>
> This implementation leaves .GlobalEnv unmodified when the function
> exits, regardless of how.
>
> If CRAN would not allow temporarily modifying the .Random.seed this
> way, there would not be possible to distribute a lot of statistical
> algorithm on CRAN. I can't speak for CRAN, but I would assume they
> accept this type of approach, since it should have no side effects.
> FWIW, this strategy is used by several CRAN packages for generating
> random numbers in "stealth mode" while R's RNG, e.g. 'future.apply'
> (mine) and 'withr'. If you have a GitHub account, you find hundres of
> other CRAN examples by searching "org:cran assign .Random.seed".  So,
> you're not alone in this need.  As a fallback, you could use
> `withr::with_seed()` which uses the above approach. That would put the
> burden on the 'withr' maintainer to argue for this approach.
>
> Hope this helps,
>
> Henrik
>
>
> On Wed, Sep 14, 2022 at 8:11 AM Noah Greifer <noah.greifer using gmail.com>
> wrote:
> >
> > Yes, set.seed() cannot accept .Random.Seed as an input; it can only take
> a
> > single integer. As said in this answer
> > <https://stackoverflow.com/a/13997608/6348551>, there is a one-way
> > relationship between set.seed() and .Random.Seed. My understanding is
> that
> > the recommended way to restore the seed is to assign the saved seed to
> > .Random.Seed in the global environment, though this is the method that is
> > not allowed by the CRAN policy. Unfortunately saving it in the
> environment
> > of the inner function is not sufficient.
> >
> > One potential inconsistency with CRAN's policy is that generating a
> random
> > number itself changes the global environment by changing the value of
> > .Random.Seed. The boot.array() code just does it manually using assign().
> > Indeed, the boot.array() code does less damage to the global environment
> in
> > that it resets the seed to what it would have been had boot.array() not
> > been run.
> >
> > Noah
> >
> > On Wed, Sep 14, 2022 at 10:39 AM James Pustejovsky <jepusto using gmail.com>
> > wrote:
> >
> > > I'm interested in this question too. Noah, is there a reason you are
> using
> > > assign(".Random.seed",...) rather than set.seed()?
> > >
> > > On Wed, Sep 14, 2022 at 9:31 AM Noah Greifer <noah.greifer using gmail.com>
> > > wrote:
> > >
> > >> Hello fellow developers,
> > >>
> > >> I am attempting to solve the problem of saving the state of the random
> > >> generator so that the state can be recovered in a future call.
> > >> Essentially,
> > >> my function generates random numbers, performs an operation on them
> > >> (saving
> > >> the result), and throws them out (saving them would require too much
> > >> memory). A second function is meant to take the output of the first
> > >> function, generate the same random numbers, and perform a different
> > >> operation on them.
> > >>
> > >> This is exactly what happens in the *boot* package: the boot()
> function
> > >> saves the random seed (extracted from .Random.Seed), and the
> boot.array()
> > >> function extracts the saved seed from the boot() output, sets the
> seed to
> > >> that value, re-generates the same set of random numbers, and then
> > >> re-sets the seed to what it was before boot.array() was called. This
> has
> > >> the following benefits: 1) it allows the same random numbers to be
> drawn;
> > >> 2) the random numbers don't need to be saved, which is good because
> they
> > >> would take up a lot of memory and boot.array() is an optional
> function (it
> > >> is used in boot.ci() with type = "bca" for those curious); and 3) the
> > >> seed
> > >> carries on from where it left off before boot.array() was called
> instead
> > >> of
> > >> being set to what it was after boot() was called.
> > >>
> > >> This is implemented in boot in the following way (code abbreviated):
> > >>
> > >> boot <- function(...) {
> > >>   seed <- .Random.Seed
> > >>   #Random numbers generated
> > >>   out <- list(seed = seed
> > >>                   #Other stuff is in this list
> > >>               )
> > >>   out
> > >> }
> > >>
> > >> boot.array <- function(boot.out) {
> > >>   #Save current random seed in `temp`
> > >>   if (exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE))
> > >>     temp <- get(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
> > >>   else temp <- NULL
> > >>
> > >>   #Assign saved seed from boot.out
> > >>   assign(".Random.seed", boot.out$seed, envir = .GlobalEnv)
> > >>
> > >>   #Generate same random numbers from boot() call
> > >>
> > >>   #Restore random seed to what it was before boot.array() call
> > >>   if (!is.null(temp))
> > >>     assign(".Random.seed", temp, envir = .GlobalEnv)
> > >>   else rm(.Random.seed, pos = 1)
> > >> }
> > >>
> > >> This seems to work as intended. However, this violates the CRAN
> policy of
> > >> changing the global environment. When I used this exact code in a
> package
> > >> I
> > >> submitted, the package was rejected for it. The message I received was
> > >>
> > >> > Please do not modify the .GlobalEnv (e.g.: by changing the
> .Random.seed
> > >> > which is part of the global environment). This is not allowed by the
> > >> CRAN
> > >> > policies.
> > >>
> > >>
> > >> I'm curious what you think the best course of action might be, and
> what
> > >> the
> > >> current policy means for the *boot* package. Thanks for your help.
> > >>
> > >> Noah
> > >>
> > >>         [[alternative HTML version deleted]]
> > >>
> > >> ______________________________________________
> > >> R-package-devel using r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> > >>
> > >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-package-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>

	[[alternative HTML version deleted]]



More information about the R-package-devel mailing list