[R-pkg-devel] Save and restoring random number seed in a package function
Henrik Bengtsson
henr|k@bengt@@on @end|ng |rom gm@||@com
Wed Sep 14 20:16:40 CEST 2022
Great to see you're handling the corner case when .Random.seed is not
set. Note that your current implementation of boot.array() still
risks leaving .Random.seed modified in case an error or an interrupt
occurs while evaluating boot.array(). If that happens, one certainly
can argue that it modifies the .GlobalEnv environment. A more robust
solution is:
boot.array <- function(boot.out) {
genv <- globalenv()
# Make sure to leave '.Random.seed' as-is on exit
old_seed <- genv$.Random.seed
on.exit(suspendInterrupts({
if (is.null(old_seed)) {
rm(".Random.seed", envir = genv, inherits = FALSE)
} else {
assign(".Random.seed", value = old_seed, envir = genv, inherits = FALSE)
}
}))
# Assign saved seed from boot.out
assign(".Random.seed", value = boot.out$seed, envir = genv)
# Generate same random numbers from boot() call
}
This implementation leaves .GlobalEnv unmodified when the function
exits, regardless of how.
If CRAN would not allow temporarily modifying the .Random.seed this
way, there would not be possible to distribute a lot of statistical
algorithm on CRAN. I can't speak for CRAN, but I would assume they
accept this type of approach, since it should have no side effects.
FWIW, this strategy is used by several CRAN packages for generating
random numbers in "stealth mode" while R's RNG, e.g. 'future.apply'
(mine) and 'withr'. If you have a GitHub account, you find hundres of
other CRAN examples by searching "org:cran assign .Random.seed". So,
you're not alone in this need. As a fallback, you could use
`withr::with_seed()` which uses the above approach. That would put the
burden on the 'withr' maintainer to argue for this approach.
Hope this helps,
Henrik
On Wed, Sep 14, 2022 at 8:11 AM Noah Greifer <noah.greifer using gmail.com> wrote:
>
> Yes, set.seed() cannot accept .Random.Seed as an input; it can only take a
> single integer. As said in this answer
> <https://stackoverflow.com/a/13997608/6348551>, there is a one-way
> relationship between set.seed() and .Random.Seed. My understanding is that
> the recommended way to restore the seed is to assign the saved seed to
> .Random.Seed in the global environment, though this is the method that is
> not allowed by the CRAN policy. Unfortunately saving it in the environment
> of the inner function is not sufficient.
>
> One potential inconsistency with CRAN's policy is that generating a random
> number itself changes the global environment by changing the value of
> .Random.Seed. The boot.array() code just does it manually using assign().
> Indeed, the boot.array() code does less damage to the global environment in
> that it resets the seed to what it would have been had boot.array() not
> been run.
>
> Noah
>
> On Wed, Sep 14, 2022 at 10:39 AM James Pustejovsky <jepusto using gmail.com>
> wrote:
>
> > I'm interested in this question too. Noah, is there a reason you are using
> > assign(".Random.seed",...) rather than set.seed()?
> >
> > On Wed, Sep 14, 2022 at 9:31 AM Noah Greifer <noah.greifer using gmail.com>
> > wrote:
> >
> >> Hello fellow developers,
> >>
> >> I am attempting to solve the problem of saving the state of the random
> >> generator so that the state can be recovered in a future call.
> >> Essentially,
> >> my function generates random numbers, performs an operation on them
> >> (saving
> >> the result), and throws them out (saving them would require too much
> >> memory). A second function is meant to take the output of the first
> >> function, generate the same random numbers, and perform a different
> >> operation on them.
> >>
> >> This is exactly what happens in the *boot* package: the boot() function
> >> saves the random seed (extracted from .Random.Seed), and the boot.array()
> >> function extracts the saved seed from the boot() output, sets the seed to
> >> that value, re-generates the same set of random numbers, and then
> >> re-sets the seed to what it was before boot.array() was called. This has
> >> the following benefits: 1) it allows the same random numbers to be drawn;
> >> 2) the random numbers don't need to be saved, which is good because they
> >> would take up a lot of memory and boot.array() is an optional function (it
> >> is used in boot.ci() with type = "bca" for those curious); and 3) the
> >> seed
> >> carries on from where it left off before boot.array() was called instead
> >> of
> >> being set to what it was after boot() was called.
> >>
> >> This is implemented in boot in the following way (code abbreviated):
> >>
> >> boot <- function(...) {
> >> seed <- .Random.Seed
> >> #Random numbers generated
> >> out <- list(seed = seed
> >> #Other stuff is in this list
> >> )
> >> out
> >> }
> >>
> >> boot.array <- function(boot.out) {
> >> #Save current random seed in `temp`
> >> if (exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE))
> >> temp <- get(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
> >> else temp <- NULL
> >>
> >> #Assign saved seed from boot.out
> >> assign(".Random.seed", boot.out$seed, envir = .GlobalEnv)
> >>
> >> #Generate same random numbers from boot() call
> >>
> >> #Restore random seed to what it was before boot.array() call
> >> if (!is.null(temp))
> >> assign(".Random.seed", temp, envir = .GlobalEnv)
> >> else rm(.Random.seed, pos = 1)
> >> }
> >>
> >> This seems to work as intended. However, this violates the CRAN policy of
> >> changing the global environment. When I used this exact code in a package
> >> I
> >> submitted, the package was rejected for it. The message I received was
> >>
> >> > Please do not modify the .GlobalEnv (e.g.: by changing the .Random.seed
> >> > which is part of the global environment). This is not allowed by the
> >> CRAN
> >> > policies.
> >>
> >>
> >> I'm curious what you think the best course of action might be, and what
> >> the
> >> current policy means for the *boot* package. Thanks for your help.
> >>
> >> Noah
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-package-devel using r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >>
> >
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-package-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
More information about the R-package-devel
mailing list