[Rd] portableParallelSeeds Package violation, CRAN exception?
Paul Johnson
pauljohn32 at gmail.com
Wed Aug 6 20:10:32 CEST 2014
I'm writing to ask for a policy exception, or for advice on how to make
this package acceptable to CRAN.
http://rweb.quant.ku.edu/kran/src/contrib/portableParallelSeeds_0.9.tar.gz
Yesterday I tried to submit a package to CRAN, and Dr Ripley pointed
out that I had not understood the instructions about packages. Here's
the part where R CMD check gives a NOTE:
* checking R code for possible problems ... NOTE
Found the following assignments to the global environment:
File ‘portableParallelSeeds/R/initPortableStreams.R’:
assign("currentStream", n, envir = .GlobalEnv)
assign("currentStates", curStates, envir = .GlobalEnv)
assign("currentStream", 1L, envir = .GlobalEnv)
assign("startStates", runSeeds, envir = .GlobalEnv)
assign("currentStates", runSeeds, envir = .GlobalEnv)
assign("currentStream", as.integer(currentStream), envir = .GlobalEnv)
assign("startStates", runSeeds, envir = .GlobalEnv)
assign("currentStates", runSeeds, envir = .GlobalEnv)
Altering the user's environment requires a special arrangement with
CRAN. I believe this is justified, and I'll sketch the reasons now.
Mostly, though, I'm at your mercy, and if there is any way to make this
possible, I would be very grateful.
To control and replace random number streams, it really is necessary to
alter the workspace, because that is where the random generator state is
stored. This is acknowledged in Robert Gentleman's book, R Programming
for Bioinformatics: "The decision to have these [random generator]
functions manipulate a global variable, .Random.seed, is slightly
unfortunate as it makes it somewhat more difficult to manage several
different random number streams simultaneously" (Gentleman, 2009, p.
201).
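To make the point concrete, here is a minimal base-R sketch (an
illustration only, not code from portableParallelSeeds) showing that
saving and restoring the generator state means reading and writing
.Random.seed in the global environment. The variable names saved, x1,
and x2 are made up for this example.

```r
# Illustration only: base R, not portableParallelSeeds code.
set.seed(12345)                                   # creates .Random.seed in .GlobalEnv
saved <- get(".Random.seed", envir = .GlobalEnv)  # snapshot of the generator state

x1 <- runif(3)                                    # drawing advances the global state

# Restoring the snapshot is itself an assignment into the global environment.
assign(".Random.seed", saved, envir = .GlobalEnv)
x2 <- runif(3)                                    # replays the same three draws

identical(x1, x2)
```

Because the generator reads its state from .Random.seed in the global
environment, any wrapper that switches or replays streams has to write
there at some point.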
I have developed an understandable set of wrapper functions that handle this.
Some of you may recall this project; I've asked about it here a couple
of times. It allows separate streams of random numbers for different
purposes within a single R run. There is a framework to save thousands
of those sets in a file, so it can be used on a cluster or on a single
workstation. This is handy because, when 1 run in 10,000 on the
cluster exhibits some weird behavior, we can easily re-initiate that
run interactively and see what's going on.
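The general idea of several replayable streams within one run can be
sketched with base R and the parallel package. This is a hedged
illustration of the technique, not the package's actual API: the names
streamStates, startStates, and drawFrom are hypothetical, and the use
of the L'Ecuyer-CMRG generator with nextRNGStream() is an assumption
made for the example.

```r
# Illustration only: base R plus 'parallel'; not the portableParallelSeeds API.
library(parallel)

RNGkind("L'Ecuyer-CMRG")   # generator with well-separated streams
set.seed(2014)

# Build one starting state per stream (all names here are hypothetical).
nStreams <- 3
streamStates <- vector("list", nStreams)
streamStates[[1]] <- get(".Random.seed", envir = .GlobalEnv)
for (i in 2:nStreams) {
    streamStates[[i]] <- nextRNGStream(streamStates[[i - 1]])
}

# Draw n uniforms from one stream, then record its advanced state
# so that the stream can be resumed later.
drawFrom <- function(states, stream, n) {
    assign(".Random.seed", states[[stream]], envir = .GlobalEnv)
    x <- runif(n)
    states[[stream]] <- get(".Random.seed", envir = .GlobalEnv)
    list(draws = x, states = states)
}

startStates <- streamStates           # kept unchanged, for replaying a run
r1  <- drawFrom(streamStates, 1, 2)   # two draws from stream 1
r2  <- drawFrom(r1$states, 2, 2)      # stream 2 is unaffected by stream 1
r1b <- drawFrom(startStates, 1, 2)    # replay stream 1 from its saved start

identical(r1$draws, r1b$draws)
```

A list like startStates could be written to disk with saveRDS() and
read back on another machine, which is the kind of replay described
above for a single odd run.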
I have a vignette, "pps", that explains the details. I put a copy of it
here in case you don't want to download the package:
http://pj.freefaculty.org/scraps/pps.pdf
While working on that, I gained a considerably deeper understanding of
random generators and seeds. That is what this vignette is about:
http://pj.freefaculty.org/scraps/PRNG-basics.pdf
We've been running simulations on our cluster with the
portableParallelSeeds framework for 2 years, and we've never had any
trouble. We are able to re-start runs and verify random number draws in
separate streams.
PJ
--
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504         Center for Research Methods
University of Kansas              University of Kansas
http://pj.freefaculty.org         http://quant.ku.edu