[Rd] SUGGESTION: Environment variable R_MAX_MC_CORES for maximum number of cores
Henrik Bengtsson
hb at biostat.ucsf.edu
Mon Nov 11 13:31:38 CET 2013
I'd like to propose a unified/standard system environment variable that
specifies the maximum number of cores an R session should use, e.g.
R_MAX_MC_CORES. This could then be used to *guide* multicore
implementations on the number of cores to use. This is different from
parallel::detectCores(), which reports how many cores the machine has,
not how many the session is allowed to use.
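ENVIRONMENT VARIABLE:
library(parallel)
# fib() stands in for any user function; defined here so the example runs
fib <- function(n) if (n < 2) n else fib(n - 1) + fib(n - 2)
mc.cores <- as.integer(Sys.getenv("R_MAX_MC_CORES", 1L))
res <- mclapply(1:10, FUN=fib, mc.cores=mc.cores)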
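A slightly more defensive way to read the variable might look like the sketch below. The helper name max_mc_cores() is hypothetical, and falling back to 1 core for unset or malformed values is an assumption, not part of the proposal itself.

```r
# Hypothetical helper: read R_MAX_MC_CORES, falling back to 'default'
# when the variable is unset, not a number, or less than 1.
max_mc_cores <- function(default = 1L) {
  value <- Sys.getenv("R_MAX_MC_CORES", unset = NA_character_)
  # as.integer() yields NA (with a warning) for non-numeric strings
  cores <- suppressWarnings(as.integer(value))
  if (is.na(cores) || cores < 1L) default else cores
}
```

A call such as mclapply(1:10, FUN=fib, mc.cores=max_mc_cores()) would then never receive NA or a non-positive value.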
R OPTION:
Analogously to several other environment variable/option pairs,
R_MAX_MC_CORES could set an option on startup for convenience, e.g.
options(max.mc.cores=as.integer(Sys.getenv("R_MAX_MC_CORES", 1L)))
R COMMAND-LINE OPTION:
One could also imagine a command-line option for R/Rscript that sets this, e.g.
Rscript --max.mc.cores=3 batch.R
EXAMPLE OF USAGE:
This would, for instance, simplify multicore processing on a PBS cluster,
where the PBS job script can be:
Rscript --max.mc.cores=$PBS_NUM_PPN batch.R
such that R and the 'batch.R' script do not have to be aware of
settings/variables specific to PBS (or whatever cluster system is
used).
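As a rough illustration of what such an option would do, a script can approximate it with a trailing argument placed after the script name, e.g. 'Rscript batch.R --max.mc.cores=$PBS_NUM_PPN'. This is only a sketch: the function name parse_max_mc_cores() is hypothetical, and placing the argument after the script name is an assumption made so that it is visible via commandArgs(trailingOnly = TRUE).

```r
# Hypothetical: extract N from a trailing "--max.mc.cores=N" argument.
# Returns 'default' when the argument is absent or not a positive integer.
parse_max_mc_cores <- function(args = commandArgs(trailingOnly = TRUE),
                               default = 1L) {
  pattern <- "^--max\\.mc\\.cores=([0-9]+)$"
  hits <- grep(pattern, args, value = TRUE)
  if (length(hits) == 0L) return(default)
  # Use the first match; very large values overflow to NA and fall back
  cores <- suppressWarnings(as.integer(sub(pattern, "\\1", hits[1L])))
  if (is.na(cores) || cores < 1L) default else cores
}

# Store the result in the proposed option for the rest of the script
options(max.mc.cores = parse_max_mc_cores())
```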
Finally, getOption("max.mc.cores", 1L) could possibly also be the new
default for the 'mc.cores' argument in 'parallel' functions.
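For comparison, mclapply() currently takes its default from getOption("mc.cores", 2L). The suggested default can be sketched with a thin wrapper; the name mclapply_opt() is hypothetical and only illustrates the idea.

```r
library(parallel)

# Hypothetical wrapper: same as mclapply(), except that 'mc.cores'
# defaults to the proposed 'max.mc.cores' option (1 core when unset).
mclapply_opt <- function(X, FUN, ...,
                         mc.cores = getOption("max.mc.cores", 1L)) {
  mclapply(X, FUN, ..., mc.cores = mc.cores)
}

# An explicit mc.cores argument still overrides the option
res <- mclapply_opt(1:10, function(i) i^2, mc.cores = 1L)
```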
/Henrik