[Rd] SUGGESTION: Environment variable R_MAX_MC_CORES for maximum number of cores

Henrik Bengtsson hb at biostat.ucsf.edu
Mon Nov 11 13:31:38 CET 2013


I would like to propose a unified/standard system environment variable
that specifies the maximum number of cores an R session should use,
e.g. R_MAX_MC_CORES.  This could then be used to *guide* multicore
implementations on the number of cores to use.  This is different from
parallel::detectCores(), which reports how many cores the machine has,
not how many of them the session is allowed to use.

ENVIRONMENT VARIABLE:
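library(parallel)
# 'fib' stands in for any user-defined function
fib <- function(n) if (n < 2L) n else fib(n - 1L) + fib(n - 2L)
# Fall back to a single core when R_MAX_MC_CORES is not set
mc.cores <- as.integer(Sys.getenv("R_MAX_MC_CORES", "1"))
res <- mclapply(1:10, FUN=fib, mc.cores=mc.cores)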

R OPTION:
Analogously to several other env.var./options, R_MAX_MC_CORES could
set an option on startup for convenience, e.g.
options(max.mc.cores=as.integer(Sys.getenv("R_MAX_MC_CORES", 1L)))
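
Until something like this is built in, the same effect could be had
from a startup file.  A minimal sketch, assuming it is placed in
~/.Rprofile or Rprofile.site; setMaxCoresOption() is a hypothetical
name, and 'max.mc.cores' is the option proposed here, not one that R
currently defines:

```r
# Hypothetical helper: set the proposed 'max.mc.cores' option from
# R_MAX_MC_CORES, but only when the variable holds a positive integer.
# Returns (invisibly) TRUE if the option was set, FALSE otherwise.
setMaxCoresOption <- function() {
  value <- Sys.getenv("R_MAX_MC_CORES", NA_character_)
  if (is.na(value)) return(invisible(FALSE))      # variable not set
  n <- suppressWarnings(as.integer(value))
  if (is.na(n) || n < 1L) return(invisible(FALSE)) # malformed value
  options(max.mc.cores = n)
  invisible(TRUE)
}

setMaxCoresOption()
```

Leaving the option untouched when the variable is unset or malformed
means code can still rely on getOption("max.mc.cores", 1L) for the
fallback.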

R COMMAND-LINE OPTION:
One could also imagine a command-line option for R/Rscript that sets this, e.g.
Rscript --max.mc.cores=3 batch.R
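
R/Rscript has no such flag today.  As a rough emulation, a script could
parse the same syntax from its trailing arguments, e.g. when called as
'Rscript batch.R --max.mc.cores=3'.  This is only a sketch;
parseMaxCores() is a hypothetical name and 'max.mc.cores' is the
proposed option:

```r
# Hypothetical helper: extract N from an argument of the form
# --max.mc.cores=N, where N is a positive integer.
parseMaxCores <- function(args = commandArgs(trailingOnly = TRUE),
                          default = 1L) {
  pattern <- "^--max\\.mc\\.cores=([1-9][0-9]*)$"
  hits <- grep(pattern, args, value = TRUE)
  # No well-formed flag found: use the default
  if (length(hits) == 0L) return(default)
  # If the flag is given more than once, the last occurrence wins
  as.integer(sub(pattern, "\\1", hits[length(hits)]))
}

options(max.mc.cores = parseMaxCores())
```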

EXAMPLE OF USAGE:
This would, for instance, simplify multicore processing on a PBS
cluster, where the PBS job script can be:

Rscript --max.mc.cores=$PBS_NUM_PPN batch.R

such that R and the 'batch.R' script do not have to be aware of
settings/variables specific to PBS (or whatever cluster system is
used).

Finally, getOption("max.mc.cores", 1L) could possibly also be the new
default for the 'mc.cores' argument in 'parallel' functions.
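
To illustrate that last point, a wrapper can already pick up such an
option as its default.  A minimal sketch, assuming the proposed option
name; mclapply2() is a hypothetical function, not part of 'parallel':

```r
library(parallel)

# Hypothetical wrapper: same as mclapply(), but 'mc.cores' defaults to
# the proposed 'max.mc.cores' option, falling back to 1 when unset.
mclapply2 <- function(X, FUN, ...,
                      mc.cores = getOption("max.mc.cores", 1L)) {
  mclapply(X, FUN, ..., mc.cores = mc.cores)
}

options(max.mc.cores = 1L)
res <- mclapply2(1:4, function(i) i^2)
```

Callers can still pass 'mc.cores' explicitly to override the option.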


/Henrik


