[Rd] SUGGESTION: Settings to disable forked processing in R, e.g. parallel::mclapply()
Henrik Bengtsson
henrik.bengtsson using gmail.com
Fri Jan 10 07:33:51 CET 2020
I'd like to pick up this thread started on 2019-04-11
(https://hypatia.math.ethz.ch/pipermail/r-devel/2019-April/077632.html).
Modulo all the other suggestions in this thread, would my proposal of
being able to disable forked processing via an option or an
environment variable make sense? I've prototyped a patch that
behaves as follows:
> options(fork.allowed = FALSE)
> unlist(parallel::mclapply(1:2, FUN = function(x) Sys.getpid()))
[1] 14058 14058
> parallel::mcmapply(1:2, FUN = function(x) Sys.getpid())
[1] 14058 14058
> parallel::pvec(1:2, FUN = function(x) Sys.getpid() + x/10)
[1] 14058.1 14058.2
> f <- parallel::mcparallel(Sys.getpid())
Error in allowFork(assert = TRUE) :
Forked processing is not allowed per option ‘fork.allowed’ or
environment variable ‘R_FORK_ALLOWED’
> cl <- parallel::makeForkCluster(1L)
Error in allowFork(assert = TRUE) :
Forked processing is not allowed per option ‘fork.allowed’ or
environment variable ‘R_FORK_ALLOWED’
>
The patch is:
Index: src/library/parallel/R/unix/forkCluster.R
===================================================================
--- src/library/parallel/R/unix/forkCluster.R (revision 77648)
+++ src/library/parallel/R/unix/forkCluster.R (working copy)
@@ -30,6 +30,7 @@
newForkNode <- function(..., options = defaultClusterOptions, rank)
{
+ allowFork(assert = TRUE)
options <- addClusterOptions(options, list(...))
outfile <- getClusterOption("outfile", options)
port <- getClusterOption("port", options)
Index: src/library/parallel/R/unix/mclapply.R
===================================================================
--- src/library/parallel/R/unix/mclapply.R (revision 77648)
+++ src/library/parallel/R/unix/mclapply.R (working copy)
@@ -28,7 +28,7 @@
stop("'mc.cores' must be >= 1")
.check_ncores(cores)
- if (isChild() && !isTRUE(mc.allow.recursive))
+ if (!allowFork() || (isChild() && !isTRUE(mc.allow.recursive)))
return(lapply(X = X, FUN = FUN, ...))
## Follow lapply
Index: src/library/parallel/R/unix/mcparallel.R
===================================================================
--- src/library/parallel/R/unix/mcparallel.R (revision 77648)
+++ src/library/parallel/R/unix/mcparallel.R (working copy)
@@ -20,6 +20,7 @@
mcparallel <- function(expr, name, mc.set.seed = TRUE, silent =
FALSE, mc.affinity = NULL, mc.interactive = FALSE, detached = FALSE)
{
+ allowFork(assert = TRUE)
f <- mcfork(detached)
env <- parent.frame()
if (isTRUE(mc.set.seed)) mc.advance.stream()
Index: src/library/parallel/R/unix/pvec.R
===================================================================
--- src/library/parallel/R/unix/pvec.R (revision 77648)
+++ src/library/parallel/R/unix/pvec.R (working copy)
@@ -25,7 +25,7 @@
cores <- as.integer(mc.cores)
if(cores < 1L) stop("'mc.cores' must be >= 1")
- if(cores == 1L) return(FUN(v, ...))
+ if(cores == 1L || !allowFork()) return(FUN(v, ...))
.check_ncores(cores)
if(mc.set.seed) mc.reset.stream()
with a new file src/library/parallel/R/unix/allowFork.R:
allowFork <- function(assert = FALSE) {
    value <- Sys.getenv("R_FORK_ALLOWED")
    if (nzchar(value)) {
        value <- switch(value,
            "1"=, "TRUE"=, "true"=, "True"=, "yes"=, "Yes"= TRUE,
            "0"=, "FALSE"=, "false"=, "False"=, "no"=, "No"= FALSE,
            stop(gettextf("invalid environment variable value: %s==%s",
                          "R_FORK_ALLOWED", value)))
        value <- as.logical(value)
    } else {
        value <- TRUE
    }
    value <- getOption("fork.allowed", value)
    if (is.na(value)) {
        stop(gettextf("invalid option value: %s==%s", "fork.allowed", value))
    }
    if (assert && !value) {
        stop(gettextf("Forked processing is not allowed per option %s or environment variable %s",
                      sQuote("fork.allowed"), sQuote("R_FORK_ALLOWED")))
    }
    value
}
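
To make the precedence between the two settings concrete, here is a minimal sketch that can be run in an unpatched R session. It assumes nothing beyond the proposal above: allowFork(), the option 'fork.allowed' and the environment variable 'R_FORK_ALLOWED' are all part of the prototype, not of any released version of 'parallel', so the block carries a condensed standalone copy of the function (with a shortened error message). The names default_value, env_only, option_wins and asserted are made up for the illustration.

```r
## Condensed standalone copy of the *proposed* allowFork(); this is an
## assumption for illustration and not part of any released 'parallel'.
allowFork <- function(assert = FALSE) {
    value <- Sys.getenv("R_FORK_ALLOWED")
    if (nzchar(value)) {
        value <- switch(value,
            "1"=, "TRUE"=, "true"=, "True"=, "yes"=, "Yes"= TRUE,
            "0"=, "FALSE"=, "false"=, "False"=, "no"=, "No"= FALSE,
            stop(gettextf("invalid environment variable value: %s==%s",
                          "R_FORK_ALLOWED", value)))
    } else {
        value <- TRUE
    }
    ## The option, when set, takes precedence over the environment variable
    value <- getOption("fork.allowed", value)
    if (is.na(value))
        stop(gettextf("invalid option value: %s==%s", "fork.allowed", value))
    if (assert && !value)
        stop("Forked processing is not allowed")
    value
}

## Remember the current settings so they can be restored afterwards
old_env <- Sys.getenv("R_FORK_ALLOWED", unset = NA)
old_opt <- options(fork.allowed = NULL)

## Neither setting present: forking is allowed by default
Sys.unsetenv("R_FORK_ALLOWED")
default_value <- allowFork()

## Environment variable alone disables forking
Sys.setenv(R_FORK_ALLOWED = "false")
env_only <- allowFork()

## The option overrides the environment variable
options(fork.allowed = TRUE)
option_wins <- allowFork()

## With assert = TRUE a disabled setting becomes an error
options(fork.allowed = FALSE)
asserted <- tryCatch(allowFork(assert = TRUE), error = function(e) "error")

## Restore the previous settings
options(old_opt)
if (is.na(old_env)) Sys.unsetenv("R_FORK_ALLOWED") else
    Sys.setenv(R_FORK_ALLOWED = old_env)
```

The idea behind this ordering is that a sysadmin or a job script could set the environment variable once for a whole session, while code running inside R could still adjust the option explicitly.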
/Henrik
On Mon, Apr 15, 2019 at 3:12 AM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>
> On 4/15/19 11:02 AM, Iñaki Ucar wrote:
> > On Mon, 15 Apr 2019 at 08:44, Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
> >> On 4/13/19 12:05 PM, Iñaki Ucar wrote:
> >>> On Sat, 13 Apr 2019 at 03:51, Kevin Ushey <kevinushey using gmail.com> wrote:
> >>>> I think it's worth saying that mclapply() works as documented
> >>> Mostly, yes. But it says nothing about fork's copy-on-write and memory
> >>> overcommitment, and that this means that it may work nicely or fail
> >>> spectacularly depending on whether, e.g., you operate on a long
> >>> vector.
> >> R cannot possibly replicate documentation of the underlying operating
> >> systems. It clearly says that fork() is used and readers who may not
> >> know what fork() is need to learn it from external sources.
> >> Copy-on-write is an elementary property of fork().
> > Just to be precise, copy-on-write is an optimization widely deployed
> > in most modern *nixes, particularly for the architectures in which R
> > usually runs. But it is not an elementary property; it is not even
> > possible without an MMU.
>
> Yes, old Unix systems without virtual memory had fork eagerly copying.
> Not relevant today, and certainly not for systems that run R, but indeed
> people interested in OS internals can look elsewhere for more precise
> information.
>
> Tomas
>