I'd like to pick up this thread, started on 2019-04-11 (https://hypatia.math.ethz.ch/pipermail/r-devel/2019-April/077632.html). Modulo all the other suggestions in this thread, would my proposal of being able to disable forked processing via an option or an environment variable make sense? I've prototyped a patch that works like this:

> options(fork.allowed = FALSE)
> unlist(parallel::mclapply(1:2, FUN = function(x) Sys.getpid()))
[1] 14058 14058
> parallel::mcmapply(1:2, FUN = function(x) Sys.getpid())
[1] 14058 14058
> parallel::pvec(1:2, FUN = function(x) Sys.getpid() + x/10)
[1] 14058.1 14058.2
> f <- parallel::mcparallel(Sys.getpid())
Error in allowFork(assert = TRUE) :
  Forked processing is not allowed per option ‘fork.allowed’ or environment variable ‘R_FORK_ALLOWED’
> cl <- parallel::makeForkCluster(1L)
Error in allowFork(assert = TRUE) :
  Forked processing is not allowed per option ‘fork.allowed’ or environment variable ‘R_FORK_ALLOWED’
>

The patch is:

Index: src/library/parallel/R/unix/forkCluster.R
===================================================================
--- src/library/parallel/R/unix/forkCluster.R (revision 77648)
+++ src/library/parallel/R/unix/forkCluster.R (working copy)
@@ -30,6 +30,7 @@

 newForkNode <- function(..., options = defaultClusterOptions, rank)
 {
+    allowFork(assert = TRUE)
     options <- addClusterOptions(options, list(...))
     outfile <- getClusterOption("outfile", options)
     port <- getClusterOption("port", options)
Index: src/library/parallel/R/unix/mclapply.R
===================================================================
--- src/library/parallel/R/unix/mclapply.R (revision 77648)
+++ src/library/parallel/R/unix/mclapply.R (working copy)
@@ -28,7 +28,7 @@
         stop("'mc.cores' must be >= 1")
     .check_ncores(cores)

-    if (isChild() && !isTRUE(mc.allow.recursive))
+    if (!allowFork() || (isChild() && !isTRUE(mc.allow.recursive)))
         return(lapply(X = X, FUN = FUN, ...))

     ## Follow lapply
Index: src/library/parallel/R/unix/mcparallel.R
===================================================================
--- src/library/parallel/R/unix/mcparallel.R (revision 77648)
+++ src/library/parallel/R/unix/mcparallel.R (working copy)
@@ -20,6 +20,7 @@

 mcparallel <- function(expr, name, mc.set.seed = TRUE, silent = FALSE, mc.affinity = NULL, mc.interactive = FALSE, detached = FALSE)
 {
+    allowFork(assert = TRUE)
     f <- mcfork(detached)
     env <- parent.frame()
     if (isTRUE(mc.set.seed)) mc.advance.stream()
Index: src/library/parallel/R/unix/pvec.R
===================================================================
--- src/library/parallel/R/unix/pvec.R (revision 77648)
+++ src/library/parallel/R/unix/pvec.R (working copy)
@@ -25,7 +25,7 @@

     cores <- as.integer(mc.cores)
     if(cores < 1L) stop("'mc.cores' must be >= 1")
-    if(cores == 1L) return(FUN(v, ...))
+    if(cores == 1L || !allowFork()) return(FUN(v, ...))
     .check_ncores(cores)

     if(mc.set.seed) mc.reset.stream()

with a new file src/library/parallel/R/unix/allowFork.R:

allowFork <- function(assert = FALSE) {
    value <- Sys.getenv("R_FORK_ALLOWED")
    if (nzchar(value)) {
        value <- switch(value,
            "1"=, "TRUE"=, "true"=, "True"=, "yes"=, "Yes"= TRUE,
            "0"=, "FALSE"=,"false"=,"False"=, "no"=, "No" = FALSE,
            stop(gettextf("invalid environment variable value: %s==%s",
                          "R_FORK_ALLOWED", value)))
        value <- as.logical(value)
    } else {
        value <- TRUE
    }
    value <- getOption("fork.allowed", value)
    if (is.na(value)) {
        stop(gettextf("invalid option value: %s==%s", "fork.allowed", value))
    }
    if (assert && !value) {
        stop(gettextf("Forked processing is not allowed per option %s or environment variable %s",
                      sQuote("fork.allowed"), sQuote("R_FORK_ALLOWED")))
    }
    value
}

/Henrik

On Mon, Apr 15, 2019 at 3:12 AM Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
>
> On 4/15/19 11:02 AM, Iñaki Ucar wrote:
> > On Mon, 15 Apr 2019 at 08:44, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
> >> On 4/13/19 12:05 PM, Iñaki Ucar wrote:
> >>> On Sat, 13 Apr 2019 at 03:51, Kevin Ushey <kevinus...@gmail.com> wrote:
> >>>> I think it's worth saying that mclapply() works as documented
> >>> Mostly, yes. But it says nothing about fork's copy-on-write and memory
> >>> overcommitment, and that this means that it may work nicely or fail
> >>> spectacularly depending on whether, e.g., you operate on a long
> >>> vector.
> >> R cannot possibly replicate documentation of the underlying operating
> >> systems. It clearly says that fork() is used and readers who may not
> >> know what fork() is need to learn it from external sources.
> >> Copy-on-write is an elementary property of fork().
> > Just to be precise, copy-on-write is an optimization widely deployed
> > in most modern *nixes, particularly for the architectures in which R
> > usually runs. But it is not an elementary property; it is not even
> > possible without an MMU.
>
> Yes, old Unix systems without virtual memory had fork eagerly copying.
> Not relevant today, and certainly not for systems that run R, but indeed
> people interested in OS internals can look elsewhere for more precise
> information.
>
> Tomas
>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel