If I understand the thread correctly, this is an RStudio issue, and I would 
suggest that the RStudio developers consider using pthread_atfork() so RStudio 
can handle forking as they see fit (bail out with an error or make RStudio work).  
Note that, in principle, the functionality requested here can easily be 
implemented in a package, so R doesn’t need to be modified.
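
To make the package route concrete, here is a minimal sketch of what such a wrapper might look like. It is only an illustration: fork_allowed() and safe_mclapply() are hypothetical names that do not exist in the parallel package, and the option name "fork.allowed" is borrowed from the proposal quoted below rather than being something R itself recognises.

```r
# Hypothetical package-level helpers; neither function is part of 'parallel'.

# TRUE unless the user has set options(fork.allowed = FALSE).
# The option name is an assumption taken from the proposal in this thread.
fork_allowed <- function() {
    isTRUE(getOption("fork.allowed", TRUE))
}

# Drop-in wrapper: fork via mclapply() when permitted (and on a Unix-alike),
# otherwise fall back to a sequential lapply() in the current process.
safe_mclapply <- function(X, FUN, ..., mc.cores = getOption("mc.cores", 2L)) {
    if (.Platform$OS.type == "unix" && fork_allowed()) {
        parallel::mclapply(X, FUN, ..., mc.cores = mc.cores)
    } else {
        # mc.cores is deliberately not forwarded to FUN here.
        lapply(X, FUN, ...)
    }
}
```

With options(fork.allowed = FALSE) set, safe_mclapply(1:2, function(x) Sys.getpid()) runs in the calling process, so both elements are the PID of the current R session. Code that calls parallel::mclapply() directly is not affected by a wrapper like this, which is the main limitation compared with a change inside R.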

Cheers,
Simon

>> On Jan 10, 2020, at 04:34, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
>> 
>> On 1/10/20 7:33 AM, Henrik Bengtsson wrote:
>> I'd like to pick up this thread started on 2019-04-11
>> (https://hypatia.math.ethz.ch/pipermail/r-devel/2019-April/077632.html).
>> Modulo all the other suggestions in this thread, would my proposal of
>> being able to disable forked processing via an option or an
>> environment variable make sense?
> 
> I don't think R should be doing that. There are caveats with using fork, and 
> they are mentioned in the documentation of the parallel package, so people 
> can easily avoid functions that use it, and this all has been discussed here 
> recently.
> 
> If that is the case, we can expand the documentation in the parallel package 
> and add a warning against the use of forking with RStudio, but for that it 
> would be good to know at least why it is not working. From the GitHub issue I 
> have the impression that it is not really known why, whether it could be 
> fixed, and if so, where. The same GitHub issue also shows that some people 
> want to use forking for performance reasons, even with RStudio, at least on Linux. 
> Perhaps it could be fixed? Perhaps it is just some race condition somewhere?
> 
> Tomas
> 
>> I've prototyped a working patch that
>> works like:
>>> options(fork.allowed = FALSE)
>>> unlist(parallel::mclapply(1:2, FUN = function(x) Sys.getpid()))
>> [1] 14058 14058
>>> parallel::mcmapply(1:2, FUN = function(x) Sys.getpid())
>> [1] 14058 14058
>>> parallel::pvec(1:2, FUN = function(x) Sys.getpid() + x/10)
>> [1] 14058.1 14058.2
>>> f <- parallel::mcparallel(Sys.getpid())
>> Error in allowFork(assert = TRUE) :
>>  Forked processing is not allowed per option ‘fork.allowed’ or
>> environment variable ‘R_FORK_ALLOWED’
>>> cl <- parallel::makeForkCluster(1L)
>> Error in allowFork(assert = TRUE) :
>>  Forked processing is not allowed per option ‘fork.allowed’ or
>> environment variable ‘R_FORK_ALLOWED’
>> The patch is:
>> Index: src/library/parallel/R/unix/forkCluster.R
>> ===================================================================
>> --- src/library/parallel/R/unix/forkCluster.R (revision 77648)
>> +++ src/library/parallel/R/unix/forkCluster.R (working copy)
>> @@ -30,6 +30,7 @@
>> newForkNode <- function(..., options = defaultClusterOptions, rank)
>> {
>> +    allowFork(assert = TRUE)
>>     options <- addClusterOptions(options, list(...))
>>     outfile <- getClusterOption("outfile", options)
>>     port <- getClusterOption("port", options)
>> Index: src/library/parallel/R/unix/mclapply.R
>> ===================================================================
>> --- src/library/parallel/R/unix/mclapply.R (revision 77648)
>> +++ src/library/parallel/R/unix/mclapply.R (working copy)
>> @@ -28,7 +28,7 @@
>>         stop("'mc.cores' must be >= 1")
>>     .check_ncores(cores)
>> -    if (isChild() && !isTRUE(mc.allow.recursive))
>> +    if (!allowFork() || (isChild() && !isTRUE(mc.allow.recursive)))
>>         return(lapply(X = X, FUN = FUN, ...))
>>     ## Follow lapply
>> Index: src/library/parallel/R/unix/mcparallel.R
>> ===================================================================
>> --- src/library/parallel/R/unix/mcparallel.R (revision 77648)
>> +++ src/library/parallel/R/unix/mcparallel.R (working copy)
>> @@ -20,6 +20,7 @@
>> mcparallel <- function(expr, name, mc.set.seed = TRUE, silent = FALSE, mc.affinity = NULL, mc.interactive = FALSE, detached = FALSE)
>> {
>> +    allowFork(assert = TRUE)
>>     f <- mcfork(detached)
>>     env <- parent.frame()
>>     if (isTRUE(mc.set.seed)) mc.advance.stream()
>> Index: src/library/parallel/R/unix/pvec.R
>> ===================================================================
>> --- src/library/parallel/R/unix/pvec.R (revision 77648)
>> +++ src/library/parallel/R/unix/pvec.R (working copy)
>> @@ -25,7 +25,7 @@
>>     cores <- as.integer(mc.cores)
>>     if(cores < 1L) stop("'mc.cores' must be >= 1")
>> -    if(cores == 1L) return(FUN(v, ...))
>> +    if(cores == 1L || !allowFork()) return(FUN(v, ...))
>>     .check_ncores(cores)
>>     if(mc.set.seed) mc.reset.stream()
>> with a new file src/library/parallel/R/unix/allowFork.R:
>> allowFork <- function(assert = FALSE) {
>>    value <- Sys.getenv("R_FORK_ALLOWED")
>>    if (nzchar(value)) {
>>        value <- switch(value,
>>            "1"=, "TRUE"=, "true"=, "True"=, "yes"=, "Yes"= TRUE,
>>            "0"=, "FALSE"=, "false"=, "False"=, "no"=, "No"= FALSE,
>>            stop(gettextf("invalid environment variable value: %s==%s",
>>                          "R_FORK_ALLOWED", value)))
>>        value <- as.logical(value)
>>    } else {
>>        value <- TRUE
>>    }
>>    value <- getOption("fork.allowed", value)
>>    if (is.na(value)) {
>>        stop(gettextf("invalid option value: %s==%s", "fork.allowed", value))
>>    }
>>    if (assert && !value) {
>>        stop(gettextf("Forked processing is not allowed per option %s or environment variable %s",
>>                      sQuote("fork.allowed"), sQuote("R_FORK_ALLOWED")))
>>    }
>>    value
>> }
>> /Henrik
>>> On Mon, Apr 15, 2019 at 3:12 AM Tomas Kalibera <tomas.kalib...@gmail.com> 
>>> wrote:
>>> On 4/15/19 11:02 AM, Iñaki Ucar wrote:
>>>> On Mon, 15 Apr 2019 at 08:44, Tomas Kalibera <tomas.kalib...@gmail.com> 
>>>> wrote:
>>>>> On 4/13/19 12:05 PM, Iñaki Ucar wrote:
>>>>>> On Sat, 13 Apr 2019 at 03:51, Kevin Ushey <kevinus...@gmail.com> wrote:
>>>>>>> I think it's worth saying that mclapply() works as documented
>>>>>> Mostly, yes. But it says nothing about fork's copy-on-write and memory
>>>>>> overcommitment, and that this means that it may work nicely or fail
>>>>>> spectacularly depending on whether, e.g., you operate on a long
>>>>>> vector.
>>>>> R cannot possibly replicate documentation of the underlying operating
>>>>> systems. It clearly says that fork() is used and readers who may not
>>>>> know what fork() is need to learn it from external sources.
>>>>> Copy-on-write is an elementary property of fork().
>>>> Just to be precise, copy-on-write is an optimization widely deployed
>>>> in most modern *nixes, particularly for the architectures in which R
>>>> usually runs. But it is not an elementary property; it is not even
>>>> possible without an MMU.
>>> Yes, old Unix systems without virtual memory had fork eagerly copying.
>>> Not relevant today, and certainly not for systems that run R, but indeed
>>> people interested in OS internals can look elsewhere for more precise
>>> information.
>>> Tomas
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
