The attached diff addresses the following issues in mclapply:

mclapply coerces non-vectors and objects (S3 or S4) to lists, but a list may not be an efficient representation, and the coercion is not required if the object implements length, [, and [[ methods. (lapply must also work on the object, either through coercion to a list at the 'inner.do' level or through other means, e.g., promoting lapply to a generic and writing a method specialized for the object.) As written, someone wishing to use mclapply on an object not efficiently represented as a list would need to promote mclapply to a generic and then re-implement an mclapply method for their object, rather than re-using the existing code.
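
To make the interface concrete, here is a minimal sketch of such an object. The S3 class 'rows' and all of its methods are hypothetical, invented for illustration only: it wraps a matrix and treats each row as one element. The loop runs sequentially as a stand-in for the forked children; it only mimics the index-based chunking used in the patch and is not the parallel code itself.

```r
## Hypothetical S3 class: a matrix viewed as a sequence of rows.
rows <- function(m) structure(list(m = m), class = "rows")

## The three methods the patched code relies on.
length.rows <- function(x) nrow(unclass(x)$m)
`[.rows`    <- function(x, i) rows(unclass(x)$m[i, , drop = FALSE])
`[[.rows`   <- function(x, i) unclass(x)$m[i, ]

## lapply() coerces classed objects with as.list(), so provide a method.
as.list.rows <- function(x, ...) {
    m <- unclass(x)$m
    lapply(seq_len(nrow(m)), function(i) m[i, ])
}

X <- rows(matrix(1:12, nrow = 6))
cores <- 2L

## Same index schedule as in mclapply: integers only, no copy of X.
sindex <- lapply(seq_len(cores),
                 function(i) seq(i, length(X), by = cores))

res <- vector("list", length(X))
for (core in seq_len(cores)) {         # sequential stand-in for the children
    S <- X[sindex[[core]]]             # still a 'rows' object
    res[sindex[[core]]] <- lapply(S, sum)  # only this chunk becomes a list
}
```

Only the chunk handed to lapply is ever turned into a list; X itself stays in its compact form.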

mcparallel is not consistently invoked with a 'name' argument; a name seems to be superfluous to the code.

Creating the variable 'schedule' makes a full copy of a potentially large object in the master process; delaying the subsetting until it is required in the inner.do function may (?) result in less copying.

Avoiding coercion to a list is similar to a suggestion made for pvec:

  https://stat.ethz.ch/pipermail/r-devel/2012-October/065097.html

Martin
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
Index: src/library/parallel/R/unix/mclapply.R
===================================================================
--- src/library/parallel/R/unix/mclapply.R	(revision 61084)
+++ src/library/parallel/R/unix/mclapply.R	(working copy)
@@ -47,15 +47,12 @@
         }
     }
     on.exit(cleanup())
-    ## Follow lapply
-    if(!is.vector(X) || is.object(X)) X <- as.list(X)
 
     if (!mc.preschedule) {              # sequential (non-scheduled)
         FUN <- match.fun(FUN)
         if (length(X) <= cores) { # we can use one-shot parallel
             jobs <- lapply(seq_along(X),
                            function(i) mcparallel(FUN(X[[i]], ...),
-                                                  name = names(X)[i],
                                                   mc.set.seed = mc.set.seed,
                                                   silent = mc.silent))
             res <- mccollect(jobs)
@@ -117,8 +114,6 @@
     if (cores < 2L) return(lapply(X = X, FUN = FUN, ...))
     sindex <- lapply(seq_len(cores),
                      function(i) seq(i, length(X), by = cores))
-    schedule <- lapply(seq_len(cores),
-                       function(i) X[seq(i, length(X), by = cores)])
     ch <- list()
     res <- vector("list", length(X))
     names(res) <- names(X)
@@ -126,13 +121,13 @@
     fin <- rep(FALSE, cores)
     dr <- rep(FALSE, cores)
     inner.do <- function(core) {
-        S <- schedule[[core]]
         f <- mcfork()
         if (isTRUE(mc.set.seed)) mc.advance.stream()
         if (inherits(f, "masterProcess")) { # this is the child process
             on.exit(mcexit(1L, structure("fatal error in wrapper code", class="try-error")))
             if (isTRUE(mc.set.seed)) mc.set.stream()
             if (isTRUE(mc.silent)) closeStdout()
+            S <- X[sindex[[core]]]
             sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
             mcexit(0L)
         }
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
