The attached diff addresses the following issues in mclapply:
mclapply coerces non-vectors and objects (S3 or S4) to lists, but a list may not be an efficient representation, and the coercion is not required if the object implements length, [, and [[ methods. (lapply must also work on the object, either through coercion to a list at the 'inner.do' level or through other means, e.g., promoting lapply to a generic and writing a method specialized for the object.) As written, someone wishing to use mclapply on an object that is not efficiently represented as a list would need to promote mclapply to a generic and then re-implement an mclapply method for their object, rather than re-using the existing code.
mcparallel is not consistently invoked with a 'name' argument; a name seems to be superfluous to the code.
Creating the variable 'schedule' makes a full copy of a potentially large object in the master process; delaying the subsetting until it is required in the inner.do function may (?) result in less copying.
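To make the first and last points concrete, here is a minimal sketch of the idea rather than the patched code itself. The class 'compactvec' and the helper 'run_chunk' are hypothetical names invented for illustration. 'run_chunk' stands in for inner.do but does no forking, and it takes the "coerce to a list at the inner.do level" route by calling as.list explicitly.

```r
## Hypothetical S3 class: a thin wrapper around an atomic vector,
## standing in for an object that is not efficiently held as a list.
compactvec <- function(values)
    structure(list(values = values), class = "compactvec")

## The three methods the scheduling code relies on, plus as.list.
length.compactvec <- function(x) length(unclass(x)$values)
"[.compactvec" <- function(x, i) compactvec(unclass(x)$values[i])
"[[.compactvec" <- function(x, i) unclass(x)$values[[i]]
as.list.compactvec <- function(x, ...) as.list(unclass(x)$values)

X <- compactvec(1:10)
cores <- 3L
FUN <- function(v) v^2

## Round-robin index sets, built as in the prescheduled branch.
sindex <- lapply(seq_len(cores),
                 function(i) seq(i, length(X), by = cores))

## Stand-in for inner.do (assumption: no fork here). The chunk is
## subset only when it is needed and coerced to a list only then.
run_chunk <- function(core) {
    S <- X[sindex[[core]]]      # still a compactvec, not a list
    lapply(as.list(S), FUN)
}

## Reassemble the per-chunk results in the original order.
res <- vector("list", length(X))
for (core in seq_len(cores))
    res[sindex[[core]]] <- run_chunk(core)
```

Only the index vectors in 'sindex' are created up front; no per-core copy of X exists until a chunk is actually processed.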
Avoiding coercion to a list is similar to a suggestion for pvec:

  https://stat.ethz.ch/pipermail/r-devel/2012-October/065097.html

Martin
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
Index: src/library/parallel/R/unix/mclapply.R
===================================================================
--- src/library/parallel/R/unix/mclapply.R	(revision 61084)
+++ src/library/parallel/R/unix/mclapply.R	(working copy)
@@ -47,15 +47,12 @@
         }
     }
     on.exit(cleanup())
-    ## Follow lapply
-    if(!is.vector(X) || is.object(X)) X <- as.list(X)
 
     if (!mc.preschedule) {              # sequential (non-scheduled)
         FUN <- match.fun(FUN)
         if (length(X) <= cores) { # we can use one-shot parallel
             jobs <- lapply(seq_along(X),
                            function(i) mcparallel(FUN(X[[i]], ...),
-                                                  name = names(X)[i],
                                                   mc.set.seed = mc.set.seed,
                                                   silent = mc.silent))
             res <- mccollect(jobs)
@@ -117,8 +114,6 @@
     if (cores < 2L) return(lapply(X = X, FUN = FUN, ...))
     sindex <- lapply(seq_len(cores),
                      function(i) seq(i, length(X), by = cores))
-    schedule <- lapply(seq_len(cores),
-                       function(i) X[seq(i, length(X), by = cores)])
     ch <- list()
     res <- vector("list", length(X))
     names(res) <- names(X)
@@ -126,13 +121,13 @@
     fin <- rep(FALSE, cores)
     dr <- rep(FALSE, cores)
     inner.do <- function(core) {
-        S <- schedule[[core]]
         f <- mcfork()
         if (isTRUE(mc.set.seed)) mc.advance.stream()
         if (inherits(f, "masterProcess")) { # this is the child process
             on.exit(mcexit(1L, structure("fatal error in wrapper code", class="try-error")))
             if (isTRUE(mc.set.seed)) mc.set.stream()
             if (isTRUE(mc.silent)) closeStdout()
+            S <- X[sindex[[core]]]
             sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
             mcexit(0L)
         }
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel