On Fri, Dec 21, 2012 at 10:42 AM, Chris Hergarten <cheg...@yahoo.com> wrote:
> Dear R-users
>
> I was running into problems with my R code trying to run clh sampling (clhs
> package) in parallel mode (=on various data sets simultaneously).
>
> Here is the code (which I developed with some help:)):
> ******************************************
> library("clhs")
> library("snow")
> a <- as.data.frame(replicate(1000, rnorm(20)))
> b <- as.data.frame(replicate(1000, rnorm(20)))
> c <- as.data.frame(replicate(1000, rnorm(20)))
> d <- as.data.frame(replicate(1000, rnorm(20)))
> abcd <- list(a, b, c, d)
> cl <- makeCluster(4)
> results <- parLapply(cl,
>                      X = abcd,
>                      FUN = function(i) {
>   clhs(x = i, size = round(nrow(i) / 5), iter = 2000, simple = FALSE)
> },
> )
> stopCluster(cl)
> ******************************************
>
> Before running the last line, R is throwing an error: "Error in length(x) :
> 'x' is missing". Any ideas what I am doing wrong and how to improve?
>

Loading clhs on the primary does not automatically load it on the workers.
Try:

    clusterEvalQ(cl, library(clhs))

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
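
For reference, here is a minimal sketch of how the quoted script might look with the clusterEvalQ() call in place. It makes two assumptions beyond the reply above. First, as far as I can tell, snow's parLapply() names its arguments `x` and `fun` in lower case, so `X =` and `FUN =` would fall into `...` and leave `x` unset, which would be consistent with the "'x' is missing" message; the sketch passes both positionally to sidestep that. Second, it drops the trailing comma before the closing parenthesis. The `type = "SOCK"` argument and the `set.seed()` call are illustrative choices, not part of the original script.

```r
library(snow)
library(clhs)

# Four data frames of 20 rows x 1000 columns, as in the quoted message.
set.seed(1)  # illustrative only; the original script sets no seed
abcd <- replicate(4, as.data.frame(replicate(1000, rnorm(20))),
                  simplify = FALSE)

# Socket cluster with four workers (cluster type is an assumption here).
cl <- makeCluster(4, type = "SOCK")

# Attach clhs on every worker; loading it on the primary is not enough.
clusterEvalQ(cl, library(clhs))

# Data and function are passed positionally, so the argument names
# used by snow's parLapply() do not matter.
results <- parLapply(cl, abcd, function(df) {
  clhs(x = df, size = round(nrow(df) / 5), iter = 2000, simple = FALSE)
})

stopCluster(cl)
```

If this runs cleanly, `results` should be a list with one clhs result per input data frame, in the same order as `abcd`.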