Hello,

I would like to use SNOW to parallelise some computations on the columns of a data frame, using different parameter values for each SNOW "worker".
I gather(?) clusterMap() is the appropriate SNOW function for this. I suspect the problem lies in the fact that I am supplying only one data frame for the flow.dat argument, yet the a, b and x arguments have ten values each. I tried RECYCLE=TRUE, but it still didn't work. I have generated some example data below that illustrates my problem.

# example input data frames
mydat <- data.frame(a.in=1:10, b.in=1:10, x.in=1:10)
flow.dat <- data.frame(ww=100:105, zz=600:605)

# define the function
myfun <- function(a, b, x, flow.dat) {
  ee <- a + b + x
  ff <- mean(flow.dat[,1])
  return(ff)
}

# apply the function as per normal
print(myfun(a=mydat$a.in,
            b=mydat$b.in,
            x=mydat$x.in,
            flow.dat=flow.dat))
[1] 102.5
# works OK, average of column one of data frame looks good
# a, b and x parameters read in OK, ee gets calculated but not returned

# now try to apply the function in parallel via SNOW
library(snow)
cl <- makeCluster(3, type="SOCK")  # make a cluster
ll <- clusterMap(cl, fun=myfun,
                 a=mydat$a.in,
                 b=mydat$b.in,
                 x=mydat$x.in,
                 flow.dat=flow.dat)
Error in checkForRemoteErrors(val) :
  10 nodes produced errors; first error: incorrect number of dimensions
stopCluster(cl)

_______________________________________________________
Here is system info

> Sys.info()
                     sysname                      release
                   "Windows"            "Server 2008 x64"
                     version                     nodename
"build 7601, Service Pack 1"             "POWERAPP4-WRON"
                     machine                        login
                    "x86-64"                     "xxxxxx"
                        user
                    "xxxxxx"

$version.string
[1] "R version 2.12.1 (2010-12-16)"

Paul Rustomji
Research Scientist
CSIRO Land and Water
Australia

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
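
A possible explanation for the error, offered as a sketch rather than a confirmed diagnosis: clusterMap(), like mapply(), iterates element-wise over every argument passed through `...`. A data frame is a list of columns, so each worker would receive a single column vector as flow.dat, and `flow.dat[,1]` then fails with "incorrect number of dimensions". Arguments that should reach every call unchanged can go in `MoreArgs` instead. The example below assumes the `parallel` package (bundled with R 2.14 and later, so not available on the R 2.12.1 reported above); its clusterMap() is derived from SNOW's, and with SNOW itself the same call should apply after `library(snow)` and `makeCluster(3, type="SOCK")`. The choice of two workers is arbitrary.

```r
# Assumption: 'parallel' stands in for snow here; swap in library(snow) on older R.
library(parallel)

# example input data frames (as in the post)
mydat <- data.frame(a.in = 1:10, b.in = 1:10, x.in = 1:10)
flow.dat <- data.frame(ww = 100:105, zz = 600:605)

# the function from the post, unchanged in behaviour
myfun <- function(a, b, x, flow.dat) {
  ee <- a + b + x               # computed but not returned, as in the post
  ff <- mean(flow.dat[, 1])     # needs flow.dat to still be a data frame
  ff
}

cl <- makeCluster(2)            # two local workers (arbitrary choice)

# a, b and x vary per call; flow.dat is passed whole to every call via MoreArgs
res <- clusterMap(cl, fun = myfun,
                  a = mydat$a.in,
                  b = mydat$b.in,
                  x = mydat$x.in,
                  MoreArgs = list(flow.dat = flow.dat))

stopCluster(cl)

print(unlist(res))
```

Each of the ten calls sees the full data frame, so every element of `res` is the mean of column one. RECYCLE=TRUE does not help here because it only repeats the elements (columns) of shorter arguments; it does not stop the data frame from being split.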