Greetings, I am having an interesting problem and I wonder if anyone else has seen this behavior.
I am running R 2.11.1 with SNOW 0.3-3 on a Dell cluster running CentOS 5.5. I create my cluster using:

    cluster <- makeCluster(nodes, type="SOCK", port=10191)  # nodes is a vector of compute nodes

I then wrap a loop around clusterApplyLB to evaluate my function multiple times, with different parameters, without recreating the cluster every time. The following code segment shows what I am trying to do:

    for (j in loopstart:loopend) {
      call.m = list(
        step1 = T,
        dat = x.m[, c(1:7, j)]  # x.m is data from a csv file read into a table
      )
      clusterApplyLB(cluster, c(10:100), test.each.term, call=call.m)
    }
    stopCluster(cluster)

The problem is that the loop crashes after an unpredictable number of iterations: sometimes 50, sometimes 15, sometimes 2. When the crash happens, I receive the following error message every time:

    Error in checkForRemoteErrors(val) :
      one node produced an error: cannot open the connection
    Calls: clusterApplyLB -> dynamicClusterApply -> checkForRemoteErrors
    Execution halted

Any ideas as to what might be going on? I have run this code successfully many times when I do not use the loop. I have a lot of data to process, and recreating the cluster every time I want to run my function is a waste of time.

Thanks,
Ken
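
One possible way to narrow this down, offered only as a sketch: wrap the worker function so that a failing call returns the error message and the host name instead of aborting the whole clusterApplyLB call. The names safe.call and demo.term below are hypothetical and not part of snow. demo.term is a stand-in for test.each.term, which is not shown above, and it is made to fail for one input purely so the wrapper has something to catch.

```r
# Hypothetical wrapper (not part of snow): call worker(i, ...) and, if it
# fails, return the error message and host name instead of raising.
safe.call <- function(i, worker, ...) {
  tryCatch(
    worker(i, ...),
    error = function(e) {
      list(failed  = TRUE,
           node    = Sys.info()[["nodename"]],
           message = conditionMessage(e))
    }
  )
}

# Stand-in for test.each.term (assumption: it takes the term index first and
# a 'call' list second, as in the clusterApplyLB call above).
demo.term <- function(i, call) {
  if (i == 12) stop("cannot open the connection")
  i * 2
}

# Local check without a cluster: run a few inputs through the wrapper.
results <- lapply(10:14, function(i) {
  safe.call(i, worker = demo.term, call = list(step1 = TRUE))
})

# Keep only the entries where the wrapper caught an error.
failed <- Filter(function(r) is.list(r) && isTRUE(r$failed), results)
print(length(failed))
print(failed[[1]]$message)
```

On the cluster, the same wrapper could presumably be used as clusterApplyLB(cluster, c(10:100), safe.call, worker=test.each.term, call=call.m). The returned list would then show which node hit "cannot open the connection" and on which loop iteration, which may help tell a problem on one node apart from a problem in the function itself. This has not been tried against the setup described above.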