Another thing to realize: if you are doing this in parallel, f(x[i,]) is executed in each of the worker R sessions. A big.matrix object is essentially a pointer to an object created in C++, and a pointer's address space is specific to a process (in this case, the master R session). As a result, when you call f(x[i,]) in a worker, you are trying to use a pointer that no longer points to the original C++ object. To get around this, you need to use either a shared.big.matrix or a filebacked.big.matrix object, and you need to pass its descriptor into the foreach loop:

    library(bigmemory)
    library(foreach)

    # The descriptor is plain data, so it can be sent to the workers.
    desc <- describe(x)
    foreach(i = 1:nrow(x), .combine = c, .packages = 'bigmemory') %dopar% {
      # Re-attach to the shared matrix inside the worker process.
      x <- attach.big.matrix(desc)
      f(x[i,])
    }

The descriptor is just a list of information that is not process specific. So, in the example, the descriptor is passed to each of the workers. Each worker then attaches to the big.matrix object, which is shared across R sessions, and the function is called on the attached big.matrix object.

I may have provided more detail than you were interested in. The thing to take away is that if you are working with foreach and bigmemory in parallel, you need to pass descriptors to the workers and attach to the big.matrix object there.

Thanks,
Mike

On Sun, Jul 19, 2009 at 8:29 AM, Jay Emerson <jayemer...@gmail.com> wrote:

> Michael,
>
> If you have a big.matrix, you just want to iterate over the rows. I'm not
> in R and am just making this up on the fly (from a bar in Beijing, if you
> believe that):
>
> foreach(i=1:nrow(x),.combine=c) %dopar% f(x[i,])
>
> should work, essentially applying the function f() to the rows of x? But
> perhaps I misunderstand you. Please feel free to email me or Mike
> (michael.k...@yale.edu) directly with questions about bigmemory; we are
> very interested in applications of it to real problems.
>
> Note that the package foreach uses package iterators, and is very flexible,
> in case you need more general iteration in parallel.
>
> Regards,
>
> Jay
>
> Original message:
> Hi there!
> I have become a big fan of the 'foreach' package allowing me to do a
> lot of stuff in parallel. For example, evaluating the function f on
> all elements in a vector x is easily accomplished:
> foreach(i=1:length(x),.combine=c) %dopar% f(x[i])
> Here the .combine=c option tells foreach to combine output using the
> c()-function. That is, to return it as a vector.
> Today I discovered the 'bigmemory' package, and I would like to
> construct a big.matrix in a parallel fashion row by row. To use foreach
> I see no other way than to come up with a substitute for c in the
> .combine option. I have checked out the big.matrix manual, but I can't
> find a function suitable for just that.
> Actually, I wouldn't even know how to do it for a usual matrix. Any clues?
> Thanks!
> --
> Michael Knudsen
> micknud...@gmail.com
> http://lifeofknudsen.blogspot.com/
>
> --
> John W. Emerson (Jay)
> Associate Professor of Statistics
> Department of Statistics
> Yale University
> http://www.stat.yale.edu/~jay
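
On the original question of building a big.matrix row by row in parallel, the same descriptor idea might be turned around so that each worker writes its row straight into the shared matrix, and no special .combine function is needed. The following is only a rough sketch under assumptions that are not part of the thread: doParallel is used as the foreach backend, the row function f() is a made-up placeholder, the matrix is a tiny 4 x 3, and big.matrix() is assumed to allocate shared memory by default, as it does in recent bigmemory releases.

```r
library(bigmemory)
library(foreach)
library(doParallel)  # assumed backend; any registered foreach backend should do

# Hypothetical row function standing in for the real computation.
f <- function(i, n) i * seq_len(n)

cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# Matrix to be filled in; the dimensions are purely illustrative.
out <- big.matrix(nrow = 4, ncol = 3, type = "double", init = 0)
desc <- describe(out)

# Each worker attaches through the descriptor and writes only its own row,
# so no two workers touch the same cells.
invisible(foreach(i = seq_len(nrow(out)), .packages = "bigmemory") %dopar% {
  m <- attach.big.matrix(desc)
  m[i, ] <- f(i, ncol(m))
  NULL  # nothing to combine; the result lives in the shared matrix
})

parallel::stopCluster(cl)

# Copy the filled matrix back into an ordinary R matrix for inspection.
result <- out[, ]
```

Returning NULL from each iteration keeps foreach from shipping the rows back to the master a second time. For a matrix too large for RAM, a file-backed big.matrix could presumably be substituted in the same pattern.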