Re: [R] getting started in parallel computing on a windows OS

Benjamin Caldwell Mon, 29 Apr 2013 15:35:08 -0700

Martin,

This worked, thanks again!
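
For anyone arriving from the archive, here is a rough sketch of the pattern Martin describes in the quoted message below. It is not my actual code: the data (`one`), the index grid (`two`), and the scoring rule inside `fit_one` are made-up stand-ins, and it assumes the foreach and doParallel packages are installed.

```r
## A minimal sketch, not the original code from this thread.
## Assumes the foreach and doParallel packages are installed.
library(foreach)
library(doParallel)

## Hypothetical stand-ins for the thread's `one` (data) and `two` (index grid).
set.seed(1)
one <- data.frame(ind = 1:20, res = 2 * (1:20) + rnorm(20))
two <- as.matrix(expand.grid(a = 0:3, b = 1:3))

## Work on row idx ONLY: score one candidate (intercept a, slope b) by its
## sum of squared errors. The scoring rule is invented for illustration.
fit_one <- function(idx, ndx.grd, dt.grd, ind.vr, rsp.vr) {
    a <- ndx.grd[idx, "a"]
    b <- ndx.grd[idx, "b"]
    pred <- a + b * dt.grd[[ind.vr]]
    data.frame(idx = idx, sse = sum((dt.grd[[rsp.vr]] - pred)^2))
}

## Check that the serial version works before going parallel.
serial <- do.call(rbind,
                  lapply(seq_len(nrow(two)), fit_one, two, one, "ind", "res"))

## A socket cluster works on Windows, where forking is not available.
cl <- makeCluster(2)
registerDoParallel(cl)
y <- foreach(idx = seq_len(nrow(two)), .combine = rbind) %dopar%
    fit_one(idx, two, one, "ind", "res")
stopCluster(cl)

## Collate: keep the 'best' row (smallest error in this toy example).
best <- y[which.min(y$sse), ]
```

Running the serial `lapply()` version first makes it easy to confirm that the parallel result matches before scaling up to the full grid.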


*Ben Caldwell*

Graduate Fellow
University of California, Berkeley
130 Mulford Hall #3114
Berkeley, CA 94720
Office 223 Mulford Hall
(510)859-3358


On Thu, Apr 25, 2013 at 10:04 PM, Benjamin Caldwell <btcaldw...@berkeley.edu> wrote:

> Thanks for this, Martin. I'll start retooling and let you know how it goes.
>
> Ben Caldwell
> Graduate fellow
> On Apr 24, 2013 4:34 PM, "Martin Morgan" <mtmor...@fhcrc.org> wrote:
>
>> On 04/24/2013 02:50 PM, Benjamin Caldwell wrote:
>>
>>> Dear R help,
>>>
>>> I've what I think is a fairly simple parallel problem, and am getting
>>> bogged down in documentation and packages for much more complex
>>> situations.
>>>
>>> I have a big matrix [30^5, 5]. I have a function that will act on each
>>> row
>>> of that matrix sequentially and output the 'best' result from the whole
>>> matrix (it compares the result from each row to the last and keeps the
>>> 'better' result). I would like to divide that first large matrix into
>>> chunks equal to the number of cores I have available to me, and work
>>> through each chunk, then output the results from each chunk.
>>>
>>> I'm really having trouble making head or tail of how to do this on a
>>> windows machine - lots of different false starts on several different
>>> packages now. Basically, I have the function, and I can of course easily
>>> divide the matrix into chunks. I just need a way to process each chunk
>>> in parallel (other than opening new R sessions for each core manually).
>>>
>>> Any help much appreciated - after two days of trying to get this to work
>>> I'm pretty burnt out.
>>>
>>
>> Hi Ben -- in your code from this morning you had a function
>>
>> fitting <- function(ndx.grd=two,dt.grd=one,ind.vr='ind',rsp.vr='res') {
>>     ## ... setup
>>     for(i in 1:length(ndx.grd[,1])){
>>         ## ... do work
>>     }
>>     ## ... collate results
>> }
>>
>> that you're trying to run in parallel. Obviously the ## ... represent
>> lines I've removed. When you say something like
>>
>> y <- foreach(icount(length(two))) %dopar% fitting()
>>
>> it's saying that you want to run fitting() length(two) times. So you're
>> actually doing the same thing length(two) times, whereas you really want to
>> divide the work that's inside fitting() into chunks, and do those on
>> separate cores!
>>
>> Conceptually what you'd like to do is
>>
>> fit_one <- function(idx, ndx.grd, dt.grd, ind.vr, rsp.vr) {
>>     ## ... do work on row idx _ONLY_
>> }
>>
>> and then evaluate with
>>
>> ## ... setup
>> y <- foreach(idx = icount(nrow(two))) %dopar%
>>     fit_one(idx, two, one, "ind", "res")
>> ## ... collate
>>
>> so that fit_one fits just one of your combinations. foreach will worry
>> about distributing the work. Make sure that fit_one works first, before
>> trying to run this in parallel; your use of try(), trying to fit different
>> data types (character, integer, numeric) into a matrix rather than
>> data.frame, and the type coercions all indicate that you're fighting with R
>> rather than working with it.
>>
>> Hope that helps,
>>
>> Martin
>>
>>
>>> Thanks
>>>
>>> *Ben Caldwell*
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>
>
