On Tue, Dec 4, 2012 at 5:25 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:
> A somewhat simplistic answer is that we already have that with the
> "mc.cores" option. In multicore the default was to use all cores
> (without the need to use detectCores) and yet you could reduce the
> number as you want with mc.cores. This is similar to what you are
> talking about, but it's not a sufficient solution.
>
> There are some plans for a somewhat more general approach. You may
> have noticed that mcaffinity() was added to query/control/limit the
> mapping of cores to tasks. It allows much more fine-grained control
> and better decisions about whether or not to recursively split jobs,
> as the state is global for the entire R process. The (vague) plan is
> to generalize this to all platforms - if not binding to a particular
> core, then at least monitoring the assigned number of cores.
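For concreteness, a minimal sketch of the two existing controls Simon
mentions; mcaffinity() is only functional on platforms that support
affinity masks (e.g. Linux), and the particular core count and CPU ids
here are illustrative only:

  library(parallel)

  ## mc.cores is the session-wide default consulted by mclapply() and
  ## friends (fork-based, so not usable on Windows); lowering it caps
  ## concurrency without consulting detectCores().
  options(mc.cores = 2L)
  res <- mclapply(1:4, function(i) i^2)   # at most 2 concurrent jobs

  ## mcaffinity() queries/sets the CPU affinity mask: the integer
  ## vector is the set of CPU ids (1-based) this process may run on.
  old <- mcaffinity()   # current mask, or NULL where unsupported
  mcaffinity(1:2)       # restrict this R process to CPUs 1 and 2
  mcaffinity(old)       # restore the original mask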
I did not know about the concept of 'CPU affinity masks', but I can
quickly guess what the idea is, and it certainly provides richer
control over CPU/core resources. Yes, it would be very helpful if it
worked cross-platform. Thanks for the heads up.

/Henrik

> Cheers,
> Simon
>
>
> On Dec 4, 2012, at 3:24 PM, Henrik Bengtsson wrote:
>
>> In the 'parallel' package there is detectCores(), which tries its
>> best to infer the number of cores on the current machine. This is
>> useful if you wish to utilize the *maximum* number of cores on the
>> machine. Several people are using this to set the number of cores
>> when parallelizing, sometimes also hardcoded within third-party
>> scripts/package code, but there are several settings where you wish
>> to use fewer, e.g. in a compute cluster where your R session is
>> given only a portion of the available cores. Because of this, I'd
>> like to propose adding getCores(), which by default returns what
>> detectCores() gives, but can also be set to return what is assigned
>> via setCores(). The idea is that getCores() could replace most
>> common usage of detectCores() and provide more control. An
>> additional feature would be that 'parallel', when loaded, would
>> check for a command-line argument --max-cores=<int>, which would
>> update the number of cores via setCores(). This would make it
>> possible for, say, a Torque/PBS compute cluster to launch an R
>> batch script as
>>
>> Rscript --max-cores=$PBS_NP script.R
>>
>> and the only thing script.R needs to know about is
>> parallel::getCores().
>>
>> I understand that I can do all this already in my own scripts, but
>> I'd like to propose a standard for R.
>>
>> Comments?
>>
>> /Henrik
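As a sketch only, the proposed getCores()/setCores() pair and the
--max-cores handling could be emulated in user code along these lines.
None of this is an existing 'parallel' API - getCores(), setCores(),
the option name "mc.cores.max", and the --max-cores flag are all part
of the proposal, or invented here for illustration:

  ## Hypothetical helpers mirroring the proposed API; the option name
  ## "mc.cores.max" is made up for this sketch.
  setCores <- function(n) {
    invisible(options(mc.cores.max = as.integer(n)))
  }

  getCores <- function() {
    n <- getOption("mc.cores.max")
    if (is.null(n)) n <- parallel::detectCores()
    n
  }

  ## Pick up a --max-cores=<int> argument, e.g. when launched as
  ##   Rscript script.R --max-cores=$PBS_NP
  ## (placed after script.R so Rscript passes it through as a script
  ## argument; accepting it as a front-end option, as in the proposal
  ## above, would need support in R itself).
  arg <- grep("^--max-cores=", commandArgs(trailingOnly = TRUE),
              value = TRUE)
  if (length(arg) > 0)
    setCores(sub("^--max-cores=", "", arg[1]))

  ## Downstream code then only needs getCores(), e.g.
  ##   parallel::mclapply(X, FUN, mc.cores = getCores())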