Re: [Rd] Speeding up R (was Using multicores in R)

2012-12-04 Thread Prof J C Nash (U30A)
For info, I put a little study I did about the byte code compiler and 
other speedup approaches (but not multicore) on the Rwiki at


http://rwiki.sciviews.org/doku.php?id=tips:rqcasestudy

which looks at a specific problem, so may not be relevant to everyone.
However, one of my reasons for doing it was to document the "how to" a 
little.


JN



   2.  Have you tried the "compiler" package?  If I understand
correctly, R is a two-stage interpreter, first translating what we know
as R into byte code, which is then interpreted by a byte code
interpreter.  If my memory is correct, this approach can cut the compute
time by a factor of 100.
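
For reference, a minimal sketch of byte-compiling a function explicitly with
the 'compiler' package. The function sumsqrt and the problem size are made up
for illustration. The speedup depends heavily on the code (loop-heavy scalar
code tends to benefit most), and since R 3.4.0 the JIT byte-compiles closures
by default, so the two timings may be nearly identical on current versions.

```r
library(compiler)

# Illustrative loop-heavy function (hypothetical, not taken from the study).
sumsqrt <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + sqrt(i)
  s
}

# Byte-compile explicitly; the result is a drop-in replacement.
sumsqrt_c <- cmpfun(sumsqrt)

# Compare timings; absolute numbers vary by machine and R version.
print(system.time(sumsqrt(1e6)))
print(system.time(sumsqrt_c(1e6)))
```

Both versions compute the same value; only the evaluation path differs.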





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)

2012-12-04 Thread Henrik Bengtsson
In the 'parallel' package there is detectCores(), which tries its best
to infer the number of cores on the current machine.  This is useful
if you wish to utilize the *maximum* number of cores on the machine.
Several people use this to set the number of cores when parallelizing,
sometimes hardcoded within third-party scripts or package code, but
there are several settings where you wish to use fewer, e.g. in a
compute cluster where your R session is given only a portion of the
cores available.  Because of this, I'd like to propose adding
getCores(), which by default returns what detectCores() gives, but can
also be set to return what is assigned via setCores().  The idea is
that getCores() could replace most common usage of detectCores() and
provide more control.  An additional feature would be that 'parallel'
when loaded would check for command line argument --max-cores=,
which will update the number of cores via setCores().  This would make
it possible for, say, a Torque/PBS compute cluster to launch an R
batch script as

  Rscript --max-cores=$PBS_NP script.R

and the only thing the script.R needs to know about is parallel::getCores().

I understand that I can do all this already in my own scripts, but I'd
like to propose a standard for R.
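
A minimal sketch of the idea, under stated assumptions: getCores() and
setCores() below are hypothetical helpers and not part of 'parallel', the
option name "sketch.max.cores" is invented for this example, and the
--max-cores= switch is emulated as a trailing script argument because R
itself does not define such a flag.

```r
library(parallel)

# Hypothetical helper: record an upper limit on the number of cores.
# The option name "sketch.max.cores" is an assumption of this sketch.
setCores <- function(n) {
  n <- as.integer(n)
  stopifnot(length(n) == 1L, !is.na(n), n >= 1L)
  options(sketch.max.cores = n)
  invisible(n)
}

# Hypothetical helper: return the limit if one was set, otherwise fall
# back to detectCores() (which may return NA on some platforms).
getCores <- function() {
  n <- getOption("sketch.max.cores")
  if (is.null(n)) {
    n <- detectCores()
    if (is.na(n)) n <- 1L
  }
  n
}

# Emulate the proposed switch via trailing arguments, e.g.
#   Rscript script.R --max-cores=$PBS_NP
args <- commandArgs(trailingOnly = TRUE)
hit <- grep("^--max-cores=", args, value = TRUE)
if (length(hit) > 0L) setCores(sub("^--max-cores=", "", hit[1L]))
```

A script would then call getCores() wherever it currently calls
detectCores().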

Comments?

/Henrik



[Rd] inconsistencies between ?class and ?UseMethod

2012-12-04 Thread Hervé Pagès

Hi,

The two man pages give inconsistent descriptions of class():

Found in ?class:

 If the object does not have a class attribute, it has an implicit
 class, ‘"matrix"’, ‘"array"’ or the result of ‘mode(x)’ (except
 that integer vectors have implicit class ‘"integer"’).

Found in ?UseMethod:

 Matrices and arrays have class ‘"matrix"’ or ‘"array"’ followed
 by the class of the underlying vector.
 Most vectors have class the result of ‘mode(x)’, except that
 integer vectors have class ‘c("integer", "numeric")’ and real
 vectors have class ‘c("double", "numeric")’.

So according to ?UseMethod, class(matrix(1:4)) should be
c("matrix", "integer", "numeric"), which is of course not the case:

  > class(matrix(1:4))
  [1] "matrix"

I wonder if this was ever true, and, if so, when and why it has changed.
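
The distinction can be probed directly. A small sketch with a made-up
generic, describe(): class() and S3 dispatch need not agree, because dispatch
on an object without a class attribute walks an implicit class vector which,
for an integer matrix, ends in "integer" and "numeric".

```r
m <- matrix(1:4)

print(attr(m, "class"))   # NULL: no class attribute is set
print(class(m))           # the implicit class as reported by class()

# Hypothetical generic with a method defined only for "numeric".
describe <- function(x) UseMethod("describe")
describe.default <- function(x) "default"
describe.numeric <- function(x) "numeric"

# Dispatch falls through the matrix/array and "integer" entries and
# reaches describe.numeric rather than describe.default.
print(describe(m))
```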

Anyway, an update to ?UseMethod would be welcome. Or, documenting
class() in only one place seems even better (in keeping with the DRY principle).

Thanks,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)

2012-12-04 Thread Simon Urbanek
A somewhat simplistic answer is that we already have that with the "mc.cores" 
option. In multicore the default was to use all cores (without the need to use 
detectCores) and yet you could reduce the number as you want with mc.cores. 
This is similar to what you are talking about but it's not a sufficient 
solution.
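
For concreteness, a sketch of the "mc.cores" route; the value 2 and the toy
task are arbitrary choices for illustration. mclapply() takes its default
from getOption("mc.cores", 2L), and since forking is unavailable on Windows,
only mc.cores = 1 is accepted there.

```r
library(parallel)

# Cap the number of worker processes for the whole session.
options(mc.cores = 2L)

# Forking is not available on Windows, so fall back to a single core there.
n <- if (.Platform$OS.type == "windows") 1L else getOption("mc.cores", 2L)

# Toy task: square each input.
res <- mclapply(1:4, function(i) i^2, mc.cores = n)
print(unlist(res))
```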

There are some plans for a somewhat more general approach. You may have noticed 
that mcaffinity() was added to query/control/limit the mapping of cores to 
tasks. It allows much more fine-grained control and better decisions on whether 
to recursively split jobs or not, as the state is global for the entire R 
process. The (vague) plan is to generalize this to all platforms: if not 
binding to a particular core, then at least monitoring the assigned number of 
cores.

Cheers,
Simon


On Dec 4, 2012, at 3:24 PM, Henrik Bengtsson wrote:

> In the 'parallel' package there is detectCores(), which tries its best
> to infer the number of cores on the current machine.  This is useful
> if you wish to utilize the *maximum* number of cores on the machine.
> Several people use this to set the number of cores when parallelizing,
> sometimes hardcoded within third-party scripts or package code, but
> there are several settings where you wish to use fewer, e.g. in a
> compute cluster where your R session is given only a portion of the
> cores available.  Because of this, I'd like to propose adding
> getCores(), which by default returns what detectCores() gives, but can
> also be set to return what is assigned via setCores().  The idea is
> that getCores() could replace most common usage of detectCores() and
> provide more control.  An additional feature would be that 'parallel'
> when loaded would check for command line argument --max-cores=,
> which will update the number of cores via setCores().  This would make
> it possible for, say, a Torque/PBS compute cluster to launch an R
> batch script as
> 
>  Rscript --max-cores=$PBS_NP script.R
> 
> and the only thing the script.R needs to know about is parallel::getCores().
> 
> I understand that I can do all this already in my own scripts, but I'd
> like to propose a standard for R.
> 
> Comments?
> 
> /Henrik


[Rd] RInside, rcpp compilation problem

2012-12-04 Thread Jeff Goode




I have spent some hours browsing the RInside and Rcpp documentation, lots of 
it; but ... as a programmer of C++ since 1990, on both Windows and Unix ... 
(Solaris and Ubuntu, and Mandrake/Mandriva Linux); I see a minor problem ...  
Where is the Rcpp.h header file??  The code below fails to compile, as the 
RInside.h header file references the Rcpp.h header file, which is not included 
with the RInside download. This is the sample code provided in one of the 
RInside manuals: 

#include

#include <RInside.h>                    // for the embedded R via RInside

Rcpp::NumericMatrix createMatrix(const int n) {
    Rcpp::NumericMatrix M(n,n);
    for (int i=0; i<n; i++) {
        for (int j=0; j<n; j++) {
            M(i,j) = i*10 + j;
        }
    }
    return(M);
}


Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)

2012-12-04 Thread Henrik Bengtsson
On Tue, Dec 4, 2012 at 5:25 PM, Simon Urbanek wrote:
> A somewhat simplistic answer is that we already have that with the "mc.cores" 
> option. In multicore the default was to use all cores (without the need to 
> use detectCores) and yet you could reduce the number as you want with 
> mc.cores. This is similar to what you are talking about but it's not a 
> sufficient solution.
>
> There are some plans for a somewhat more general approach. You may have 
> noticed that mcaffinity() was added to query/control/limit the mapping of 
> cores to tasks. It allows much more fine-grained control and better decisions 
> on whether to recursively split jobs or not, as the state is global for the 
> entire R process. The (vague) plan is to generalize this to all platforms: if 
> not binding to a particular core, then at least monitoring the assigned 
> number of cores.

I did not know about the concept of 'CPU affinity masks', but I can
quickly guess what the idea is, and it certainly provides richer
control of CPU/core resources.  Yes, it would be very helpful if it
worked across platforms.

Thanks for the heads up.

/Henrik

>
> Cheers,
> Simon
>
>
> On Dec 4, 2012, at 3:24 PM, Henrik Bengtsson wrote:
>
>> In the 'parallel' package there is detectCores(), which tries its best
>> to infer the number of cores on the current machine.  This is useful
>> if you wish to utilize the *maximum* number of cores on the machine.
>> Several people use this to set the number of cores when parallelizing,
>> sometimes hardcoded within third-party scripts or package code, but
>> there are several settings where you wish to use fewer, e.g. in a
>> compute cluster where your R session is given only a portion of the
>> cores available.  Because of this, I'd like to propose adding
>> getCores(), which by default returns what detectCores() gives, but can
>> also be set to return what is assigned via setCores().  The idea is
>> that getCores() could replace most common usage of detectCores() and
>> provide more control.  An additional feature would be that 'parallel'
>> when loaded would check for command line argument --max-cores=,
>> which will update the number of cores via setCores().  This would make
>> it possible for, say, a Torque/PBS compute cluster to launch an R
>> batch script as
>>
>>  Rscript --max-cores=$PBS_NP script.R
>>
>> and the only thing the script.R needs to know about is parallel::getCores().
>>
>> I understand that I can do all this already in my own scripts, but I'd
>> like to propose a standard for R.
>>
>> Comments?
>>
>> /Henrik


Re: [Rd] RInside, rcpp compilation problem

2012-12-04 Thread Simon Urbanek

On Dec 4, 2012, at 10:00 PM, Jeff Goode wrote:

> I have spent some hours browsing the RInside and Rcpp documentation, lots of 
> it; but ... as a programmer of C++ since 1990, on both Windows and Unix ... 
> (Solaris and Ubuntu, and Mandrake/Mandriva Linux); I see a minor problem 
> ...  Where is the Rcpp.h header file??  

In the Rcpp package, which RInside links to. Please use the rcpp-devel mailing 
list for such questions (as per request of the authors).

Cheers,
Simon


> The code below fails to compile, as the RInside.h header file references the 
> Rcpp.h header file, which is not included with the RInside download. This is 
> the sample code provided in one of the RInside manuals: 
> 
> #include
> 
> #include <RInside.h>                    // for the embedded R via RInside
> 
> Rcpp::NumericMatrix createMatrix(const int n) {
>     Rcpp::NumericMatrix M(n,n);
>     for (int i=0; i<n; i++) {
>         for (int j=0; j<n; j++) {
>             M(i,j) = i*10 + j;
>         }
>     }
>     return(M);
> }


Re: [Rd] RInside, rcpp compilation problem

2012-12-04 Thread Dirk Eddelbuettel

On 4 December 2012 at 22:47, Simon Urbanek wrote:
| 
| On Dec 4, 2012, at 10:00 PM, Jeff Goode wrote:
| 
| > I have spent some hours browsing the RInside and Rcpp documentation, lots
| > of it; but ... as a programmer of C++ since 1990, on both Windows and
| > Unix ... (Solaris and Ubuntu, and Mandrake/Mandriva Linux); I see a minor
| > problem ...  Where is the Rcpp.h header file??
| 
| In the Rcpp package which RInside links to. 

Correct. And Depends: upon.

| Please use the rcpp-devel mailing list for such questions (as per request of
| the authors).

Mostly as a courtesy to readers of r-devel.  And several different folks may
respond via rcpp-devel, not all of whom read here as well.

So redirecting via CC:, please feel free to keep follow-up there (but you
need to be subscribed to post).

| > The code below fails to compile, as the RInside.h header file references
| > the Rcpp.h header file, which is not included with the RInside download.
| > This is the sample code provided in one of the RInside manuals:
| >
| > #include
| >
| > #include <RInside.h>                    // for the embedded R via RInside
| >
| > Rcpp::NumericMatrix createMatrix(const int n) {
| >     Rcpp::NumericMatrix M(n,n);
| >     for (int i=0; i<n; i++) {
| >         for (int j=0; j<n; j++) {
| >             M(i,j) = i*10 + j;
| >         }
| >     }
| >     return(M);
| > }

You appear to have clipped this from examples/standard/rinside_sample1.cpp

That very directory examples/standard, and its neighbouring directories, each
have

i)   a Makefile for Linux, OS X, ... and 

ii)  a Makefile.win for Win*, and

iii) contributed cmake/ files

all of which do the build.  RInside needs itself, Rcpp and R, so a few -I and
-L switches need to be set, which those three alternatives do for you.  
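
To see roughly what those switches look like, the include directories can be
queried from R. This is only a sketch: it prints -I flags for R and for
whichever of Rcpp and RInside are installed, leaves out the -L/-l linker
side entirely, and is no substitute for the shipped Makefiles.

```r
# Header directory of R itself.
r_inc <- R.home("include")

# Header directories of the packages; system.file() returns "" when a
# package is not installed, so those entries are dropped.
pkgs <- c("Rcpp", "RInside")
pkg_inc <- vapply(pkgs, function(p) system.file("include", package = p),
                  character(1))
pkg_inc <- pkg_inc[nzchar(pkg_inc)]

# Assemble the compiler include switches.
flags <- paste0("-I", c(r_inc, pkg_inc))
cat(flags, sep = "\n")
```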

So if you, say, do 'cp rinside_sample1.cpp jeff1.cpp' you can just say 
'make jeff1' and the executable will be built.  That is a feature.  The
Makefile should work for your projects, and the other makefiles in the
neighbouring directories show how to do this with MPI, Qt, Wt, and (in SVN)
Boost. 

Dirk

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com



[Rd] NAMESPACE problem: import(zoo) but 'zoo' could not be loaded

2012-12-04 Thread Spencer Graves

Hello:


  I'm having problems creating a real NAMESPACE to replace the pro 
forma one in the fda package on R-Forge.  "R CMD check" complains, 
"Error: package 'zoo' could not be loaded ... there is no package called 
'zoo'";  see below.  I get this both with and without "import(zoo)" in 
NAMESPACE.



  Suggestions?
  Thanks,
  Spencer


p.s.  The current code including this problem can be obtained through 
anonymous access via "svn checkout 
svn://svn.r-forge.r-project.org/svnroot/fda/".



C:\Users\sgraves\2012\R_pkgs\fda>R CMD check fda_2.3.3.tar.gz
* using log directory 'C:/Users/sgraves/2012/R_pkgs/fda/fda.Rcheck'
* using R version 2.15.2 (2012-10-26)
* using platform: i386-w64-mingw32 (32-bit)



* checking loading without being on the library search path ... WARNING
Loading required package: splines
Loading required package: zoo
Error: package 'zoo' could not be loaded
In addition: Warning message:
In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
  there is no package called 'zoo'
Execution halted

It looks like this package has a loading problem when not on .libPaths:
see the messages for details.




> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] zoo_1.7-9

loaded via a namespace (and not attached):
[1] grid_2.15.2 lattice_0.20-10
