Re: [Rd] Speeding up R (was Using multicores in R)
For info, I put a little study I did about the byte code compiler and other speedup approaches (but not multicore) on the Rwiki at http://rwiki.sciviews.org/doku.php?id=tips:rqcasestudy which looks at a specific problem, so it may not be relevant to everyone. However, one of my reasons for doing it was to document the "how to" a little.

JN

2. Have you tried the "compiler" package? If I understand correctly, R is a two-stage interpreter, first translating what we know as R into byte code, which is then interpreted by a byte code interpreter. If my memory is correct, this approach can cut the compute time by a factor of 100.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
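For readers who have not used the "compiler" package mentioned above, here is a minimal sketch of byte-compiling a function with compiler::cmpfun(). The function slow_sum and the vector size are made up for illustration. Any speedup depends heavily on the code and on the R version (R 3.4.0 and later byte-compile closures just-in-time by default), so no particular factor should be expected.

```r
library(compiler)

# Hypothetical example function: an explicit loop, the kind of code
# that byte compilation tends to help most.
slow_sum <- function(x) {
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
  }
  total
}

# cmpfun() returns a byte-compiled version of the closure.
fast_sum <- cmpfun(slow_sum)

x <- as.numeric(1:100000)

# Both versions must give the same answer; only the timing may differ.
stopifnot(identical(slow_sum(x), fast_sum(x)))

# Compare timings on your own machine; results vary by R version.
print(system.time(for (k in 1:20) slow_sum(x)))
print(system.time(for (k in 1:20) fast_sum(x)))
```

Timing a few repetitions with system.time() is only a rough check; the linked case study describes a more careful comparison for one specific problem.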
[Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)
In the 'parallel' package there is detectCores(), which tries its best to infer the number of cores on the current machine. This is useful if you wish to utilize the *maximum* number of cores on the machine. Several people use this to set the number of cores when parallelizing, sometimes hardcoded within 3rd-party scripts/package code, but there are several settings where you wish to use fewer, e.g. in a compute cluster where your R session is given only a portion of the cores available.

Because of this, I'd like to propose adding getCores(), which by default returns what detectCores() gives, but can also be set to return what is assigned via setCores(). The idea is that getCores() could replace most common usage of detectCores() and provide more control.

An additional feature would be that 'parallel', when loaded, would check for the command line argument --max-cores=, which would update the number of cores via setCores(). This would make it possible for, say, a Torque/PBS compute cluster to launch an R batch script as

  Rscript --max-cores=$PBS_NP script.R

and the only thing script.R needs to know about is parallel::getCores().

I understand that I can do all this already in my own scripts, but I'd like to propose a standard for R.

Comments?

/Henrik
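The proposal above could be sketched roughly as follows. None of these functions exist in 'parallel'; getCores(), setCores() and the helper applyMaxCoresArg() are hypothetical names used only to illustrate the idea, and the command line is simulated rather than read from a real Rscript call.

```r
# Hypothetical sketch of the proposed API; none of these functions
# exist in 'parallel'.
.cores_env <- new.env()

setCores <- function(n) {
  n <- as.integer(n)
  stopifnot(length(n) == 1L, !is.na(n), n >= 1L)
  assign("cores", n, envir = .cores_env)
  invisible(n)
}

getCores <- function() {
  if (exists("cores", envir = .cores_env, inherits = FALSE)) {
    get("cores", envir = .cores_env)
  } else {
    # Fall back to what detectCores() reports (may be NA on some systems).
    parallel::detectCores()
  }
}

# Look for --max-cores=<value> among the command line arguments and,
# if present, record it via setCores().
applyMaxCoresArg <- function(args = commandArgs()) {
  hit <- grep("^--max-cores=", args, value = TRUE)
  if (length(hit) > 0L) {
    setCores(sub("^--max-cores=", "", hit[1L]))
  }
  invisible(NULL)
}

applyMaxCoresArg(c("R", "--max-cores=4"))  # simulated command line
print(getCores())
```

A script would then pass getCores() wherever it currently passes detectCores(), so the limit is decided by whoever launches the job rather than by the script.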
[Rd] inconsistencies between ?class and ?UseMethod
Hi,

The 2 man pages give inconsistent descriptions of class():

Found in ?class:

  If the object does not have a class attribute, it has an implicit class, ‘"matrix"’, ‘"array"’ or the result of ‘mode(x)’ (except that integer vectors have implicit class ‘"integer"’).

Found in ?UseMethod:

  Matrices and arrays have class ‘"matrix"’ or ‘"array"’ followed by the class of the underlying vector. Most vectors have class the result of ‘mode(x)’, except that integer vectors have class ‘c("integer", "numeric")’ and real vectors have class ‘c("double", "numeric")’.

So according to ?UseMethod, class(matrix(1:4)) should be c("matrix", "integer", "numeric"), which is of course not the case:

  > class(matrix(1:4))
  [1] "matrix"

I wonder if this was ever true, and, if so, when and why it changed. Anyway, an update to ?UseMethod would be welcome. Or, documenting class() in only 1 place seems even better (more in line with the DRY principle).

Thanks,
H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
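One way to see what each help page may be describing is to compare what class() returns with the method that S3 dispatch actually selects. The generic describe() and its methods below are made up for illustration; this is a sketch of observable behaviour, not a statement about how the documentation should read.

```r
# Hypothetical generic with methods for classes named in ?UseMethod.
describe <- function(x) UseMethod("describe")
describe.integer <- function(x) "integer method"
describe.numeric <- function(x) "numeric method"
describe.default <- function(x) "default method"

m <- matrix(1:4)

# class() does not report "integer" for an integer matrix
# (the exact value returned differs between R versions).
print(class(m))

# Dispatch still reaches the integer method, so the implicit class
# used by UseMethod() includes the class of the underlying vector.
print(describe(m))

# There is no "double" method here, so a double matrix falls through
# to the numeric method.
print(describe(matrix(c(1.5, 2.5))))
```

Read this way, ?UseMethod appears to describe the class vector used for dispatch, while ?class describes the value returned by class(); the two are not the same object.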
Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)
A somewhat simplistic answer is that we already have that with the "mc.cores" option. In multicore the default was to use all cores (without the need to use detectCores) and yet you could reduce the number as you want with mc.cores. This is similar to what you are talking about but it's not a sufficient solution.

There are some plans for a somewhat more general approach. You may have noticed that mcaffinity() was added to query/control/limit the mapping of cores to tasks. It allows much more fine-grained control and better decisions about whether to recursively split jobs or not, as the state is global for the entire R session. The (vague) plan is to generalize this for all platforms: if not binding to a particular core, then at least monitoring the assigned number of cores.

Cheers,
Simon

On Dec 4, 2012, at 3:24 PM, Henrik Bengtsson wrote:
> In the 'parallel' package there is detectCores(), which tries its best to infer the number of cores on the current machine. [...]
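The "mc.cores" option mentioned in the reply can be sketched as follows. The value 2 and the toy task are arbitrary choices for illustration. mclapply() relies on forking, so the sketch falls back to a single core on Windows, and the mcaffinity() call is guarded because affinity control is platform dependent.

```r
library(parallel)

# Session-wide limit; a cluster job script could set this once.
options(mc.cores = 2L)

# Read the limit back, with a fallback if the option is unset.
cores <- getOption("mc.cores", 1L)

# mclapply() uses forking, which is unavailable on Windows,
# where only mc.cores = 1 is accepted.
if (.Platform$OS.type == "windows") cores <- 1L

squares <- mclapply(1:4, function(i) i^2, mc.cores = cores)
print(unlist(squares))

# mcaffinity() queries the CPU affinity mask of the current process;
# it returns NULL where affinity control is not supported.
if (.Platform$OS.type == "unix") {
  print(parallel::mcaffinity())
}
```

Because the limit lives in an option, package code can read it with getOption() instead of calling detectCores() directly.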
[Rd] RInside, rcpp compilation problem
I have spent some hours browsing the RInside and Rcpp documentation, lots of it; but ... as a programmer of C++ since 1990, on both Windows and Unix ... (Solaris and Ubuntu, and Mandrake/Mandriva Linux); I see a minor problem .. Where is the Rcpp.h header file??

The below code fails to compile as the RInside.h header file references the Rcpp.h header file, which is not included with the RInside download. This is the sample code provided in one of the RInside manuals:

  #include

  #include <RInside.h>                    // for the embedded R via RInside
  Rcpp::NumericMatrix createMatrix(const int n) {
      Rcpp::NumericMatrix M(n,n);
      for (int i=0; i<n; i++) {
          for (int j=0; j<n; j++) {
              M(i,j) = i*10 + j;
          }
      }
      return(M);
  }
Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)
On Tue, Dec 4, 2012 at 5:25 PM, Simon Urbanek wrote:
> [...]
> There are some plans for a somewhat more general approach. You may have noticed that mcaffinity() was added to query/control/limit the mapping of cores to tasks. It allows much more fine-grained control and better decisions about whether to recursively split jobs or not, as the state is global for the entire R session. The (vague) plan is to generalize this for all platforms: if not binding to a particular core, then at least monitoring the assigned number of cores.

I did not know about the concept of 'CPU affinity masks', but I can quickly guess what the idea is, and it certainly provides richer control of CPU/core resources. Yes, it would be very helpful if it worked across platforms. Thanks for the heads up.

/Henrik
Re: [Rd] RInside, rcpp compilation problem
On Dec 4, 2012, at 10:00 PM, Jeff Goode wrote:
> I have spent some hours browsing the RInside and Rcpp documentation, lots of it; but ... as a programmer of C++ since 1990, on both Windows and Unix ... (Solaris and Ubuntu, and Mandrake/Mandriva Linux); I see a minor problem .. Where is the Rcpp.h header file??

In the Rcpp package, which RInside links to.

Please use the rcpp-devel mailing list for such questions (as per request of the authors).

Cheers,
Simon
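To check where the header lives on a given machine, one can ask R for the include directory of the installed Rcpp package. This is a small sketch using only system.file() from base R; it makes no assumption that Rcpp is installed and reports when it is not.

```r
# Locate Rcpp.h inside the installed Rcpp package, if any.
# system.file() returns "" when the package or file is not found.
rcpp_header <- system.file("include", "Rcpp.h", package = "Rcpp")

if (nzchar(rcpp_header)) {
  cat("Rcpp.h found at:", rcpp_header, "\n")
  # This is the directory to pass to the compiler with -I.
  cat("Include directory:", dirname(rcpp_header), "\n")
} else {
  cat("Rcpp does not appear to be installed in the current library paths\n")
}
```

The same call with package = "RInside" and "RInside.h" shows the second include directory a hand-written compile command would need.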
Re: [Rd] RInside, rcpp compilation problem
On 4 December 2012 at 22:47, Simon Urbanek wrote:
| On Dec 4, 2012, at 10:00 PM, Jeff Goode wrote:
|
| > I have spent some hours browsing the RInside and Rcpp documentation, lots of it; but ... as a programmer of C++ since 1990, on both Windows and Unix ... (Solaris and Ubuntu, and Mandrake/Mandriva Linux); I see a minor problem .. Where is the Rcpp.h header file??
|
| In the Rcpp package which RInside links to.

Correct. And Depends: upon.

| Please use rcpp-devel mailing list for such questions (as per request of the authors).

Mostly as a courtesy to readers of r-devel. And several different folks may respond via rcpp-devel, not all of whom read here as well. So I am redirecting via CC:; please feel free to keep follow-ups there (but you need to be subscribed to post).

| > The below code fails to compile as the RInside.h header file references the Rcpp.h header file, which is not included with the RInside download. This is the sample code provided in one of the RInside manuals:
| > #include
| >
| > #include <RInside.h>                    // for the embedded R via RInside
| > Rcpp::NumericMatrix createMatrix(const int n) {
| >     Rcpp::NumericMatrix M(n,n);
| >     for (int i=0; i<n; i++) {
| >         for (int j=0; j<n; j++) {
| >             M(i,j) = i*10 + j;
| >         }
| >     }
| >     return(M);
| > }

You appear to have clipped this from examples/standard/rinside_sample1.cpp

That very directory examples/standard, and its neighbouring directories, each have

  i) a Makefile for Linux, OS X, ...,
  ii) a Makefile.win for Win*, and
  iii) contributed cmake/ files,

all of which do the build. RInside needs itself, Rcpp and R, so a few -I and -L switches need to be set, which those three alternatives do for you.

So if you, say, do 'cp rinside_sample1.cpp jeff1.cpp' you can just say 'make jeff1' and the executable will be built. That is a feature. The Makefile should work for your projects, and the other makefiles in the neighbouring directories show how to do this with MPI, Qt, Wt, and (in SVN) Boost.

Dirk

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
[Rd] NAMESPACE problem: import(zoo) but 'zoo' could not be loaded
Hello:

I'm having problems creating a real NAMESPACE to replace the pro forma one in the fda package on R-Forge. "R CMD check" complains, "Error: package 'zoo' could not be loaded ... there is no package called 'zoo'"; see below. I get this both with and without "import(zoo)" in NAMESPACE.

Suggestions?

Thanks,
Spencer

p.s. The current code including this problem can be obtained through anonymous access via "svn checkout svn://svn.r-forge.r-project.org/svnroot/fda/".

C:\Users\sgraves\2012\R_pkgs\fda>R CMD check fda_2.3.3.tar.gz
* using log directory 'C:/Users/sgraves/2012/R_pkgs/fda/fda.Rcheck'
* using R version 2.15.2 (2012-10-26)
* using platform: i386-w64-mingw32 (32-bit)
* checking loading without being on the library search path ... WARNING
Loading required package: splines
Loading required package: zoo
Error: package 'zoo' could not be loaded
In addition: Warning message:
In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
  there is no package called 'zoo'
Execution halted

It looks like this package has a loading problem when not on .libPaths:
see the messages for details.

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] zoo_1.7-9

loaded via a namespace (and not attached):
[1] grid_2.15.2 lattice_0.20-10