Re: [Rd] R process killed when allocating too large matrix (Mac OS X)
> Kirill Müller
> on Wed, 11 May 2016 10:42:56 +0200 writes:

> My ulimit package exposes this API ([1], should finally submit it to
> CRAN); unfortunately this very API seems to be unsupported on OS X
> [2,3]. Last time I looked into it, neither of the documented settings
> achieved the desired effect.

> -Kirill

> [1] http://krlmlr.github.io/ulimit
> [2] http://stackoverflow.com/questions/3274385/how-to-limit-memory-of-a-os-x-program-ulimit-v-neither-m-are-working
> [3] https://developer.apple.com/library/ios/documentation/System/Conceptual/ManPages_iPhoneOS/man2/getrlimit.2.html

> On 10.05.2016 01:08, Jeroen Ooms wrote:
>> On 05/05/2016 10:11, Uwe Ligges wrote:
>>> Actually this also happens under Linux and I had my R processes killed
>>> more than once (and much worse also other processes so that we had to
>>> reboot a server, essentially).

I agree that Linux is not consistently fine here either.

>> I found that setting RLIMIT_AS [1] works very well on Linux. But this
>> requires that you cap memory to some fixed value.

Conceivably one could set a default cap, using something equivalent to the data in sfsmisc::Sys.meminfo() or sfsmisc::Sys.memGB() (very simple, > 10 year old interfaces, based on the Linux-only (?) '/proc/*' filesystem).

In an ideal world, some of us, from R core, Jeroen, Kirill, maintainer("microbenchmark"), ... would sit together and devise an R function interface (based on low-level platform-specific interfaces, specifically for at least Linux/POSIX-compliant, Mac, and Windows) which would allow something like your rlimit(..) calls below.

We'd really need something that works on all platforms, ideally, to be used by R package maintainers and possibly even better by R itself at startup, setting a reasonable memory cap, which the user could raise even to +Inf (or lower even more).

Martin

Final notes about RAppArmor, not relevant to the main thread topic:

Note: I'm working in a pretty well maintained Fedora Linux environment, but AppArmor is not only not activated, but not even available. OTOH, using RLIMIT / getrlimit on Linux is very generally available.

As a consequence, the three last lines of

  > require(RAppArmor)
  Loading required package: RAppArmor
  Loading required package: tools
  Failed to lookup process confinement:
  AppArmor not supported on this system
  Have a look at: sudo aa-status

were very confusing to me: My conclusion was I could not use the RAppArmor package.

(But that's wrong: For the rlimit*() functions below, one does *NOT* need an AppArmor-enabled version of Linux!)

>> > library(RAppArmor)
>> > rlimit_as(1e9)
>> > rnorm(1e9)
>> Error: cannot allocate vector of size 7.5 Gb
>>
>> The RAppArmor package has many other utilities to protect your server
>> from a misbehaving process, such as limiting CPU time
>> (RLIMIT_CPU), fork bombs (RLIMIT_NPROC) and file sizes (RLIMIT_FSIZE).
>>
>> [1] http://linux.die.net/man/2/getrlimit

and from my current explorations I gather that all of these are *not* AppArmor related... so could/should maybe rather migrate into a lightweight package not mentioning AppArmor?

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
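For readers unfamiliar with what rlimit_as() does underneath, here is a minimal C sketch of the RLIMIT_AS idea, assuming a Linux/POSIX system where the limit is actually enforced (the thread reports that OS X does not honour it). The helper names cap_address_space() and can_allocate() are hypothetical and are not taken from RAppArmor or ulimit; only getrlimit()/setrlimit() are real POSIX calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: cap this process's address space at `bytes`.
 * Only the soft limit is changed, so it can be raised again later,
 * up to the hard limit. Returns 0 on success, -1 on failure. */
static int cap_address_space(rlim_t bytes)
{
    struct rlimit lim;

    assert(bytes > 0);
    if (getrlimit(RLIMIT_AS, &lim) != 0)
        return -1;
    /* An unprivileged process cannot exceed the hard limit. */
    if (lim.rlim_max != RLIM_INFINITY && bytes > lim.rlim_max)
        bytes = lim.rlim_max;
    lim.rlim_cur = bytes;
    return setrlimit(RLIMIT_AS, &lim);
}

/* Hypothetical helper: returns 1 if `bytes` can be allocated, 0 otherwise.
 * The volatile write keeps the compiler from eliding the malloc/free pair. */
static int can_allocate(size_t bytes)
{
    volatile char *p = malloc(bytes);

    if (p == NULL)
        return 0;
    p[0] = 1;
    free((void *)p);
    return 1;
}
```

With a cap in place, an oversized request fails inside malloc() and the caller gets a NULL to handle, which is what lets R report "cannot allocate vector" instead of the process being killed by the operating system.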
Re: [Rd] where to send patches to R source code
> on Wed, 11 May 2016 23:00:20 -0700 writes:

> Dear R Developers,
> I wrote to this list a week ago with some patches that fix bugs in R's
> GNU Readline interface, but I haven't had a reply. I'm not in a hurry
> but I'd like to make sure that my message is getting read by the right
> people. Should I be sending my patches somewhere else?

Thank you Frederick for your reports and patches.
You did send them to the correct place, https://bugs.r-project.org/

Sometimes (as here) a combination of circumstances does lead to nobody picking them up quickly. In this case,

- probably none of R-core use or even have easy access to Arch Linux, so we cannot easily verify that there is a bug at all nor, consequently, verify that your patch does fix the bug.

- no other user has confirmed the bug on his/her platform, so there did not seem to be a huge demand...

- Coincidentally, many in R core may be busy with other bugs, teaching, ... and just lack the extra resources to delve into these problems at the current moment.

Hence, there was not even an 'Acknowledged' change to your reports--indeed, as nobody had been able to see that there is a problem existing outside of your personal computer.

I agree that this must seem a bit frustrating to you.

--
Martin
Re: [Rd] Too many spaces in deparsed complex numbers with digits17 control option
> Richard Cotton
> on Thu, 5 May 2016 09:37:42 +0300 writes:

> If you set the "digits17" control option in deparse, you get a lot of
> unnecessary space in the representation of complex numbers.

> > deparse(0 + 0i)
> [1] "0+0i"
> > deparse(0 + 0i, control = "digits17")
> [1] "0 +                 0i"

> As far as I can tell, the logic for this comes from this piece of
> /src/main/deparse.c:

>     if (TYPEOF(vector) == CPLXSXP && (d->opts & DIGITS16)) {
>         Rcomplex z = COMPLEX(vector)[i];
>         if (R_FINITE(z.r) && R_FINITE(z.i)) {
>             snprintf(hex, 64, "%.17g + %17gi", z.r, z.i);
>             strp = hex;
>         } else
>             strp = EncodeElement(vector, i, quote, '.');
>     }

> I think this is a small bug, and that "%17gi" in the snprintf call
> ought to be "%.17gi".

> Also there shouldn't be any space around the plus sign for consistency
> with the non-digits17 option.

> Is this a real bug, or is it deliberate behaviour?

> --
> Regards,
> Richie

Thank you, Richie!

I agree it should be improved ... actually, there is even another improvement, so we don't get things like '2+-1i' but rather '2-1i' (namely to use the '+' format modifier option for printf).

I have committed a change to R-devel (only, for now), svn rev 70601 specifically.

Martin

--
Martin    http://stat.ethz.ch/people/maechler
Seminar für Statistik, ETH Zürich  HG G 16  Rämistrasse 101
CH-8092 Zurich, SWITZERLAND
phone: +41-44-632-3408    fax: ...-1228    <><
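The two printf details discussed in this message can be tried in isolation. In "%17g" the 17 is a minimum field width, so short numbers are padded with spaces, whereas in "%.17g" it is a precision; the '+' flag makes printf emit the sign of the imaginary part itself. The sketch below is a hypothetical stand-alone helper (format_complex17 is an invented name), not the code that was committed to deparse.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write re + im*i into buf with up to 17 significant
 * digits and no padding. "%+.17g" prints the sign of the imaginary part,
 * so the result is "2-1i" rather than "2+-1i".
 * Returns what snprintf returns (the length that was needed). */
static int format_complex17(char *buf, size_t size, double re, double im)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%.17g%+.17gi", re, im);
}
```

A caller would pass a buffer of at least 64 bytes, as the quoted deparse.c code does, and check the return value against the buffer size if truncation matters.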
Re: [Rd] R process killed when allocating too large matrix (Mac OS X)
On Thu, May 12, 2016 at 9:51 AM, Martin Maechler wrote:
> My conclusion was I could not use the RAppArmor package.
>
> (But that's wrong: For the rlimit*() functions below, one does
> *NOT* need an AppArmor-enabled version of Linux!)

Yes, it is a relatively recent (unadvertised) feature that the package now builds on Linux systems without libapparmor. I agree this makes the package name confusing. I'll at least make that warning more informative.

Some background: When I started the package (5 years ago) I expected that soon all Linux distributions would have the AppArmor module, which has been in the kernel since 2.6.36. However, Red Hat is explicitly disabling it because they are pushing a competing MAC system (SELinux) which they develop together with the NSA, and they really want you to use this instead (and only this!).

> I gather that all of these are *not* AppArmor related... so could/should
> maybe rather migrate into a lightweight package not mentioning AppArmor?

I agree, it has been on the to-do list for a while; Kirill and I were talking yesterday about what would be the best route to take:

- A small package with only the rlimit bindings
- or: A 'linux' package with bindings to anything in the kernel, including rlimit, but possibly other system tools.
- or: A package targeting POSIX/unix with standard functionality that is also available on OSX/BSD.

From my experience, Windows is pretty useless for this kind of stuff.
Re: [Rd] R process killed when allocating too large matrix (Mac OS X)
On Thu, 12 May 2016 12:07:29 +0200 Jeroen Ooms wrote:

> - A small package with only the rlimit bindings
> - or: A 'linux' package with bindings to anything in the kernel,
>   including rlimit, but possibly other system tools.
> - or: A package targeting POSIX/unix with standard functionality that
>   is also available on OSX/BSD.
>
> From my experience, Windows is pretty useless for this kind of stuff.

Maybe not so useless after reading [1], about programmatically querying the OS for the available memory, and [2], about pushing the OS's limits. The latter page is part of a series in which each topic is valuable on its own.

[1] https://msdn.microsoft.com/en-us/library/aa366778.aspx
[2] https://blogs.technet.microsoft.com/markrussinovich/2008/07/21/pushing-the-limits-of-windows-physical-memory/
[Rd] Suggestion: mkString(NULL) should be NA
I would like to propose that Rf_mkString(NULL) and Rf_mkChar(NULL) return NA rather than segfault.

Case: the mkString() and mkChar() functions are convenient to wrap strings returned by e.g. external C libraries into an R vector. However, sometimes a library returns NULL instead of a string when the result is unavailable. In some C libraries this can happen unexpectedly or is even undocumented.

A good R package author always checks results for a null pointer, and deals with it accordingly. But sometimes we make assumptions. There was an example in the 'curl' package where a documented version string was suddenly NULL if libcurl was built with some unusual configuration. These problems are hard to catch and I don't see any benefit of segfaulting for such edge cases.

Some packages use a macro like this to protect against such problems:

  #define make_string(x) x ? Rf_mkString(x) : ScalarString(NA_STRING)

But I think it would make sense if this was the default behavior in Rf_mkString and Rf_mkChar.
[Rd] Single-threaded aspect
R Developers,

Could someone help explain what it means that R is single threaded? I am trying to understand what is actually going on inside R when users want to parallelize code. For example, using mclapply or foreach (with some backend) somehow allows users to benefit from multiple CPUs.

Similarly there is the RcppParallel package for RMatrix/RVector objects. But none of these address the general XPtr objects in Rcpp. Some readers here may recognize my question on SO (http://stackoverflow.com/questions/37167479/rcpp-parallelize-functions-that-return-xptr) where I was curious about parallel calls to C++/Rcpp functions that return XPtr objects. I am being a little more persistent here as this limitation provides a very hard stop on the development of one of my packages that heavily uses XPtr objects. It's not meant to be a criticism or intended to be rude, I just want to fully understand.

I am willing to accept that it may be impossible currently, but I want to at least understand why it is impossible so I can explain to future users why parallel functionality is not available. Which just echoes my original question: what does it mean that R is single threaded?

Kind Regards,
Charles
Re: [Rd] Single-threaded aspect
On 12/05/2016 8:45 AM, Charles Determan wrote:
> R Developers,
>
> Could someone help explain what it means that R is single threaded? I am
> trying to understand what is actually going on inside R when users want to
> parallelize code. For example, using mclapply or foreach (with some
> backend) somehow allows users to benefit from multiple CPUs.

I don't know what document you are quoting when you say "R is single threaded", but one possible meaning is that most base R calculations are done in a single thread. When you do vectorized calculations like x+y for long vectors x and y, they are done internally as loops over the entries.

On Windows, there are two threads when running Rterm, with one to maintain the display, since otherwise the plot display couldn't update while R is waiting for input.

The mclapply function in the parallel package forks the process to do its calculations. Other packages can do other variations on parallel computations.

I can't help you with the rest of your question; I don't know what XPtr objects are.

Duncan Murdoch
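The forking approach mentioned for mclapply can be illustrated at the operating-system level. The following is a minimal POSIX C sketch of the general fork-and-collect pattern, not the implementation in the parallel package: each child works on its own copy of the parent's memory and sends a partial result back through a pipe. The names sum_squares() and forked_sum_squares() are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sum of squares over [lo, hi): stands in for any independent chunk of work. */
static double sum_squares(long lo, long hi)
{
    double s = 0.0;
    for (long i = lo; i < hi; i++)
        s += (double)i * (double)i;
    return s;
}

/* Hypothetical helper: split [0, n) across `nproc` forked children and add
 * up their partial results. Returns -1.0 if any system call or child fails. */
static double forked_sum_squares(long n, int nproc)
{
    enum { MAX_PROC = 64 };
    int read_fd[MAX_PROC];
    pid_t pid[MAX_PROC];
    int started = 0, ok = 1;
    double total = 0.0;

    assert(n >= 0 && nproc > 0 && nproc <= MAX_PROC);

    for (int k = 0; k < nproc; k++) {
        int fd[2];
        if (pipe(fd) != 0) { ok = 0; break; }
        pid_t child = fork();
        if (child < 0) { close(fd[0]); close(fd[1]); ok = 0; break; }
        if (child == 0) {
            /* Child: has a private copy of the parent's memory. */
            close(fd[0]);
            double part = sum_squares(n * k / nproc, n * (k + 1) / nproc);
            ssize_t w = write(fd[1], &part, sizeof part);
            _exit(w == (ssize_t)sizeof part ? 0 : 1);
        }
        close(fd[1]);            /* parent keeps only the read end */
        read_fd[k] = fd[0];
        pid[k] = child;
        started++;
    }

    /* Collect one partial result per child, then reap the child. */
    for (int k = 0; k < started; k++) {
        double part = 0.0;
        int status = 0;
        if (read(read_fd[k], &part, sizeof part) != (ssize_t)sizeof part)
            ok = 0;
        close(read_fd[k]);
        if (waitpid(pid[k], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
        total += part;
    }
    return ok ? total : -1.0;
}
```

Because every worker is a separate process, nothing is shared between them, so the single-threaded design of the interpreter is never exercised from two threads at once. The cost is that results have to be copied back to the parent.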
Re: [Rd] Single-threaded aspect
Charles,

1. Perhaps this question is better directed at the R-help or R-package-devel mailing list.

2. It basically means that R itself can only evaluate one R expression at a time.

The parallel package circumvents this by starting multiple R sessions and dividing the workload.

Compiled code called by R (such as C++ code through Rcpp or C code through base R's interface) can execute multi-threaded code for internal purposes, using e.g. OpenMP. A limitation is that compiled code cannot call R's C API from multiple threads (in many cases). For example, it is not thread-safe to create R variables from multiple threads running in C. (R's variable administration is such that the order of (un)making them from compiled code matters.)

I am not very savvy on Rcpp or XPtr objects, but it appears that Dirk provided answers about that in your SO question.

Best,
Mark

Op do 12 mei 2016 om 14:46 schreef Charles Determan:

> R Developers,
>
> Could someone help explain what it means that R is single threaded? I am
> trying to understand what is actually going on inside R when users want to
> parallelize code. For example, using mclapply or foreach (with some
> backend) somehow allows users to benefit from multiple CPUs.
Re: [Rd] Single-threaded aspect
Thanks for the replies. Regarding the answer by Dirk, I still didn't feel I understood the reasoning why mclapply or foreach cannot handle XPtr objects. Rather than clutter the SO question with comments, I asked here, because I was getting the impression that this was a limitation inherited from R objects (XPtr is supposed to be a proxy for an R object, according to Dirk's comment). If this is not the case, I could repost this on Rcpp-devel unless it could be migrated.

Regards,
Charles

On Thu, May 12, 2016 at 8:11 AM, Mark van der Loo wrote:

> Compiled code called by R (such as C++ code through Rcpp or C code through
> base R's interface) can execute multi-threaded code for internal purposes,
> using e.g. OpenMP. A limitation is that compiled code cannot call R's C API
> from multiple threads (in many cases). For example, it is not thread-safe
> to create R variables from multiple threads running in C. (R's variable
> administration is such that the order of (un)making them from compiled code
> matters.)
>
> I am not very savvy on Rcpp or XPtr objects, but it appears that Dirk
> provided answers about that in your SO question.
Re: [Rd] where to send patches to R source code
> On 12 May 2016, at 10:03, Martin Maechler wrote:
>
>> on Wed, 11 May 2016 23:00:20 -0700 writes:
>
>> Dear R Developers,
>> I wrote to this list a week ago with some patches that fix bugs in R's
>> GNU Readline interface, but I haven't had a reply. I'm not in a hurry
>> but I'd like to make sure that my message is getting read by the right
>> people. Should I be sending my patches somewhere else?
>
> Thank you Frederick for your reports and patches.
> You did send them to the correct place, https://bugs.r-project.org/
>
> Sometimes (as here) a combination of circumstances does lead to
> nobody picking them up quickly.
> In this case,
>
> - probably none of R-core use or even have easy access to Arch Linux
>   so we cannot easily verify that there is a bug at all
>   nor, consequently, verify that your patch does fix the bug.

Actually, the bugs look like they should apply fairly generally, just maybe not bothering all that many people. But there could be portability issues with the fixes, so I suspect some of us were waiting for "a readline expert" to check them out.

-pd

BTW: Anyone with a fix for the stuck-at-eol issue? (aaabbbccc)

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com
Re: [Rd] Single-threaded aspect
As others said, XPtr is not something in R, so the Rcpp mailing list would be the right place for that aspect. However, if you forget Rcpp and phrase it as an R question, you also get much closer to the reason and answer.

The SEXP type is the internal representation of all objects in R. I assume your question is which operations in the R API on those are thread-safe. The answer is that most of them are not, the main reason being that the memory management is not thread-safe, i.e. you cannot allocate anything without synchronization. Since almost all API calls involve some memory allocations, they are not thread-safe.

You can, however, allocate objects and then operate on their payload, e.g., you can get numerical input vectors, allocate the result vector and then perform your threaded computation in C on those, synchronize and get back - that's how most implicit parallel operations in R work (leveraging BLAS, OpenMP, etc.). That is also what Dirk replied to your SO question (quote: "Packages like RcppParallel are very careful about using non-R data structures for multithreaded work."). Note that the payload of most native vectors (integer, real, complex) is technically a non-R data structure in that sense, so you can operate on those directly (some read-only operations are also thread-safe in the API as long as they can't trigger errors/warnings/side-effects).

For completeness, memory allocation is not the only reason or obstacle for thread-safe R API calls, but a main one. Other issues involve error handling (you may long-jump out of your thread stack) and global state (devices, connections etc.). In short, it's not something that can really be solved without a complete re-design and re-write.

Cheers,
Simon

> On May 12, 2016, at 9:16 AM, Charles Determan wrote:
>
> Thanks for the replies. Regarding the answer by Dirk, I didn't feel like I
> still understood the reasoning why mclapply or foreach cannot handle XPtr
> objects. Instead of cluttering the SO question with comments I was getting
> the impression that this was a limitation inherited with R objects (which
> XPtr is supposed to be a proxy for an R object according to Dirk's
> comment). If this is not the case, I could repost this on Rcpp-devel
> unless it could be migrated.
>
> Regards,
> Charles
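The pattern described in this message (allocate everything first, let threads touch only the numeric payload, synchronize, then return) can be sketched in plain C with POSIX threads. This is an illustration under stated assumptions, not code from R or RcppParallel: parallel_add() and struct slice are hypothetical names, and in an R package the three buffers would be the payloads of vectors allocated on the main R thread before any worker starts. It may need to be compiled with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One unit of work: a slice of plain C arrays. No allocation and no calls
 * into any interpreter API happen inside the worker. */
struct slice {
    const double *x;
    const double *y;
    double *out;
    size_t begin, end;
};

static void *add_slice(void *arg)
{
    struct slice *s = arg;
    for (size_t i = s->begin; i < s->end; i++)
        s->out[i] = s->x[i] + s->y[i];
    return NULL;
}

/* Hypothetical helper: out[i] = x[i] + y[i] using `nthreads` workers.
 * The caller owns and has already allocated all three buffers.
 * Returns 0 on success, -1 if a thread could not be created or joined. */
static int parallel_add(const double *x, const double *y, double *out,
                        size_t n, int nthreads)
{
    enum { MAX_THREADS = 16 };
    pthread_t tid[MAX_THREADS];
    struct slice job[MAX_THREADS];
    int started = 0, rc = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    for (int k = 0; k < nthreads; k++) {
        job[k] = (struct slice){ x, y, out,
                                 n * k / nthreads, n * (k + 1) / nthreads };
        if (pthread_create(&tid[k], NULL, add_slice, &job[k]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Synchronize: wait for every worker before handing control back. */
    for (int k = 0; k < started; k++)
        if (pthread_join(tid[k], NULL) != 0)
            rc = -1;
    return rc;
}
```

The slices do not overlap, so the workers need no locking among themselves; the only synchronization point is the join before the function returns.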
Re: [Rd] Single-threaded aspect
The R language itself has features that limit how much multithreading/parallel processing can be done. There are functions with side effects, such as library(), plot(), runif(), <-, and <<-, and there are no mechanisms to isolate them.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, May 12, 2016 at 5:45 AM, Charles Determan wrote:

> R Developers,
>
> Could someone help explain what it means that R is single threaded? I am
> trying to understand what is actually going on inside R when users want to
> parallelize code. For example, using mclapply or foreach (with some
> backend) somehow allows users to benefit from multiple CPUs.
Re: [Rd] Single-threaded aspect
On 12 May 2016 at 13:11, Mark van der Loo wrote: | Charles, | | 1. Perhaps this question is better directed at the R-help or | R-pacakge-devel mailinglist. | | 2. It basically means that R itself can only evaluate one R expression at | the time. | | The parallel package circumvents this by starting multiple R-sessions and | dividing workload. | | Compiled code called by R (such as C++ code through RCpp or C-code through | base R's interface) can execute multi-threaded code for internal purposes, | using e.g. openMP. A limitation is that compiled code cannot call R's C API | from multiple threads (in many cases). For example, it is not thread-safe | to create R-variables from multiple threads running in C. (R's variable | administration is such that the order of (un)making them from compiled code | matters). Well put. | I am not very savvy on Rcpp or XPtr objects, but it appears that Dirk | provided answers about that in your SO-question. Charles seems to hang himself up completely about a small detail, failing to see the forest for the trees. There are (many) working examples of parallel (compiled) code with R. All of them stress (and I simplify here) that can you touch R objects, or call back into R, for fear of any assignment or allocation triggering an R event. R being single-threaded it cannot do this. My answer to this problem is to only use non-R data structures. That is what RcpParallel does in the actual parallel code portions in all examples -- types RVector and RMatrix do NOT connect back to R. There are several working examples. That is also what the OpenMP examples at the Rcpp Gallery do. Charles seems to be replying 'but I use XPtr' or 'I use XPtr on arma::mat or Eigen::Matrixxd' and seems to forget that these are proxy objects to SEXPs. XPtr just wrap the SEXP for external pointers; Arma's and Eigen's matrices are performant via RcppArmadillo and RcppEigen because we use R memory via proxies. All of that is 'too close to R' for comfort. 
So the short answer is: enter compiled code from R, set a mutex (either conceptually or explicitly), _copy_ your data in to plain C++ data structures and go to town in parallel via OpenMP and other multithreaded approaches. Then collect the result, release the mutex and move back up. I hope this help. Dirk | | Best, | Mark | | | | | | | | | | | Op do 12 mei 2016 om 14:46 schreef Charles Determan : | | > R Developers, | > | > Could someone help explain what it means that R is single threaded? I am | > trying to understand what is actually going on inside R when users want to | > parallelize code. For example, using mclapply or foreach (with some | > backend) somehow allows users to benefit from multiple CPUs. | > | > Similarly there is the RcppParallel package for RMatrix/RVector objects. | > But none of these address the general XPtr objects in Rcpp. Some readers | > here may recognize my question on SO ( | > | > http://stackoverflow.com/questions/37167479/rcpp-parallelize-functions-that-return-xptr | > ) | > where I was curious about parallel calls to C++/Rcpp functions that return | > XPtr objects. I am being a little more persistent here as this limitation | > provides a very hard stop on the development on one of my packages that | > heavily uses XPtr objects. It's not meant to be a criticism or intended to | > be rude, I just want to fully understand. | > | > I am willing to accept that it may be impossible currently but I want to at | > least understand why it is impossible so I can explain to future users why | > parallel functionality is not available. Which just echos my original | > question, what does it mean that R is single threaded? 
| >
| > Kind Regards,
| > Charles

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
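The "copy, then go parallel, then collect" recipe in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain std::thread rather than OpenMP or RcppParallel, it includes no Rcpp headers so that it compiles on its own, and the function name parallel_sum_squares and its arguments are hypothetical. The join step plays the role of the "conceptual mutex": the calling thread blocks until every worker is done, and only the calling thread would ever talk to R.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sum of squares over a plain C++ copy of the input.
// `input` stands in for data that originally came from R; in a real package
// the copy below would be made on the main thread before any worker starts.
double parallel_sum_squares(const double* input, std::size_t n, unsigned n_threads) {
    assert(n_threads > 0);

    // Step 1 (main thread only): copy into a structure R knows nothing about.
    std::vector<double> data(input, input + n);

    // Step 2: each worker reads its own slice and writes its own result slot.
    std::vector<double> partial(n_threads, 0.0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (n + n_threads - 1) / n_threads;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            double acc = 0.0;
            for (std::size_t i = begin; i < end; ++i) acc += data[i] * data[i];
            partial[t] = acc;  // plain memory only, no R API call in here
        });
    }

    // Step 3: wait for all workers before touching the results.
    for (auto& w : workers) w.join();

    // Step 4 (main thread only): combine the partial results. This is the
    // point at which a value would be handed back to R.
    double total = 0.0;
    for (double p : partial) total += p;
    return total;
}
```

Calling parallel_sum_squares(x, 4, 2) on the array {1, 2, 3, 4} splits the work into two slices and adds the two partial sums on the calling thread. On some toolchains std::thread needs -pthread at link time.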
Re: [Rd] Single-threaded aspect
Thank you Simon for the detailed reply. That explains much more of what I was
looking for from the R side.

Dirk, I'm sorry if I seem hung up on anything here but I am trying to
understand the details. My reply about XPtr or XPtr on arma/Eigen was to
confirm my understanding was correct, which it appears it was. I was not aware
that the RVector/RMatrix objects don't connect to R, as I am just now
familiarizing myself with the package; that explains more of my confusion. I
will look at doing work within the compiled code as you have suggested.

Regards,
Charles
Re: [Rd] Single-threaded aspect
On 12 May 2016 at 09:18, Dirk Eddelbuettel wrote:
| There are (many) working examples of parallel (compiled) code with R. All of
| them stress (and I simplify here) that can you touch R objects, or call back

An important 'not' is missing here (and a reordering): "that you CANNOT touch
R objects".

Sorry, Dirk

| into R, for fear of any assignment or allocation triggering an R event. R
| being single-threaded it cannot do this.
Re: [Rd] Single-threaded aspect
On 12 May 2016 at 09:25, Charles Determan wrote:
| Thank you Simon for the detailed reply. That explains much more of what I was
| looking for from the R side.
|
| Dirk, I'm sorry if I seem hung up on anything here but I am trying to
| understand the details. My reply about XPtr or XPtr on arma/Eigen was to
| confirm my understanding was correct, which it appears it was. I was not aware

I still do not think so. Step back, have a cup of tea or two, and start with
the simple and short OpenMP examples in Rcpp itself. They have been there for
years and should still work. I would encourage you to work through these,
maybe take notes and possibly even submit the notes as a new short piece in
the Rcpp Gallery.

| the RVector/RMatrix objects don't connect to R as I am just now familiarizing
| myself with the package, that explains more of my confusion. I will look at
| doing work within the compiled code as you have suggested.

Sounds good. OpenMP and Intel TBB (as in RcppParallel) will only become more
important as we move to more and more cores. Working with them is not all that
obvious, as you are finding out. Let's try to work to make the documentation
better.

Dirk

| Regards,
| Charles
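For readers who want a feel for what a "simple and short" OpenMP loop looks like, here is a minimal sketch. It is not taken from the Rcpp examples mentioned above; it deliberately uses no Rcpp headers so that it compiles standalone, and the function name dot is just an illustration. The point it tries to show is that the loop body touches only plain C++ memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of two plain std::vector<double> inputs.
// The pragma is honoured when the compiler has OpenMP enabled (for example
// -fopenmp with GCC or Clang) and ignored otherwise, in which case the loop
// simply runs serially and gives the same answer.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];  // only plain C++ memory is read in here
    }
    return sum;
}
```

The reduction clause gives each thread a private copy of sum and combines the copies when the loop ends, so no explicit locking is needed for the accumulator.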
Re: [Rd] where to send patches to R source code
Hi Peter, Martin, and others,

Thanks for your replies.

- The bugs apply to all systems that use GNU Readline, not just Linux or
  Arch Linux.

- Readline version 6.3 changed the signal handling so that SIGWINCH is no
  longer handled automatically by the library. This means it's not
  currently possible for people using R on e.g. Linux to resize the
  terminal, or at least when they do so they have to make sure that all
  their commands fit in one line and don't wrap.

- There is also a long-standing bug in Readline where the callback
  interface didn't properly clear the line on SIGINT (^C). This means that
  "exiting" reverse-incremental-search with ^C would give an apparently
  empty prompt which still had some pending input, so if you hit ^C-Return
  then an unintended command would get executed.

If they're not "bothering all that many people", then perhaps it's because
everyone uses Windows or Mac OS X or RStudio. For me these are pretty
significant bugs. The second one causes unintended code to be executed.
Random code could delete files, for example, or worse. The first one bites
me every time I want to change the size of a window, which is pretty often.

I tried to get Readline maintainer Chet Ramey to fix these on the Readline
side, but he disagreed with my proposal:

https://lists.gnu.org/archive/html/bug-readline/2016-04/threads.html

I'm glad that my message here at least was seen and I hope that someone who
uses the R command line on Linux will have time to verify that the patches
work correctly. They are basically Chet-approved workarounds for
bugs/changes in Readline, not very complicated.

Do either of you know a Linux R person you could ping to get these patches
checked out? I'm not overly frustrated, and I'm not in a major hurry, but
from what we've observed it seems like waiting for someone concerned to come
along and finally read Bugzilla or the R-Devel archives is not going to
result in a very dense Poisson process...

Thanks,

Frederick Eaton

On Thu, May 12, 2016 at 03:42:59PM +0200, peter dalgaard wrote:
>
> > On 12 May 2016, at 10:03 , Martin Maechler wrote:
> >
> >> on Wed, 11 May 2016 23:00:20 -0700 writes:
> >
> >> Dear R Developers,
> >> I wrote to this list a week ago with some patches that fix bugs in R's
> >> GNU Readline interface, but I haven't had a reply. I'm not in a hurry
> >> but I'd like to make sure that my message is getting read by the right
> >> people. Should I be sending my patches somewhere else?
> >
> > Thank you Frederick for your reports and patches.
> > You did send them to the correct place, https://bugs.r-project.org/
> >
> > Sometimes (as here) a combination of circumstances do lead to
> > nobody picking them up quickly.
> > In this case,
> >
> > - probably none of R-core use or even have easy access to Arch Linux
> >   so we cannot easily verify that there is a bug at all
> >   nor -consequently- verify that your patch does fix the bug.
>
> Actually, the bugs look like they should apply fairly generally, just maybe
> not bothering all that many people. But there could be portability issues
> with the fixes, so I suspect some of us were waiting for "a readline expert"
> to check them out.
>
> -pd
>
> BTW: Anyone with a fix for the stuck-at-eol issue? (aaabbbccc)
>
> > - no other user has confirmed the bug on his/her platform, so
> >   there did not seem a huge demand...
> >
> > - Accidentally many in R core may be busy with other bugs, teaching, .
> >   and just lack the extra resources to delve into these problems
> >   at the current moment.
> >
> > Hence, there was not even an 'Acknowledged' change to your
> > reports--indeed as nobody had been able to see there is a problem
> > existing outside of your personal computer.
> >
> > I agree that this must seem a bit frustrating to you.
> >
> > --
> > Martin
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk Priv: pda...@gmail.com
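For the SIGWINCH point in the message above, one common way for an application to handle the signal itself is to record it in a flag and act on it from the main input loop. The sketch below shows only that general pattern; it is not the patch under discussion, it does not link against Readline, and the names on_sigwinch, install_sigwinch_handler, service_pending_resize and resize_calls are all hypothetical. In a Readline-based program the placeholder counter would presumably be replaced by a call to rl_resize_terminal().

```cpp
#include <cassert>
#include <signal.h>

// Set by the signal handler, consumed by the main loop.
static volatile sig_atomic_t winch_pending = 0;

// Hypothetical counter standing in for the real work; a Readline-based
// program would call rl_resize_terminal() where this is incremented.
static int resize_calls = 0;

// The handler only records that a resize happened, which keeps it
// async-signal-safe.
extern "C" void on_sigwinch(int) {
    winch_pending = 1;
}

// Install the handler for SIGWINCH (POSIX only).
void install_sigwinch_handler() {
    struct sigaction sa = {};
    sa.sa_handler = on_sigwinch;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    const int rc = sigaction(SIGWINCH, &sa, nullptr);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined
}

// Called from the main input loop between reads.
void service_pending_resize() {
    if (winch_pending) {
        winch_pending = 0;
        ++resize_calls;  // placeholder for rl_resize_terminal()
    }
}
```

Doing the actual resize work outside the handler avoids calling functions that are not async-signal-safe from signal context.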
[Rd] objects intermittently shared between vignettes
Hello:

Is it widely known that objects not explicitly deleted from one vignette in a
package can be available to a second in "R CMD build" and "R CMD check" but
not when the second vignette is built manually, at least under RStudio on
Mac OS X 10.11.4 using R 3.3.0?

For an example, see

    install.packages("pkgW2vignettes", repos="http://R-Forge.R-project.org")

This package contains two toy vignettes.

This is not a big deal, but it spooked me when I saw 227 objects listed by
"objects()" in a vignette built by "R CMD build" + "R CMD check", when the
same "objects()" command only listed 4 objects when I built the same vignette
by itself from within RStudio.

Enjoy,
Spencer Graves