On Apr 11, 2014, at 3:47 PM, Romain Francois <rom...@r-enthusiasts.com> wrote:
> Hello,
>
> I’ve been using shrinking in
> https://github.com/hadley/dplyr/blob/master/inst/include/tools/ShrinkableVector.h
>
> This defines a ShrinkableVector of some R type (INTSXP, ...) given the
> maximum number of elements it will hold. Then I reset it with SETLENGTH
> when needed. The constructor protects the SEXP, and the destructor
> restores the original length before removing the protection. With this
> I only have to allocate the data once, and I can make R believe a
> vector is of a different size, as long as I restore the correct size
> eventually.

I like the destructor touch of restoring the size :) - that is neat. But as I said, this is only useful in cases where you strip off a few elements; otherwise you're better off creating a copy because of the memory implications.

Cheers,
Simon

> Kevin, when you start using parallelism, you have to change the way you
> approach the sequence of things that go on. In particular, it is less
> of a problem to do a double pass, i.e. one to figure out the
> appropriate size and one to handle part of the data. And guess what,
> there is lots of that to come in next versions of Rcpp11.
>
> Romain
>
> On 11 Apr 2014, at 17:08, Simon Urbanek <simon.urba...@r-project.org> wrote:
>
>> Kevin,
>>
>> On Apr 10, 2014, at 4:57 PM, Kevin Ushey <kevinus...@gmail.com> wrote:
>>
>>> Suppose I generate an integer vector with e.g.
>>>
>>> SEXP iv = PROTECT(allocVector(INTSXP, 100));
>>>
>>> and later want to shrink the object, e.g.
>>>
>>> shrink(iv, 50);
>>>
>>> which would simply re-set the length to 50 and allow R to reclaim the
>>> memory that was previously used.
>>>
>>> Is it possible to do this while respecting how R manages memory?
>>
>> The short answer is: no.
>>
>> There are several problems with this, one of the main ones being that
>> there is simply no way to release the "excess" memory, so the vector
>> still has the full length in memory. There is the SETLENGTH() function,
>> but it's not part of the API, and it has been proposed for elimination
>> because of the inherent issues it causes (a discrepancy between the
>> allocated and reported lengths).
>>
>>> The motivation: there are many operations where the length of the
>>> output is not known ahead of time, and in such cases one typically
>>> uses a data structure that can grow efficiently. Unfortunately, IIUC
>>> SEXPRECs cannot do this; however, an alternative possibility would
>>> involve reserving extra memory and then shrinking to fit after the
>>> operation is complete.
>>>
>>> There have been some discussions previously that defaulted to answers
>>> of the form "you should probably just copy", e.g.
>>> https://stat.ethz.ch/pipermail/r-devel/2008-March/048593.html, but I
>>> wanted to ping and see if others had ideas, or if perhaps there was
>>> code in the R sources that might be relevant.
>>>
>>> Another reason why this is interesting is C++11 and multi-threading:
>>> if I can pre-allocate SEXPs that will contain results in the main
>>> thread, and then fill these SEXPs asynchronously (without touching R,
>>> and hence not getting in the way of the GC or otherwise), I can fill
>>> these SEXPs in place and shrink-to-fit after the computations have
>>> been completed. With C++11 support coming with R 3.1.0, functionality
>>> like this is very attractive.
>>
>> I don't see how this is related to the question - it was always
>> possible to fill SEXPs from parallel threads, and that has been
>> routinely used even in R itself (most commonly via OpenMP).
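To make Simon's last point concrete, here is a minimal sketch of filling a
pre-allocated SEXP from worker threads via OpenMP. This is illustrative
code, not anything from R or dplyr: the entry point `parallel_sqrt` and
the transform are made up. The key constraint is that all allocation and
PROTECT/UNPROTECT happen on the main thread; the parallel region only
writes into the already-allocated buffer.

    #include <Rinternals.h>
    #include <cmath>

    // Callable from R via .Call("parallel_sqrt", x) on a REALSXP vector.
    extern "C" SEXP parallel_sqrt(SEXP input) {
        R_xlen_t n = Rf_xlength(input);

        // Allocate and protect on the main thread only.
        SEXP out = PROTECT(Rf_allocVector(REALSXP, n));
        const double *in = REAL(input);
        double *res = REAL(out);

        // No R API calls and no allocation inside the parallel region --
        // the workers just write into the pre-allocated buffer.
        #pragma omp parallel for
        for (R_xlen_t i = 0; i < n; i++)
            res[i] = std::sqrt(in[i]);

        UNPROTECT(1);
        return out;
    }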
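And since the ShrinkableVector idiom is what started this exchange, here
is a minimal sketch of the RAII pattern Romain describes: allocate at full
capacity once, use SETLENGTH to make R see a shorter vector while working,
and let the destructor restore the true length before unprotecting. This
is an illustration of the idea only, not the actual ShrinkableVector.h
linked above, and it leans on SETLENGTH, which (as Simon notes) is not
part of the official API.

    #define USE_RINTERNALS  // exposes SETLENGTH; non-API, at your own risk
    #include <Rinternals.h>

    class ShrinkableVector {
    public:
        // Allocate once at the maximum capacity and protect the SEXP.
        ShrinkableVector(SEXPTYPE type, R_xlen_t capacity)
            : data_(PROTECT(Rf_allocVector(type, capacity))),
              capacity_(capacity) {}

        // Make R believe the vector holds n elements (n <= capacity).
        void resize(R_xlen_t n) { SETLENGTH(data_, n); }

        // Restore the true allocated length before unprotecting, so the
        // GC never sees a length that disagrees with the allocation.
        // Assumes the usual protect-stack discipline: this object was
        // the most recently protected value when it is destroyed.
        ~ShrinkableVector() {
            SETLENGTH(data_, capacity_);
            UNPROTECT(1);
        }

        operator SEXP() const { return data_; }

    private:
        SEXP data_;
        R_xlen_t capacity_;
    };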
>>> The obvious alternatives are to 1) determine the length of the output
>>> first and hence generate SEXPs of the appropriate size right off the
>>> bat (potentially expensive), and 2) fill thread-safe containers and
>>> copy to an R object (definitely expensive).
>>
>> In most current OSes, it is impossible to shrink allocated memory
>> in-place, so if you really don't know the size of the object, it will
>> be copied anyway. As mentioned above, the only case where shrinking may
>> work is if you only need to strip a few elements off a large vector, so
>> that keeping the same allocation has no significant effect.
>>
>> Cheers,
>> Simon
>>
>>> I am probably missing something subtle (or obvious) as to why this may
>>> not work, or be recommended, so I appreciate any comments.
>>>
>>> Thanks,
>>> Kevin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
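As a footnote to alternative 1) above - and to the "double pass" Romain
mentions - here is a minimal sketch of the two-pass approach: a first pass
computes the output length, so the result can be allocated at exactly the
right size before a second pass fills it. The function name and the
filtering predicate are made up for illustration.

    #include <Rinternals.h>

    // Keep the positive elements of an INTSXP vector, two-pass style.
    extern "C" SEXP keep_positive(SEXP x) {
        R_xlen_t n = Rf_xlength(x);
        const int *in = INTEGER(x);

        // Pass 1: count the survivors, so no shrinking is ever needed.
        R_xlen_t m = 0;
        for (R_xlen_t i = 0; i < n; i++)
            if (in[i] > 0) m++;

        // Pass 2: allocate exactly m elements and fill them.
        SEXP out = PROTECT(Rf_allocVector(INTSXP, m));
        int *res = INTEGER(out);
        for (R_xlen_t i = 0, j = 0; i < n; i++)
            if (in[i] > 0) res[j++] = in[i];

        UNPROTECT(1);
        return out;
    }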