Re: [Rd] Is it possible to shrink an R object in place?

Simon Urbanek Fri, 11 Apr 2014 13:35:36 -0700

On Apr 11, 2014, at 3:47 PM, Romain Francois <rom...@r-enthusiasts.com> wrote:


> Hello, 
> 
> I’ve been using shrinking in 
> https://github.com/hadley/dplyr/blob/master/inst/include/tools/ShrinkableVector.h
> 
> This defines a ShrinkableVector of some R type (INTSXP, ...) given the 
> maximum number of elements it will hold. Then, I reset with SETLENGTH when 
> needed. The constructor protects the SEXP, and the destructor restores the 
> original length before removing the protection. With this I only have to 
> allocate the data once, and I can make R believe a vector is of a different 
> size, as long as I restore the correct size eventually. 
> 

I like the destructor touch of restoring the size :) - that is neat.

But as I said, this is only useful in cases where you strip off a few elements; 
otherwise you're better off creating a copy because of the memory implications.
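
The restore-on-destruction idea can be sketched roughly as follows. This is not the dplyr code and it does not use R's headers: `FakeVector`, `ShrinkGuard` and `sum_reported` are hypothetical names, and `FakeVector` only stands in for an R vector whose reported length can differ from its allocation. In a real extension the length change would go through SETLENGTH on a protected SEXP, with the caveats about that function raised elsewhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an R vector (not R's SEXP): `data` is the real
// allocation, `length` is what gets reported to callers.
struct FakeVector {
    std::vector<int> data;   // full allocation, never resized here
    std::size_t length;      // reported length, may be smaller than data.size()

    explicit FakeVector(std::size_t n) : data(n), length(n) {}
};

// Lets the caller report a shorter length for a while and restores the
// original length when the guard goes out of scope.
class ShrinkGuard {
public:
    explicit ShrinkGuard(FakeVector& v) : v_(v), original_(v.length) {}
    ~ShrinkGuard() { v_.length = original_; }

    ShrinkGuard(const ShrinkGuard&) = delete;
    ShrinkGuard& operator=(const ShrinkGuard&) = delete;

    // Change the reported length; it must not exceed the allocation.
    void set_length(std::size_t n) {
        assert(n <= v_.data.size());
        v_.length = n;
    }

private:
    FakeVector& v_;
    std::size_t original_;
};

// Sum only the elements the vector currently reports.
inline long sum_reported(const FakeVector& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.length; ++i) total += v.data[i];
    return total;
}
```

Note that the allocation stays at its full size the whole time; only the reported length changes, which is why this only pays off when few elements are dropped.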

Cheers,
Simon


> Kevin, when you start using parallelism, you have to change the way you 
> approach the sequence of things that go on. In particular, it is less of a 
> problem to do a double pass, i.e. one pass to figure out the appropriate size 
> and one to handle part of the data. And guess what, there is lots of that to 
> come in the next versions of Rcpp11. 
> 
> Romain
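
A double pass of the kind described above might look like the sketch below. It is only an illustration under assumptions: `keep_positive_two_pass` and the `chunk` parameter are hypothetical, a `std::vector` stands in for the R result vector, and the chunk loops run serially here. Each chunk touches only its own count and its own output slice, so the two chunk loops are the parts that could be handed to threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Keep the positive values of `input`, allocating the output exactly once.
inline std::vector<int> keep_positive_two_pass(const std::vector<int>& input,
                                               std::size_t chunk) {
    assert(chunk > 0);
    const std::size_t n = input.size();
    const std::size_t n_chunks = (n + chunk - 1) / chunk;

    // Pass 1: per-chunk counts. Chunks are independent of each other.
    std::vector<std::size_t> counts(n_chunks, 0);
    for (std::size_t c = 0; c < n_chunks; ++c) {
        const std::size_t end = std::min(n, (c + 1) * chunk);
        for (std::size_t i = c * chunk; i < end; ++i) {
            if (input[i] > 0) ++counts[c];
        }
    }

    // Prefix sum: offsets[c] is where chunk c starts writing.
    std::vector<std::size_t> offsets(n_chunks + 1, 0);
    for (std::size_t c = 0; c < n_chunks; ++c) {
        offsets[c + 1] = offsets[c] + counts[c];
    }

    // Allocate the output once, at its final size.
    std::vector<int> out(offsets[n_chunks]);

    // Pass 2: each chunk fills its own disjoint slice of the output.
    for (std::size_t c = 0; c < n_chunks; ++c) {
        const std::size_t end = std::min(n, (c + 1) * chunk);
        std::size_t pos = offsets[c];
        for (std::size_t i = c * chunk; i < end; ++i) {
            if (input[i] > 0) out[pos++] = input[i];
        }
    }
    return out;
}
```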
> 
> On 11 Apr 2014, at 17:08, Simon Urbanek <simon.urba...@r-project.org> wrote:
> 
>> Kevin,
>> 
>> On Apr 10, 2014, at 4:57 PM, Kevin Ushey <kevinus...@gmail.com> wrote:
>> 
>>> Suppose I generate an integer vector with e.g.
>>> 
>>>  SEXP iv = PROTECT(allocVector(INTSXP, 100));
>>> 
>>> and later want to shrink the object, e.g.
>>> 
>>>  shrink(iv, 50);
>>> 
>>> would simply re-set the length to 50, and allow R to reclaim the
>>> memory that was previously used.
>>> 
>>> Is it possible to do this while respecting how R manages memory?
>>> 
>> 
>> The short answer is, no.
>> 
>> There are several problems with this, one of the main ones being that there 
>> is simply no way to release the "excess" memory, so the vector still has the 
>> full length in memory. There is the SETLENGTH() function, but it's not part 
>> of the API and it has been proposed for elimination because of the inherent 
>> issues it causes (a discrepancy between the allocated and the reported 
>> length).
>> 
>> 
>>> The motivation: there are many operations where the length of the
>>> output is not known ahead of time, and in such cases one typically
>>> uses a data structure that can grow efficiently. Unfortunately, IIUC
>>> SEXPRECs cannot do this; however, an alternative possibility would
>>> involve reserving extra memory, and then shrinking to fit after the
>>> operation is complete.
>>> 
>>> There have been some discussions previously that defaulted to answers
>>> of the form "you should probably just copy", e.g.
>>> https://stat.ethz.ch/pipermail/r-devel/2008-March/048593.html, but I
>>> wanted to ping and see if others had ideas, or if perhaps there was
>>> code in the R sources that might be relevant.
>>> 
>>> Another reason why this is interesting is due to C++11 and
>>> multi-threading: if I can pre-allocate SEXPs that will contain results
>>> in the main thread, and then fill these SEXPs asynchronously (without
>>> touching R, and hence not getting in the way of the GC or otherwise),
>>> I can then fill these SEXPs in place and shrink-to-fit after the
>>> computations have been completed. With C++11 support coming with R
>>> 3.1.0, functionality like this is very attractive.
>>> 
>> 
>> I don't see how this is related to the question: it has always been possible 
>> to fill SEXPs from parallel threads, and this is done routinely even in R 
>> itself (most commonly via OpenMP).
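
As a rough illustration of filling a preallocated buffer from parallel threads, here is a sketch under assumptions: `fill_squares` is a hypothetical function, and `out` is a plain pointer. In an R extension it would presumably be the data pointer of a vector allocated on the main thread (what INTEGER() returns), with no R API calls made inside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill a preallocated buffer. Each iteration writes only out[i], so the
// iterations are independent. When compiled with OpenMP enabled (for
// example -fopenmp) the loop runs in parallel; otherwise the pragma is
// ignored and the loop runs serially with the same result.
inline void fill_squares(int* out, std::size_t n) {
    assert(out != nullptr || n == 0);
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(n); ++i) {
        out[i] = static_cast<int>(i * i);
    }
}
```

The buffer is sized before the parallel region starts, so nothing is allocated or resized while the threads are running.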
>> 
>> 
>>> The obvious alternatives are to 1) determine the length of the output
>>> first and hence generate SEXPs of appropriate size right off the bat
>>> (potentially expensive), and 2) fill thread-safe containers and copy
>>> to an R object (definitely expensive).
>>> 
>> 
>> In most current OSes, it is impossible to shrink allocated memory in-place, 
>> so if you really don't know the size of the object, it will be copied 
>> anyway. As mentioned above, the only case where shrinking may work is if you 
>> only need to strip a few elements of a large vector so that keeping the same 
>> allocation has no significant effect.
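
When the final size is not known up front, the copy mentioned above can be sketched like this. The names `ExactBuffer` and `keep_even` are hypothetical, and a `unique_ptr` array stands in for the exactly sized result, which in an R extension would come from allocVector() on the main thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Result whose allocation holds exactly `size` elements.
struct ExactBuffer {
    std::unique_ptr<int[]> data;
    std::size_t size;
};

// Keep the even values of `input`. The output size is unknown in advance,
// so values are collected in a growable std::vector and then copied once
// into an exactly sized allocation.
inline ExactBuffer keep_even(const std::vector<int>& input) {
    std::vector<int> scratch;
    scratch.reserve(input.size());   // upper bound on the result size
    for (int x : input) {
        if (x % 2 == 0) scratch.push_back(x);
    }

    ExactBuffer out{std::unique_ptr<int[]>(new int[scratch.size()]),
                    scratch.size()};
    assert(out.size <= input.size());
    if (!scratch.empty()) {
        std::memcpy(out.data.get(), scratch.data(),
                    scratch.size() * sizeof(int));
    }
    return out;   // the oversized scratch buffer is freed on return
}
```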
>> 
>> Cheers,
>> Simon
>> 
>> 
>> 
>> 
>>> I am probably missing something subtle (or obvious) as to why this may
>>> not work, or be recommended, so I appreciate any comments.
>>> 
>>> Thanks,
>>> Kevin
>>> 
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>> 
