Re: [Rd] Holding a large number of SEXPs in C++

Simon Urbanek Mon, 03 Nov 2014 18:28:02 -0800

On Nov 3, 2014, at 5:34 PM, Simon Knapp <sleepingw...@gmail.com> wrote:


> Thanks again Simon. I had realised that R_NilValue didn't need protection... 
> I just thought it a clean way to make my initial call to PROTECT_WITH_INDEX 
> (which I can see now was not required since I didn't need the calls to 
> REPROTECT)... and I had not thought of appending to the tail.
> 
> One final question (and hopefully I don't get too badly burnt): I cannot find 
> R_PreserveObject/R_ReleaseObject or SETCDR mentioned in "Writing R 
> Extensions". Is there anywhere for a novice like myself to find a 'complete' 
> reference to R's useful macros and functions, or do I just have to read more 
> source?
> 

R_PreserveObject is mentioned in 5.9.1, but it's really just a fleeting 
mention. It's not used in typical packages, but it is used heavily whenever 
you're interfacing a foreign runtime system (language or library). What is or 
is not in the API can vary slightly depending on whom you ask, but the 
installed header files are essentially the candidate set.

Cheers,
Simon



> Thanks again for being so awesome,
> Simon
> 
> On Tue, Nov 4, 2014 at 12:47 AM, Simon Urbanek <simon.urba...@r-project.org> 
> wrote:
> 
> On Nov 2, 2014, at 10:55 PM, Simon Knapp <sleepingw...@gmail.com> wrote:
> 
> > Thanks Simon and sorry for taking so long to give this a go. I had thought 
> > of pair lists but got confused about how to protect the top level object 
> > only, as it seems that appending requires creating a new "top-level 
> > object". The following example seems to work (full example at 
> > https://gist.github.com/Sleepingwell/8588c5ee844ce0242d05). Is this the way 
> > you would do it (or at least 'a correct' way)?
> >
> 
> You can simply append to a pairlist, so you only need to protect the head. 
> Also note that R_NilValue is a constant (in the R sense, not the C sense) so it 
> doesn't need protection. I would write a generic pairlist builder something 
> like this:
> 
> SEXP head = R_NilValue, tail;
> 
> void append(SEXP x) {
>   if (head == R_NilValue)
>         R_PreserveObject(head = tail = CONS(x, R_NilValue));
>   else
>         tail = SETCDR(tail, CONS(x, R_NilValue));
> }
> 
> void destroy() {
>    if (head != R_NilValue)
>         R_ReleaseObject(head);
> }
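
To make the head/tail bookkeeping concrete outside of R, here is a minimal sketch in plain C++ that models the same pattern. It does not use R's headers, so everything in it is an assumption for illustration: `Cell`, `ListBuilder` and `to_vector` are hypothetical names, an `int` payload stands in for a SEXP, and ownership of the head through `std::unique_ptr` stands in for calling R_PreserveObject on the head cell only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an R cons cell: a value (CAR) and a link (CDR).
struct Cell {
    int car;
    std::unique_ptr<Cell> cdr;
    explicit Cell(int v) : car(v) {}
};

// Only `head_` owns the chain (the analogue of preserving just the head);
// `tail_` is a non-owning pointer used to append in constant time.
class ListBuilder {
public:
    ListBuilder() = default;

    // Unlink cells one at a time so that destroying a long chain
    // does not recurse once per cell.
    ~ListBuilder() {
        while (head_) head_ = std::move(head_->cdr);
    }

    void append(int x) {
        std::unique_ptr<Cell> cell(new Cell(x));
        Cell* raw = cell.get();
        if (!head_) {
            head_ = std::move(cell);       // first element becomes the head
        } else {
            assert(tail_ != nullptr);
            tail_->cdr = std::move(cell);  // link the new cell after the tail
        }
        tail_ = raw;
        ++size_;
    }

    std::size_t size() const { return size_; }

    // Copy to a contiguous vector once building is finished,
    // for the case where random access is needed afterwards.
    std::vector<int> to_vector() const {
        std::vector<int> out;
        out.reserve(size_);
        for (const Cell* c = head_.get(); c != nullptr; c = c->cdr.get())
            out.push_back(c->car);
        return out;
    }

private:
    std::unique_ptr<Cell> head_;
    Cell* tail_ = nullptr;
    std::size_t size_ = 0;
};
```

Each call to `append` touches only the tail, so the cost per element is constant however long the list gets, and nothing except the head needs to be held on to while the list grows.
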
> 
> Cheers,
> Simon
> 
> 
> >
> >
> > struct PolyHolder {
> >     PolyHolder(void) {
> >         PROTECT_WITH_INDEX(currentRegion = R_NilValue, &icr);
> >         PROTECT_WITH_INDEX(regions = R_NilValue, &ir);
> >     }
> >
> >     ~PolyHolder(void) {
> >         UNPROTECT(2);
> >     }
> >
> >     void notifyEndRegion(void) {
> > >         REPROTECT(regions = CONS(makePolygonsFromPairList(currentRegion), regions), ir);
> >         REPROTECT(currentRegion = R_NilValue, icr);
> >     }
> >
> >     template<typename Iter>
> >     void addSubPolygon(Iter b, Iter e) {
> > >         REPROTECT(currentRegion = CONS(makePolygon(b, e), currentRegion), icr);
> >     }
> >
> >     SEXP getPolygons(void) {
> >         return regions;
> >     }
> >
> > private:
> >     PROTECT_INDEX
> >         ir,
> >         icr;
> >
> >     SEXP
> >         currentRegion,
> >         regions;
> > };
> >
> >
> >
> > Thanks again,
> > Simon Knapp
> >
> >
> >
> > CONS(newPoly, creates a new object
> > On Sat, Oct 18, 2014 at 2:10 AM, Simon Urbanek 
> > <simon.urba...@r-project.org> wrote:
> >
> > On Oct 17, 2014, at 7:31 AM, Simon Knapp <sleepingw...@gmail.com> wrote:
> >
> > > Background:
> > > > I have an algorithm which produces a large number of small polygons (of the
> > > > spatial kind) which I would like to use within R using objects from sp. I
> > > > can't predict the exact number of polygons a priori, the polygons will be
> > > grouped into regions, and each region will be filled sequentially, so an
> > > appropriate C++ 'framework' (for the point of illustration) might be:
> > >
> > > typedef std::pair<double, double> Point;
> > > typedef std::vector<Point> Polygon;
> > > typedef std::vector<Polygon> Polygons;
> > > typedef std::vector<Polygons> Regions;
> > >
> > > struct Holder {
> > > >    void notifyNewRegion(void) {
> > >        regions.push_back(Polygons());
> > >    }
> > >
> > >    template<typename Iter>
> > >    void addSubPoly(Iter b, Iter e) {
> > >        regions.back().push_back(Polygon(b, e));
> > >    }
> > >
> > > private:
> > >    Regions regions;
> > > };
> > >
> > > > where the reference_type of Iter is convertible to Point. In practice I use
> > > > pointers in a couple of places to avoid resizing in push_back becoming too
> > > > expensive.
> > >
> > > To construct the corresponding sp::Polygon, sp::Polygons and
> > > sp::SpatialPolygons at the end of the algorithm, I iterate over the result
> > > turning each Polygon into a two column matrix and calling the C functions
> > > corresponding to the 'constructors' for these objects.
> > >
> > > This is all working fine, but I could cut my memory consumption in half if
> > > I could construct the sp::Polygon objects in addSubPoly, and the
> > > sp::Polygons objects in notifyNewRegion. My vector typedefs would then all
> > > be:
> > >
> > > typedef std::vector<SEXP>
> > >
> > >
> > >
> > >
> > > Question:
> > > > What I'm not sure about (and finally my question) is: I will have datasets
> > > > where I have more than 10,000 SEXPs in the Polygon and Polygons objects for
> > > > a single region, and possibly more than 10,000 regions, so how do I PROTECT
> > > > all those SEXPs (noting that the protection stack is limited to 10,000 and
> > > > bearing in mind that I don't know how many there will be before I start)?
> > >
> > > I am also interested in this just out of general curiosity.
> > >
> > >
> > >
> > >
> > > Thoughts:
> > >
> > > 1) I could create an environment and store the objects themselves in there
> > > while keeping pointers in the vectors, but am not sure if this would be
> > > that efficient (guidance would be appreciated), or
> > >
> > > > 2) Just keep them in R vectors and grow these myself (as push_back is doing
> > > > for me in the above), but that sounds like a pain and I'm not sure if the
> > > > objects or just the pointers would be copied when I reassigned things
> > > > (guidance would be appreciated again). Bear in mind that I keep pointers in
> > > > the vectors, but omitted that for the sake of clarity.
> > >
> > >
> > >
> > >
> > > Is there some other R type that would be suited to this, or a general
> > > approach?
> > >
> >
> > Lists in R (LISTSXP aka pairlists) are suited to appending (since that is 
> > fast and trivial) and sequential processing. The only issue is that 
> > pairlists are slow for random access. If you only want to load the polygons 
> > and finalize, then you can hold them in a pairlist and at the end copy to a 
> > generic vector (if random access is expected). DB applications typically 
> > use a hybrid approach: allocate vector blocks and keep them in pairlists, 
> > but that's probably overkill for your use (if you really cared about 
> > performance you wouldn't use sp objects for this ;))
> >
> > Note that you only have to protect the top-level object, so you don't need 
> > to protect the individual elements.
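
The hybrid approach mentioned above (fixed-size vector blocks kept in a list) can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not code against R's API: `BlockStore`, `block_count` and `flatten` are hypothetical names, `std::list` stands in for the pairlist, and `std::vector<double>` stands in for an R vector block.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical model of the hybrid scheme: values go into fixed-capacity
// blocks, and the blocks themselves are chained in a list.
class BlockStore {
public:
    explicit BlockStore(std::size_t block_size) : block_size_(block_size) {}

    void append(double x) {
        // Start a new block when there is none yet or the last one is full.
        if (blocks_.empty() || blocks_.back().size() >= block_size_) {
            blocks_.emplace_back();
            blocks_.back().reserve(block_size_);
        }
        blocks_.back().push_back(x);
        ++size_;
    }

    std::size_t size() const { return size_; }
    std::size_t block_count() const { return blocks_.size(); }

    // Final pass: copy every block, in order, into one contiguous vector.
    std::vector<double> flatten() const {
        std::vector<double> out;
        out.reserve(size_);
        for (const std::vector<double>& block : blocks_)
            out.insert(out.end(), block.begin(), block.end());
        return out;
    }

private:
    std::size_t block_size_;
    std::list<std::vector<double>> blocks_;
    std::size_t size_ = 0;
};
```

Appending never moves elements that are already stored, because a full block is left alone and a fresh one is chained on; the only full copy happens once, in `flatten`, when the total size is known.
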
> >
> > Cheers,
> > Simon
> >
> >
> > > Cheers and thanks in advance,
> > > Simon Knapp
> > >
> > >       [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> >
> 
> 

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
