Re: [Rd] Holding a large number of SEXPs in C++

Simon Urbanek Mon, 03 Nov 2014 05:50:17 -0800

On Nov 2, 2014, at 10:55 PM, Simon Knapp <sleepingw...@gmail.com> wrote:


> Thanks Simon and sorry for taking so long to give this a go. I had thought of 
> pair lists but got confused about how to protect the top level object only, 
> as it seems that appending requires creating a new "top-level object". The 
> following example seems to work (full example at 
> https://gist.github.com/Sleepingwell/8588c5ee844ce0242d05). Is this the way 
> you would do it (or at least 'a correct' way)?
> 

You can simply append to a pairlist, so you only need to protect the head. Also 
note that R_NilValue is a constant (in the R sense, not the C sense), so it 
doesn't need protection. I would write a generic pairlist builder, something 
like this:

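SEXP head = R_NilValue, tail = R_NilValue;

/* Append x; only the head cell is ever preserved. */
void append(SEXP x) {
    if (head == R_NilValue)
        R_PreserveObject(head = tail = CONS(x, R_NilValue));
    else
        tail = SETCDR(tail, CONS(x, R_NilValue));
}

/* Release the list and reset the builder so it can be reused. */
void destroy() {
    if (head != R_NilValue) {
        R_ReleaseObject(head);
        head = tail = R_NilValue;
    }
}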
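
For anyone who wants to try the head/tail idea without R headers, here is a 
minimal sketch in plain C++. It is only a model, not R API code: Cell, 
ListBuilder and toVector are made-up names, int stands in for a SEXP element, 
and nullptr plays the role of R_NilValue. It shows the O(1) append through the 
tail pointer and the single sequential pass that copies everything into 
random-access storage at the end.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pairlist cell (not the R API).
struct Cell {
    int value;   // plays the role of CAR
    Cell* next;  // plays the role of CDR; nullptr stands in for R_NilValue
};

// Keeps both ends of the list so that append never walks the list.
struct ListBuilder {
    Cell* head = nullptr;
    Cell* tail = nullptr;
    std::size_t length = 0;

    ListBuilder() = default;
    ListBuilder(const ListBuilder&) = delete;             // owns raw cells
    ListBuilder& operator=(const ListBuilder&) = delete;

    void append(int x) {
        Cell* cell = new Cell{x, nullptr};
        if (head == nullptr) {
            head = tail = cell;  // first cell: the only externally held one
        } else {
            tail->next = cell;   // link after the current tail
            tail = cell;         // and advance the tail
        }
        ++length;
    }

    // One sequential pass into contiguous, random-access storage.
    std::vector<int> toVector() const {
        std::vector<int> out;
        out.reserve(length);
        for (const Cell* c = head; c != nullptr; c = c->next)
            out.push_back(c->value);
        return out;
    }

    ~ListBuilder() {
        while (head != nullptr) {
            Cell* next = head->next;
            delete head;
            head = next;
        }
    }
};
```

In the real thing every cell other than the head is reachable from the head, 
which is why preserving the head alone is enough; in this model the same 
property is what lets the destructor free everything starting from head.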

Cheers,
Simon


> 
> 
> struct PolyHolder {
>     PolyHolder(void) {
>         PROTECT_WITH_INDEX(currentRegion = R_NilValue, &icr);
>         PROTECT_WITH_INDEX(regions = R_NilValue, &ir);
>     }
> 
>     ~PolyHolder(void) {
>         UNPROTECT(2);
>     }
> 
>     void notifyEndRegion(void) {
>         REPROTECT(regions = CONS(makePolygonsFromPairList(currentRegion),
>                                  regions), ir);
>         REPROTECT(currentRegion = R_NilValue, icr);
>     }
> 
>     template<typename Iter>
>     void addSubPolygon(Iter b, Iter e) {
>         REPROTECT(currentRegion = CONS(makePolygon(b, e), currentRegion),
>                   icr);
>     }
> 
>     SEXP getPolygons(void) {
>         return regions;
>     }
> 
> private:
>     PROTECT_INDEX
>         ir,
>         icr;
> 
>     SEXP
>         currentRegion,
>         regions;
> };
> 
> 
> 
> Thanks again,
> Simon Knapp
> 
> 
> 
> CONS(newPoly, creates a new object 
> On Sat, Oct 18, 2014 at 2:10 AM, Simon Urbanek <simon.urba...@r-project.org> 
> wrote:
> 
> On Oct 17, 2014, at 7:31 AM, Simon Knapp <sleepingw...@gmail.com> wrote:
> 
> > Background:
> > I have an algorithm which produces a large number of small polygons (of the
> > spatial kind) which I would like to use within R using objects from sp. I
> > can't predict the exact number of polygons a priori, the polygons will be
> > grouped into regions, and each region will be filled sequentially, so an
> > appropriate C++ 'framework' (for the point of illustration) might be:
> >
> > typedef std::pair<double, double> Point;
> > typedef std::vector<Point> Polygon;
> > typedef std::vector<Polygon> Polygons;
> > typedef std::vector<Polygons> Regions;
> >
> > struct Holder {
> >    void notifyNewRegion(void) {
> >        regions.push_back(Polygons());
> >    }
> >
> >    template<typename Iter>
> >    void addSubPoly(Iter b, Iter e) {
> >        regions.back().push_back(Polygon(b, e));
> >    }
> >
> > private:
> >    Regions regions;
> > };
> >
> > where the reference_type of Iter is convertible to Point. In practice I use
> > pointers in a couple of places to avoid resizing in push_back becoming too
> > expensive.
> >
> > To construct the corresponding sp::Polygon, sp::Polygons and
> > sp::SpatialPolygons at the end of the algorithm, I iterate over the result
> > turning each Polygon into a two column matrix and calling the C functions
> > corresponding to the 'constructors' for these objects.
> >
> > This is all working fine, but I could cut my memory consumption in half if
> > I could construct the sp::Polygon objects in addSubPoly, and the
> > sp::Polygons objects in notifyNewRegion. My vector typedefs would then all
> > be:
> >
> > typedef std::vector<SEXP>
> >
> >
> >
> >
> > Question:
> > What I'm not sure about (and finally my question) is: I will have datasets
> > where I have more than 10,000 SEXPs in the Polygon and Polygons objects for
> > a single region, and possibly more than 10,000 regions, so how do I PROTECT
> > all those SEXPs (noting that the protection stack is limited to 10,000 and
> > bearing in mind that I don't know how many there will be before I start)?
> >
> > I am also interested in this just out of general curiosity.
> >
> >
> >
> >
> > Thoughts:
> >
> > 1) I could create an environment and store the objects themselves in there
> > while keeping pointers in the vectors, but am not sure if this would be
> > that efficient (guidance would be appreciated), or
> >
> > 2) Just keep them in R vectors and grow these myself (as push_back is doing
> > for me in the above), but that sounds like a pain and I'm not sure if the
> > objects or just the pointers would be copied when I reassigned things
> > (guidance would be appreciated again). Bear in mind that I keep pointers in
> > the vectors, but omitted that for the sake of clarity.
> >
> >
> >
> >
> > Is there some other R type that would be suited to this, or a general
> > approach?
> >
> 
> Lists in R (LISTSXP aka pairlists) are suited to appending (since that is 
> fast and trivial) and sequential processing. The only issue is that pairlists 
> are slow for random access. If you only want to load the polygons and 
> finalize, then you can hold them in a pairlist and at the end copy to a 
> generic vector (if random access is expected). DB applications typically use 
> a hybrid approach: allocate vector blocks and keep them in pairlists, but 
> that's probably overkill for your use (if you really cared about 
> performance you wouldn't use sp objects for this ;))
> 
> Note that you only have to protect the top-level object, so you don't need to 
> protect the individual elements.
> 
> Cheers,
> Simon
> 
> 
> > Cheers and thanks in advance,
> > Simon Knapp
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> 
> 

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
