Thanks Simon, and sorry for taking so long to give this a go. I had thought of pairlists but got confused about how to protect only the top-level object, since appending seems to require creating a new "top-level object" each time. The following example seems to work (full example at https://gist.github.com/Sleepingwell/8588c5ee844ce0242d05). Is this the way you would do it (or at least 'a correct' way)?

    struct PolyHolder {
        PolyHolder(void) {
            // Both lists start empty; each gets its own protection-stack slot.
            PROTECT_WITH_INDEX(currentRegion = R_NilValue, &icr);
            PROTECT_WITH_INDEX(regions = R_NilValue, &ir);
        }

        ~PolyHolder(void) {
            // Pops the two entries pushed in the constructor; assumes the
            // protection stack is otherwise balanced by then.
            UNPROTECT(2);
        }

        void notifyEndRegion(void) {
            // Prepend the finished region, then start a fresh, empty one.
            REPROTECT(regions = CONS(makePolygonsFromPairList(currentRegion), regions), ir);
            REPROTECT(currentRegion = R_NilValue, icr);
        }

        template<typename Iter>
        void addSubPolygon(Iter b, Iter e) {
            // The new head replaces the old one in the same protected slot.
            REPROTECT(currentRegion = CONS(makePolygon(b, e), currentRegion), icr);
        }

        SEXP getPolygons(void) {
            return regions;
        }

    private:
        PROTECT_INDEX ir, icr;
        SEXP currentRegion, regions;
    };

Thanks again,
Simon Knapp

On Sat, Oct 18, 2014 at 2:10 AM, Simon Urbanek <simon.urba...@r-project.org> wrote:

> On Oct 17, 2014, at 7:31 AM, Simon Knapp <sleepingw...@gmail.com> wrote:
>
> > Background:
> > I have an algorithm which produces a large number of small polygons (of the
> > spatial kind) which I would like to use within R using objects from sp. I
> > can't predict the exact number of polygons a priori, the polygons will be
> > grouped into regions, and each region will be filled sequentially, so an
> > appropriate C++ 'framework' (for the point of illustration) might be:
> >
> > typedef std::pair<double, double> Point;
> > typedef std::vector<Point> Polygon;
> > typedef std::vector<Polygon> Polygons;
> > typedef std::vector<Polygons> Regions;
> >
> > struct Holder {
> >     void notifyNewRegion(void) {
> >         regions.push_back(Polygons());
> >     }
> >
> >     template<typename Iter>
> >     void addSubPoly(Iter b, Iter e) {
> >         regions.back().push_back(Polygon(b, e));
> >     }
> >
> > private:
> >     Regions regions;
> > };
> >
> > where the reference_type of Iter is convertible to Point. In practice I use
> > pointers in a couple of places to avoid resizing in push_back becoming too
> > expensive.
> >
> > To construct the corresponding sp::Polygon, sp::Polygons and
> > sp::SpatialPolygons at the end of the algorithm, I iterate over the result
> > turning each Polygon into a two column matrix and calling the C functions
> > corresponding to the 'constructors' for these objects.
> >
> > This is all working fine, but I could cut my memory consumption in half if
> > I could construct the sp::Polygon objects in addSubPoly, and the
> > sp::Polygons objects in notifyNewRegion. My vector typedefs would then all
> > be:
> >
> > typedef std::vector<SEXP>
> >
> > Question:
> > What I'm not sure about (and finally my question) is: I will have datasets
> > where I have more than 10,000 SEXPs in the Polygon and Polygons objects for
> > a single region, and possibly more than 10,000 regions, so how do I PROTECT
> > all those SEXPs (noting that the protection stack is limited to 10,000 and
> > bearing in mind that I don't know how many there will be before I start)?
> >
> > I am also interested in this just out of general curiosity.
> >
> > Thoughts:
> >
> > 1) I could create an environment and store the objects themselves in there
> > while keeping pointers in the vectors, but am not sure if this would be
> > that efficient (guidance would be appreciated), or
> >
> > 2) Just keep them in R vectors and grow these myself (as push_back is doing
> > for me in the above), but that sounds like a pain and I'm not sure if the
> > objects or just the pointers would be copied when I reassigned things
> > (guidance would be appreciated again). Bear in mind that I keep pointers in
> > the vectors, but omitted that for the sake of clarity.
> >
> > Is there some other R type that would be suited to this, or a general
> > approach?
>
> Lists in R (LISTSXP aka pairlists) are suited to appending (since that is
> fast and trivial) and sequential processing. The only issue is that
> pairlists are slow for random access.
> If you only want to load the polygons and finalize, then you can hold them
> in a pairlist and at the end copy to a generic vector (if random access is
> expected). DB applications typically use a hybrid approach - allocate vector
> blocks and keep them in pairlists, but that's probably overkill for your use
> (if you really cared about performance you wouldn't use sp objects for
> this ;))
>
> Note that you only have to protect the top-level object, so you don't need
> to protect the individual elements.
>
> Cheers,
> Simon
>
> > Cheers and thanks in advance,
> > Simon Knapp
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
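
One detail the PolyHolder example leaves implicit: CONS prepends, so regions and currentRegion end up in reverse insertion order, and the final copy to a generic vector is a natural place to restore it. The following is a minimal sketch of that prepend-then-copy pattern in plain C++, with no R API involved. ConsList, prepend and toVector are hypothetical names made up for illustration; they stand in for a pairlist, CONS and the copy step respectively.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an R pairlist (not R API): each node owns its
// successor, much as a CONS cell references its CDR.
template <typename T>
class ConsList {
    struct Node {
        T value;
        std::unique_ptr<Node> next;
    };

    std::unique_ptr<Node> head_;
    std::size_t size_ = 0;

public:
    ConsList() = default;

    // Unlink nodes one at a time so that destroying a very long list does
    // not recurse once per node.
    ~ConsList() {
        while (head_) {
            head_ = std::move(head_->next);
        }
    }

    // O(1) prepend, analogous to `list = CONS(x, list)`: the new node
    // becomes the head and the old head becomes its tail.
    void prepend(T value) {
        std::unique_ptr<Node> node(new Node{std::move(value), std::move(head_)});
        head_ = std::move(node);
        ++size_;
    }

    std::size_t size() const { return size_; }

    // Copy to a random-access container at the end. Walking from the head
    // visits the newest element first, so the result is reversed to recover
    // insertion order.
    std::vector<T> toVector() const {
        std::vector<T> out;
        out.reserve(size_);
        for (const Node* n = head_.get(); n != nullptr; n = n->next.get()) {
            out.push_back(n->value);
        }
        std::reverse(out.begin(), out.end());
        assert(out.size() == size_);
        return out;
    }
};
```

Prepending 1, 2, 3 and then calling toVector() gives the elements back in the order they were added. With R's API the same idea would presumably amount to walking the pairlist with CDR and filling a VECSXP from the last index down, but that variant is not shown or tested here.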