[Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
Hello, We are currently making lots of changes to Rcpp (see the open Rcpp mailing list if interested [1] in the details). We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but I was wondering if there was documentation about it. In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? Reading the source (below, from memory.c) I think not, but some confirmation would help. void R_PreserveObject(SEXP object) { R_PreciousList = CONS(object, R_PreciousList); } static SEXP RecursiveRelease(SEXP object, SEXP list) { if (!isNull(list)) { if (object == CAR(list)) return CDR(list); else CDR(list) = RecursiveRelease(object, CDR(list)); } return list; } void R_ReleaseObject(SEXP object) { R_PreciousList = RecursiveRelease(object, R_PreciousList); } I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. Thanks, Romain [1] http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] How x[, 'colname1'] is implemented?
On Fri, Jan 1, 2010 at 9:40 PM, Peng Yu wrote: > I'm not complaining that it is not documented. Yes but you didn't answer my question. When you ask a question on these (or any mailing lists) you should always say what efforts you've made to answer the question. I first looked in the help for '[' but didn't find anything there. If you had already done that and told us then I wouldn't have wasted my time. If you'd gone on to say you'd also looked in the source code, and in which files, then I wouldn't have wasted my time looking in those files. Note that this is all for your benefit as well because we'll be able to help you quicker! >> As Obi-wan Kenobi may have said in Star Wars: "Use the source, Luke!": > Could you explain what ns and nx represent? No, you need to take Obi-wan's advice (or Seth Falcon's, who got to this before me) and 'Use the source'. I would hope that since you have an understanding of what a hashing algorithm is that you should also have some understanding of programming, and even if not in C it's not too hard to figure out. You could make the effort to look in subscript.c and work it out for yourself. If you are very interested in the low-level working of R then you should compile R with debugging turned on, then run R in a debugger and set a breakpoint in the array subscript function. Then you can inspect the C variables when you run it. Barry __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] loess() crashes R on my system
Greetings and happy new year! I am in the process of converting some of the old S-PLUS scripts from Visualizing Data (Cleveland, 1993) into lattice. In fact, I did most of it several years ago, and at the time, all of the scripts that contained loess() worked fine. Tonight, I ran most of the scripts again, but every one that I tried with a loess() call crashed R. I tried it in two sessions, one with an existing .Rdata and in another directory without one. The call below, isolated from its script, is one of the culprits: with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2, parametric = "C", drop.square = "C", family="s")) In response to the Posting Guide, when I say 'crash' I mean that the system abnormally terminated R when this line or any other one that invokes loess() is run. I'm running the binary version of 2.10.1 on a Dell Studio 15 with Windows 7 Home Premium. This is the first time I've ever experienced something like this in the decade that I've used R. What's more surprising is that it occurred with such a common function. Since I haven't seen any postings of this type, it's probably something on my end, but I don't know what. Since it shut down the program, I felt compelled to report it. I am subscribed to r-help but not r-devel, so please cc any response to me. Thanks in advance for any assistance. Dennis Murphy > sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.10.1 [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] loess() crashes R on my system
Dennis Murphy wrote: Greetings and happy new year! I am in the process of converting some of the old S-PLUS scripts from Visualizing Data (Cleveland, 1993) into lattice. In fact, I did most of it several years ago, and at the time, all of the scripts that contained loess() worked fine. Tonight, I ran most of the scripts again, but every one that I tried with a loess() call crashed R. I tried it in two sessions, one with an existing .Rdata and in another directory without one. The call below, isolated from its script, is one of the culprits: with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2, parametric = "C", drop.square = "C", family="s")) I just did replicate(1000, with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2, parametric = "C", drop.square = "C", family="s"))) under Windows XP SP3 with R-2.10.1 / lattice_0.17-26 and it worked fine: Sorry, I cannot reproduce it. You forgot to report your lattice version below. Best, Uwe Ligges In response to the Posting Guide, when I say 'crash' I mean that the system abnormally terminated R when this line or any other one that invokes loess() is run. I'm running the binary version of 2.10.1 on a Dell Studio 15 with Windows 7 Home Premium. This is the first time I've ever experienced something like this in the decade that I've used R. What's more surprising is that it occurred with such a common function. Since I haven't seen any postings of this type, it's probably something on my end, but I don't know what. Since it shut down the program, I felt compelled to report it. I am subscribed to r-help but not r-devel, so please cc any response to me. Thanks in advance for any assistance. Dennis Murphy sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.10.1 [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] loess() crashes R on my system
Dennis, I have no problems with this on my DELL laptop (still running Vista) with R version 2.10.1 Patched (2009-12-21 r50796). -Peter Ehlers Dennis Murphy wrote: Greetings and happy new year! I am in the process of converting some of the old S-PLUS scripts from Visualizing Data (Cleveland, 1993) into lattice. In fact, I did most of it several years ago, and at the time, all of the scripts that contained loess() worked fine. Tonight, I ran most of the scripts again, but every one that I tried with a loess() call crashed R. I tried it in two sessions, one with an existing .Rdata and in another directory without one. The call below, isolated from its script, is one of the culprits: with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2, parametric = "C", drop.square = "C", family="s")) In response to the Posting Guide, when I say 'crash' I mean that the system abnormally terminated R when this line or any other one that invokes loess() is run. I'm running the binary version of 2.10.1 on a Dell Studio 15 with Windows 7 Home Premium. This is the first time I've ever experienced something like this in the decade that I've used R. What's more surprising is that it occurred with such a common function. Since I haven't seen any postings of this type, it's probably something on my end, but I don't know what. Since it shut down the program, I felt compelled to report it. I am subscribed to r-help but not r-devel, so please cc any response to me. Thanks in advance for any assistance. Dennis Murphy sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.10.1 [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Peter Ehlers University of Calgary 403.202.3921 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] loess() crashes R on my system
The posting guide also applies to R-devel: please provide a complete reproducible example and don't send HTML, as requested there. On Fri, 1 Jan 2010, Dennis Murphy wrote: Greetings and happy new year! I am in the process of converting some of the old S-PLUS scripts from Visualizing Data (Cleveland, 1993) into lattice. In fact, I did most of it several years ago, and at the time, all of the scripts that contained loess() worked fine. Tonight, I ran most of the scripts again, but every one that I tried with a loess() call crashed R. I tried it in two sessions, one with an existing .Rdata and in another directory without one. The call below, isolated from its script, is one of the culprits: with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2, parametric = "C", drop.square = "C", family="s")) What is 'ethanol'? Is it the data set in package lattice, for example? (If so, this works for me everywhere, including Windows XP.) In response to the Posting Guide, when I say 'crash' I mean that the system abnormally terminated R when this line or any other one that invokes loess() is run. I'm running the binary version of 2.10.1 on a Dell Studio 15 with Windows 7 Home Premium. If you mean any call to loess(), the problem is your machine (or at least your precise version of the OS) since example(loess) has been run on the binary version of R before distribution. The examples are always a good place to start testing.) Problems specific to the Windows binary (and we don't know that as yet but it seems likely) should be reported to the R-windows address. This is the first time I've ever experienced something like this in the decade that I've used R. What's more surprising is that it occurred with such a common function. Since I haven't seen any postings of this type, it's probably something on my end, but I don't know what. Since it shut down the program, I felt compelled to report it. You should be able to get a problem report under those circumstances, usually as an option on the dialog box -- and you will need to report that with the reproducible example. I am subscribed to r-help but not r-devel, so please cc any response to me. Thanks in advance for any assistance. Dennis Murphy sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.10.1 [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
[Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: Hello, We are currently making lots of changes to Rcpp (see the open Rcpp mailing list if interested [1] in the details). We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but I was wondering if there was documentation about it. The most precise technical documentation is in memory.c PROTECT is an alias for Rf_protect, itself an alias for SEXP protect(SEXP s); and uses a stack (R_PPStack) to store protected objects. In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? This depends on the requirements for your system. For example, in rpy2 I added a reference counting layer(*) because I wanted to allow several Python objects to share the same underlying R object, but that's not currently(*) counting how many times an object should be freed. (*: imperfect, but currently doing a very decent job - details upon request). That kind of feature could be provided by R's C-level API, since this could be seen of general use as well as give an opportunity to improve the performances of the R_PreservedObject/R_ReleaseObject duo whenever a lot of objects are protected and/or external code is protecting/releasing objects through a FIFO proxy. Reading the source (below, from memory.c) I think not, but some confirmation would help. I understand the code in memory.c like an object preserved twice needs to be freed twice: R_PreciousList is just a (linked) list, and "R_PreserveObject(object)" adds the object to the beginning of the list while "R_ReleaseObject(object)" removes the first "object" found from the list. void R_PreserveObject(SEXP object) { R_PreciousList = CONS(object, R_PreciousList); } static SEXP RecursiveRelease(SEXP object, SEXP list) { if (!isNull(list)) { if (object == CAR(list)) return CDR(list); else CDR(list) = RecursiveRelease(object, CDR(list)); } return list; } void R_ReleaseObject(SEXP object) { R_PreciousList = RecursiveRelease(object, R_PreciousList); } I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
Thanks. On 01/02/2010 05:36 PM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: Hello, We are currently making lots of changes to Rcpp (see the open Rcpp mailing list if interested [1] in the details). We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but I was wondering if there was documentation about it. The most precise technical documentation is in memory.c PROTECT is an alias for Rf_protect, itself an alias for SEXP protect(SEXP s); and uses a stack (R_PPStack) to store protected objects. In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? This depends on the requirements for your system. We wrap up SEXP into a C++ class that in particular manages itself preserving and releasing the object to garbage collection. For example, in rpy2 I added a reference counting layer(*) because I wanted to allow several Python objects to share the same underlying R object, but that's not currently(*) counting how many times an object should be freed. (*: imperfect, but currently doing a very decent job - details upon request). That kind of feature could be provided by R's C-level API, since this could be seen of general use as well as give an opportunity to improve the performances of the R_PreservedObject/R_ReleaseObject duo whenever a lot of objects are protected and/or external code is protecting/releasing objects through a FIFO proxy. Reading the source (below, from memory.c) I think not, but some confirmation would help. I understand the code in memory.c like an object preserved twice needs to be freed twice: R_PreciousList is just a (linked) list, and "R_PreserveObject(object)" adds the object to the beginning of the list while "R_ReleaseObject(object)" removes the first "object" found from the list. That's what I understand of it too. It fits nicely into C++ assignment operator and copy constructor. void R_PreserveObject(SEXP object) { R_PreciousList = CONS(object, R_PreciousList); } static SEXP RecursiveRelease(SEXP object, SEXP list) { if (!isNull(list)) { if (object == CAR(list)) return CDR(list); else CDR(list) = RecursiveRelease(object, CDR(list)); } return list; } void R_ReleaseObject(SEXP object) { R_PreciousList = RecursiveRelease(object, R_PreciousList); } I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: Hello, We are currently making lots of changes to Rcpp (see the open Rcpp mailing list if interested [1] in the details). We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but I was wondering if there was documentation about it. The most precise technical documentation is in memory.c PROTECT is an alias for Rf_protect, itself an alias for SEXP protect(SEXP s); and uses a stack (R_PPStack) to store protected objects. In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? This depends on the requirements for your system. For example, in rpy2 I added a reference counting layer(*) because I wanted to allow several Python objects to share the same underlying R object, but that's not currently(*) counting how many times an object should be freed. (*: imperfect, but currently doing a very decent job - details upon request). That kind of feature could be provided by R's C-level API, since this could be seen of general use as well as give an opportunity to improve the performances of the R_PreservedObject/R_ReleaseObject duo whenever a lot of objects are protected and/or external code is protecting/releasing objects through a FIFO proxy. Reading the source (below, from memory.c) I think not, but some confirmation would help. I understand the code in memory.c like an object preserved twice needs to be freed twice: R_PreciousList is just a (linked) list, and "R_PreserveObject(object)" adds the object to the beginning of the list while "R_ReleaseObject(object)" removes the first "object" found from the list. void R_PreserveObject(SEXP object) { R_PreciousList = CONS(object, R_PreciousList); } static SEXP RecursiveRelease(SEXP object, SEXP list) { if (!isNull(list)) { if (object == CAR(list)) return CDR(list); else CDR(list) = RecursiveRelease(object, CDR(list)); } return list; } void R_ReleaseObject(SEXP object) { R_PreciousList = RecursiveRelease(object, R_PreciousList); } I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Duncan Murdoch HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
On Jan 2, 2010, at 5:07 AM, Romain Francois wrote: > Hello, > > We are currently making lots of changes to Rcpp (see the open Rcpp mailing > list if interested [1] in the details). > > We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage > collection instead of the PROTECT/UNPROTECT dance. This seems to work well, > but I was wondering if there was documentation about it. > I don't think so - the only documentation is the comment in the source. > In particular, if we preserve the same SEXP twice (or more), should we > implement some sort of reference counting ? > Preserve/Release are for managing objects that are supposed to survive past the call and are not tied to any other R object. PROTECT/UNPROTECT are for temporary preservation within a call. Although you're right that Preserve/Release is effectively implemented as a stack at the moment it is not stated explicitly anywhere (this goes all the way back to R 0.64 so chances are that only Ross can comment..). However, for practical purposes it would be potentially dangerous to have it work like a flag because you can simply never know whether the same object was not already registered by some other code. > Reading the source (below, from memory.c) I think not, but some confirmation > would help. > > void R_PreserveObject(SEXP object) > { >R_PreciousList = CONS(object, R_PreciousList); > } > > static SEXP RecursiveRelease(SEXP object, SEXP list) > { >if (!isNull(list)) { > if (object == CAR(list)) > return CDR(list); > else > CDR(list) = RecursiveRelease(object, CDR(list)); >} >return list; > } > > void R_ReleaseObject(SEXP object) > { >R_PreciousList = RecursiveRelease(object, R_PreciousList); > } > > > I'd also be interested if there is some ideas on the relative efficiency of > the preserve/release mechanism compared to PROTECT/UNPROTECT. > PROTECT/UNPROTECT is more efficient because all it does is a pointer assignment -- Preserve has to allocate new node and fill it with all parts. On release the extra node is still floating in the GC pool etc. Normally there is not really a question of choice - within a call you want to use PROTET/UNPROTECT and for anything else you simply cannot use it so you have to use Preserve/Release. As a side note Preserve/Release is merely a convenience call, it is often more efficient to simply assign the object to another object you have control of (which is all Preserve really does). Cheers, Simon __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. > I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Interesting idea, this would let one perform his/her own bookkeeping on the list/environment. How is R garbage collection checking contained objects ? (I am thinking performances here, and may be cyclic references). L. Duncan Murdoch HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
On 01/02/2010 06:01 PM, Simon Urbanek wrote: On Jan 2, 2010, at 5:07 AM, Romain Francois wrote: Hello, We are currently making lots of changes to Rcpp (see the open Rcpp mailing list if interested [1] in the details). We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but I was wondering if there was documentation about it. I don't think so - the only documentation is the comment in the source. Fair enough. FWIW, some mention of it in the R-ints or R-exts could be valuable. In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? Preserve/Release are for managing objects that are supposed to survive past the call and are not tied to any other R object. PROTECT/UNPROTECT are for temporary preservation within a call. Although you're right that Preserve/Release is effectively implemented as a stack at the moment it is not stated explicitly anywhere (this goes all the way back to R 0.64 so chances are that only Ross can comment..). However, for practical purposes it would be potentially dangerous to have it work like a flag because you can simply never know whether the same object was not already registered by some other code. Reading the source (below, from memory.c) I think not, but some confirmation would help. void R_PreserveObject(SEXP object) { R_PreciousList = CONS(object, R_PreciousList); } static SEXP RecursiveRelease(SEXP object, SEXP list) { if (!isNull(list)) { if (object == CAR(list)) return CDR(list); else CDR(list) = RecursiveRelease(object, CDR(list)); } return list; } void R_ReleaseObject(SEXP object) { R_PreciousList = RecursiveRelease(object, R_PreciousList); } I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is more efficient because all it does is a pointer assignment -- Preserve has to allocate new node and fill it with all parts. On release the extra node is still floating in the GC pool etc. Normally there is not really a question of choice - within a call you want to use PROTET/UNPROTECT and for anything else you simply cannot use it so you have to use Preserve/Release. As a side note Preserve/Release is merely a convenience call, it is often more efficient to simply assign the object to another object you have control of (which is all Preserve really does). Cheers, Simon Thanks for this advice. This will make it easier to do the bookkeeping of how many objects we preserve, etc ... Romain -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 01/02/2010 05:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: Hello, We are currently making lots of changes to Rcpp (see the open Rcpp mailing list if interested [1] in the details). We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but I was wondering if there was documentation about it. The most precise technical documentation is in memory.c PROTECT is an alias for Rf_protect, itself an alias for SEXP protect(SEXP s); and uses a stack (R_PPStack) to store protected objects. In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? This depends on the requirements for your system. For example, in rpy2 I added a reference counting layer(*) because I wanted to allow several Python objects to share the same underlying R object, but that's not currently(*) counting how many times an object should be freed. (*: imperfect, but currently doing a very decent job - details upon request). That kind of feature could be provided by R's C-level API, since this could be seen of general use as well as give an opportunity to improve the performances of the R_PreservedObject/R_ReleaseObject duo whenever a lot of objects are protected and/or external code is protecting/releasing objects through a FIFO proxy. Reading the source (below, from memory.c) I think not, but some confirmation would help. I understand the code in memory.c like an object preserved twice needs to be freed twice: R_PreciousList is just a (linked) list, and "R_PreserveObject(object)" adds the object to the beginning of the list while "R_ReleaseObject(object)" removes the first "object" found from the list. void R_PreserveObject(SEXP object) { R_PreciousList = CONS(object, R_PreciousList); } static SEXP RecursiveRelease(SEXP object, SEXP list) { if (!isNull(list)) { if (object == CAR(list)) return CDR(list); else CDR(list) = RecursiveRelease(object, CDR(list)); } return list; } void R_ReleaseObject(SEXP object) { R_PreciousList = RecursiveRelease(object, R_PreciousList); } I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. Thanks; I've used those when I played with the parser's code (for highlight). It felt slightly harder to use than preserve/release since you have to maintain the protect index, etc ... I think I'll go as you say below, just maintain my own precious list, the way Preserve/Release does it. Romain I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Duncan Murdoch -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
On 1/2/10 5:50 PM, Romain Francois wrote: Thanks. On 01/02/2010 05:36 PM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? This depends on the requirements for your system. We wrap up SEXP into a C++ class that in particular manages itself preserving and releasing the object to garbage collection. If you do not allow several C++ instances to share the same SEXP, you should not need reference counting. In the case of rpy2, this was made necessary for allowing inheritance (as nested calls to constructors can cause a transient sharing of the given SEXP across several Python objects). (...) L. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: > On 1/2/10 5:56 PM, Duncan Murdoch wrote: >> On 02/01/2010 11:36 AM, Laurent Gautier wrote: >>> [Disclaimer: what is below reflects my understanding from reading the >>> R source, others will correct where deemed necessary] >>> >>> On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: > > (...) > I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. >>> >>> PROTECT/UNPROTECT is trading granularity for speed. It is a stack with >>> only tow operations possible: >>> - push 1 object into the stack >>> - pull (unprotect) N last objects from the stack >> >> UNPROTECT_PTR is also possible, which does a linear search through the >> stack and unprotects something possibly deep within it. There is also >> REPROTECT which allows you to replace an entry within the stack. > > >> I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease >> because it doesn't use so much stack space when it needs to go deep into >> the stack to release, but it is possible the compiler recognizes the >> tail recursion and RecursiveRelease is implemented efficiently. In that >> case it could be more efficient than UNPROTECT_PTR, which has to move >> all the other entries down to fill the newly vacated space. Really the >> only reliable way to answer efficiency questions like this is to try >> both ways and see which works better in your application. > > Thanks. I did not know about UNPROTECT_PTR. > I had concerns over the stack usage, but so far it did not prove too much of > a problem. Still, why isn't the tail recursion explicitly made an iteration > then ? This would take the "may be the compiler figures it out, may be not" > variable out of the equation. > Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). >> Another possibility is to maintain your own list or environment of >> objects, and just protect/preserve the list as a whole. > > Interesting idea, this would let one perform his/her own bookkeeping on the > list/environment. How is R garbage collection checking contained objects ? (I > am thinking performances here, and may be cyclic references). > You don't really want to care because the GC is global for all objects (and cycles are supported by the GC in R) - so whether you keep it yourself or Preserve/Release is practically irrelevant (the protection stack is handled separately). As for keeping your own list -- if you really care about performance that much (to be honest in vast majority of cases it doesn't matter) you have to be careful how you implement it. Technically the fastest way is preallocated generic vector but it really depends on how you deal with the access so you can easily perform worse than Preserve/Release if you're not careful. As a side note - the best way (IMHO) to deal with all those issues is to use external pointers because a) you get very efficient C finalizers b) you can directly (and very efficiently) tie lifespan of other objects to the same SEXP and c) as guardians they can nicely track other objects that hold them. Cheers, Simon > > > L. > > >> Duncan Murdoch >> >>> >>> >>> HTH, >>> >>> >>> L. >>> >>> >>> >>> Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex >>> >>> __ >>> R-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. Duncan Murdoch Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Interesting idea, this would let one perform his/her own bookkeeping on the list/environment. How is R garbage collection checking contained objects ? (I am thinking performances here, and may be cyclic references). You don't really want to care because the GC is global for all objects (and cycles are supported by the GC in R) - so whether you keep it yourself or Preserve/Release is practically irrelevant (the protection stack is handled separately). As for keeping your own list -- if you really care about performance that much (to be honest in vast majority of cases it doesn't matter) you have to be careful how you implement it. Technically the fastest way is preallocated generic vector but it really depends on how you deal with the access so you can easily perform worse than Preserve/Release if you're not careful. As a side note - the best way (IMHO) to deal with all those issues is to use external pointers because a) you get very efficient C finalizers b) you can directly (and very efficiently) tie lifespan of other objects to the same SEXP and c) as guardians they can nicely track other objects that hold them. Cheers, Simon L. Duncan Murdoch HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On Sat, 2 Jan 2010, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) I believe current gcc -O2 does optimize tail recursive calls (more generally sibling calls) but RecursiveRelease isn't written as a tail recursion -- the return vaule is used in an assignment. In any case it would probably make more sense to rewrite RecursiveRelease to not use recursion at all, but I'm not motivated to do that at this point. luke Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. Duncan Murdoch Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Interesting idea, this would let one perform his/her own bookkeeping on the list/environment. How is R garbage collection checking contained objects ? (I am thinking performances here, and may be cyclic references). You don't really want to care because the GC is global for all objects (and cycles are supported by the GC in R) - so whether you keep it yourself or Preserve/Release is practically irrelevant (the protection stack is handled separately). As for keeping your own list -- if you really care about performance that much (to be honest in vast majority of cases it doesn't matter) you have to be careful how you implement it. Technically the fastest way is preallocated generic vector but it really depends on how you deal with the access so you can easily perform worse than Preserve/Release if you're not careful. As a side note - the best way (IMHO) to deal with all those issues is to use external pointers because a) you get very efficient C finalizers b) you can directly (and very efficiently) tie lifespan of other objects to the same SEXP and c) as guardians they can nicely track other objects that hold them. Cheers, Simon L. Duncan Murdoch HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/m
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. Yes, I was referring to RecursiveRelease. Sorry if this was not clear. What are the chances for a patch to be accepted ? At first sight(*), making that tail recursion an iterative function is not a major undertaking, and reviewing the patch be fairly straightforward... but I can always use that time otherwise if the answer to the question is "nil". L. Duncan Murdoch Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Interesting idea, this would let one perform his/her own bookkeeping on the list/environment. How is R garbage collection checking contained objects ? (I am thinking performances here, and may be cyclic references). You don't really want to care because the GC is global for all objects (and cycles are supported by the GC in R) - so whether you keep it yourself or Preserve/Release is practically irrelevant (the protection stack is handled separately). As for keeping your own list -- if you really care about performance that much (to be honest in vast majority of cases it doesn't matter) you have to be careful how you implement it. Technically the fastest way is preallocated generic vector but it really depends on how you deal with the access so you can easily perform worse than Preserve/Release if you're not careful. As a side note - the best way (IMHO) to deal with all those issues is to use external pointers because a) you get very efficient C finalizers b) you can directly (and very efficiently) tie lifespan of other objects to the same SEXP and c) as guardians they can nicely track other objects that hold them. Cheers, Simon L. Duncan Murdoch HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 1/2/10 8:28 PM, Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Interesting idea, this would let one perform his/her own bookkeeping on the list/environment. How is R garbage collection checking contained objects ? (I am thinking performances here, and may be cyclic references). You don't really want to care because the GC is global for all objects (and cycles are supported by the GC in R) - so whether you keep it yourself or Preserve/Release is practically irrelevant (the protection stack is handled separately). I guess that I'll have to know in order to understand that I don't really want to care. ;-) The garbage collector must somehow know if an object is available for collection (and will have to check whether an object is PROTECTed or not, or Preserved or not). I suppose that upon being called the garbage collector will first look into the PROTECTed and Preserved objects, mark them as unavailble for collection, then recursively mark objects contained in them. As for keeping your own list -- if you really care about performance that much (to be honest in vast majority of cases it doesn't matter) you have to be careful how you implement it. Technically the fastest way is preallocated generic vector but it really depends on how you deal with the access so you can easily perform worse than Preserve/Release if you're not careful. Releasing being of linear complexity, having few thousands of Preserved objects not being anticipated as an extraordinary situation, and Preserve/Release cycles being quite frequent, I start minding a bit about the performance. Keeping my own list would let me experiment with various strategies (and eventually offer As a side note - the best way (IMHO) to deal with all those issues is to use external pointers because a) you get very efficient C finalizers b) you can directly (and very efficiently) tie lifespan of other objects to the same SEXP and c) as guardians they can nicely track other objects that hold them. Thanks. I am not certain to follow everything. Are you suggesting that rather than Preserve-ing/Release-ing a list/environment that would act as a guardian for several objects, one should use an external pointer (to an arbitrary C pointer) ? In that case, how does one indicate that an external pointer acts as a container ? Or are you suggesting that rather than Preserve-in/Release-ing R objects one should use an external pointer acting as a proxy for a SEXP (argument "prot" in R_MakeExternalPtr(void *p, SEXP tag, SEXP prot) ) ? (but in that case the external pointer will itself have to go through Preserve/Release cycles...) Cheers, Laurent Cheers, Simon L. Duncan Murdoch HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects `- http://tr.im/HlX9 : new package : bibtex __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
Romain, Is the use of UNPROTECT_PTR discouraged? I wonder why you haven't considered using it instead. I have a similar project that uses a ref counting scheme and handles the deletion of the shared object with UNPROTECT_PTR. This method has been working fine, but if there are reasons I should not be using it, I would certainly like to know. >From memory.c: /* "unprotect_ptr" remove pointer from somewhere in R_PPStack */ The code is very simple, it just walks backwards down the list until it hits your object. Unless you are adding an obscene number of items to the protection stack in a given call, finding your object should be very fast. My example is here: http://github.com/armstrtw/rabstraction/blob/master/R.backend.hpp template Rbackend::~Rbackend() { if(release_data_) { if(R_object_!=R_NilValue) { UNPROTECT_PTR(R_object_); } } } I still think all these Rcpp projects should come together some day. Dirk and I talked about combining about two years ago, but "real" work got it the way. I'll ping you off list for a follow up. -Whit On Sat, Jan 2, 2010 at 12:24 PM, Romain Francois wrote: > On 01/02/2010 06:01 PM, Simon Urbanek wrote: >> >> >> On Jan 2, 2010, at 5:07 AM, Romain Francois wrote: >> >>> Hello, >>> >>> We are currently making lots of changes to Rcpp (see the open Rcpp >>> mailing list if interested [1] in the details). >>> >>> We are now using [2] R_PreserveObject and R_ReleaseObject to manage >>> garbage collection instead of the PROTECT/UNPROTECT dance. This seems to >>> work well, but I was wondering if there was documentation about it. >>> >> >> I don't think so - the only documentation is the comment in the source. > > > Fair enough. FWIW, some mention of it in the R-ints or R-exts could be > valuable. > > >>> In particular, if we preserve the same SEXP twice (or more), should we >>> implement some sort of reference counting ? >>> >> >> Preserve/Release are for managing objects that are supposed to survive >> past the call and are not tied to any other R object. PROTECT/UNPROTECT are >> for temporary preservation within a call. >> >> Although you're right that Preserve/Release is effectively implemented as >> a stack at the moment it is not stated explicitly anywhere (this goes all >> the way back to R 0.64 so chances are that only Ross can comment..). >> However, for practical purposes it would be potentially dangerous to have it >> work like a flag because you can simply never know whether the same object >> was not already registered by some other code. >> >> >>> Reading the source (below, from memory.c) I think not, but some >>> confirmation would help. >>> >>> void R_PreserveObject(SEXP object) >>> { >>> R_PreciousList = CONS(object, R_PreciousList); >>> } >>> >>> static SEXP RecursiveRelease(SEXP object, SEXP list) >>> { >>> if (!isNull(list)) { >>> if (object == CAR(list)) >>> return CDR(list); >>> else >>> CDR(list) = RecursiveRelease(object, CDR(list)); >>> } >>> return list; >>> } >>> >>> void R_ReleaseObject(SEXP object) >>> { >>> R_PreciousList = RecursiveRelease(object, R_PreciousList); >>> } >>> >>> >>> I'd also be interested if there is some ideas on the relative efficiency >>> of the preserve/release mechanism compared to PROTECT/UNPROTECT. >>> >> >> PROTECT/UNPROTECT is more efficient because all it does is a pointer >> assignment -- Preserve has to allocate new node and fill it with all parts. >> On release the extra node is still floating in the GC pool etc. >> >> Normally there is not really a question of choice - within a call you want >> to use PROTET/UNPROTECT and for anything else you simply cannot use it so >> you have to use Preserve/Release. As a side note Preserve/Release is merely >> a convenience call, it is often more efficient to simply assign the object >> to another object you have control of (which is all Preserve really does). >> >> Cheers, >> Simon > > Thanks for this advice. This will make it easier to do the bookkeeping of > how many objects we preserve, etc ... > > Romain > > -- > Romain Francois > Professional R Enthusiast > +33(0) 6 28 91 30 30 > http://romainfrancois.blog.free.fr > |- http://tr.im/IW9B : C++ exceptions at the R level > |- http://tr.im/IlMh : CPP package: exposing C++ objects > `- http://tr.im/HlX9 : new package : bibtex > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
On 2 January 2010 at 16:26, Whit Armstrong wrote: | I still think all these Rcpp projects should come together some day. | Dirk and I talked about combining about two years ago, but "real" work | got it the way. I'll ping you off list for a follow up. Have a look at the SVN archive for Rcpp on R-Forge and at Rcpp 0.7.1 which should be on CRAN "soon" (within days). You may like what Romain added as class RObject; there is more coming for environments etc, and external pointers already got added. That said, we are trying to maintain the fine balance of keeping the old interface of Rcpp maintained (as it is being used in a number of places, and as I feel quite strongly about maintaining published interfaces) yet add a few 'interesting' things which Romain had been pushing real well for our use in RProtoBuf and other places. There is also a mailing list for Rcpp-dev (with the usual subscribe-to-post) tricks. Dirk -- Three out of two people have difficulties with fractions. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
On 01/02/2010 10:26 PM, Whit Armstrong wrote: Romain, Is the use of UNPROTECT_PTR discouraged? I wonder why you haven't considered using it instead. that's not what I said. It just felt not as easy to use as just grab a SEXP and say R_PreserveObject( x ) and then later R_ReleaseObject. I did not mean to discourage people from using it. It is probably very stable indeed since it is used all over the place in the parser, and since the parser is probably the most used code in the R source tree, we would know if it did not work properly. I have a similar project that uses a ref counting scheme and handles the deletion of the shared object with UNPROTECT_PTR. This method has been working fine, but if there are reasons I should not be using it, I would certainly like to know. From memory.c: /* "unprotect_ptr" remove pointer from somewhere in R_PPStack */ The code is very simple, it just walks backwards down the list until it hits your object. Unless you are adding an obscene number of items to the protection stack in a given call, finding your object should be very fast. My example is here: http://github.com/armstrtw/rabstraction/blob/master/R.backend.hpp > template Rbackend::~Rbackend() { if(release_data_) { if(R_object_!=R_NilValue) { UNPROTECT_PTR(R_object_); } } } Thanks. I'll have a look. I still think all these Rcpp projects should come together some day. Dirk and I talked about combining about two years ago, but "real" work got it the way. I'll ping you off list for a follow up. Dirk is proving very cooperative and the "Rcpp" package is under a lot of changes currently in its new Rcpp namespace, while keeping the current interface. Our main class is Rcpp::RObject that wraps any SEXP into a C++ object that basically R_PreserveObject when it is created, and R_ReleaseObject when the object is destroyed. deriving from "RObject", we have "Environment" to deal specifically with environments (ENVSXP) , "Symbol" to deal with symbols (SYMSXP), "Language" for calls (LANGSXP) and the template XPTr for external pointers. The rest (vectors, functions, ...) will follow. I'd suggest using the Rcpp mailing list for a follow up. https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -Whit On Sat, Jan 2, 2010 at 12:24 PM, Romain Francois wrote: On 01/02/2010 06:01 PM, Simon Urbanek wrote: On Jan 2, 2010, at 5:07 AM, Romain Francois wrote: Hello, We are currently making lots of changes to Rcpp (see the open Rcpp mailing list if interested [1] in the details). We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but I was wondering if there was documentation about it. I don't think so - the only documentation is the comment in the source. Fair enough. FWIW, some mention of it in the R-ints or R-exts could be valuable. In particular, if we preserve the same SEXP twice (or more), should we implement some sort of reference counting ? Preserve/Release are for managing objects that are supposed to survive past the call and are not tied to any other R object. PROTECT/UNPROTECT are for temporary preservation within a call. Although you're right that Preserve/Release is effectively implemented as a stack at the moment it is not stated explicitly anywhere (this goes all the way back to R 0.64 so chances are that only Ross can comment..). However, for practical purposes it would be potentially dangerous to have it work like a flag because you can simply never know whether the same object was not already registered by some other code. Reading the source (below, from memory.c) I think not, but some confirmation would help. void R_PreserveObject(SEXP object) { R_PreciousList = CONS(object, R_PreciousList); } static SEXP RecursiveRelease(SEXP object, SEXP list) { if (!isNull(list)) { if (object == CAR(list)) return CDR(list); else CDR(list) = RecursiveRelease(object, CDR(list)); } return list; } void R_ReleaseObject(SEXP object) { R_PreciousList = RecursiveRelease(object, R_PreciousList); } I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is more efficient because all it does is a pointer assignment -- Preserve has to allocate new node and fill it with all parts. On release the extra node is still floating in the GC pool etc. Normally there is not really a question of choice - within a call you want to use PROTET/UNPROTECT and for anything else you simply cannot use it so you have to use Preserve/Release. As a side note Preserve/Release is merely a convenience call, it is often more efficient to simply assign the object to another object you have control of (which is all Preserve really does). Cheers, Simon Thanks for this advice. Thi
Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?
Romain, > that's not what I said. It just felt not as easy to use as just grab a SEXP > and say R_PreserveObject( x ) and then later R_ReleaseObject. Didn't mean to suggest that you said that. I was asking the list whether it's discouraged. Will follow up on the Rcpp list. -Whit __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] readCitationFile encoding
On Thu, Dec 31, 2009 at 02:50:00PM -0500, Simon Urbanek wrote: > > On Dec 31, 2009, at 10:05 AM, Jens Elkner wrote: > > > OT: Wondering, what does "recommended portable encoding names" in this > > context mean (e.g. AFAIK 'latin1' is not a standardized name ...). > > It is in R -- since no standard name for that encoding exists in general R > maps "latin1" to whatever the OS expects this to be -- see ?Encoding. Ahh - ok. So since our staff, student, guests are using all kind of OS (i.e. win, solaris, linux, macosx) it is better to recommend ISO8859-15/UTF-8 instead of latin*, understood. > > Ahh - good hint - help("check") wasn't really fruitful. However, > > > > elkner.idev2 build/R-2.10.1/bin > ./check --help > > Can't locate File/Copy/Recursive.pm in @INC (@INC contains: ... > > > > So still need to do a little bit tweaking ... > > > > .. rather reading the docs ;). Hmm - think I already skimmed a lot of the onine docs. But at least wrt. check everything is for my taste quite shallow ... ;-( > It's R CMD check --help (since we're on R-devel knowledge of basics is > assumed). Well - it doesn't tell the details as well... BTW: the bad infinite recursion fails for 2.10.1 2009-12-14 patched (but not in the unpatched version): running code in 'ok-errors.R' ...*** Error code 1 The following command caused the error: LANGUAGE=en LC_ALL=C SRCDIR=. R_DEFAULT_PACKAGES= ../bin/R --vanilla < ok-errors.R > ok-errors.Rout 2>&1 || (mv ok-errors.Rout ok-errors.Rout.fail && exit 1) make: Fatal error: Command failed for target `ok-errors.Rout' Current working directory /export/scratch/elkner/build/R-2.10.1/tests *** Error code 1 Without optimization > ## bad infinite recursion / on.exit / ... interactions > bar <- function() 1+1 > foo <- function() { on.exit(bar()); foo() } > foo() # now simple "infinite recursion" Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: C stack usage is too close to the limit Error: segfault from C stack overflow Execution halted With optimization (-xO5 -xautopar) -- > ## bad infinite recursion / on.exit / ... interactions > bar <- function() 1+1 > foo <- function() { on.exit(bar()); foo() } > foo() # now simple "infinite recursion" *** caught segfault *** address fd7fff3fff90, cause 'memory not mapped' Traceback: 1: foo() ... 2489: foo() aborting ... With optimization (-xO5 -xautopar) but unpatched 2.10.1 --- > ## bad infinite recursion / on.exit / ... interactions > bar <- function() 1+1 > foo <- function() { on.exit(bar()); foo() } > foo() # now simple "infinite recursion" Error: evaluation nested too deeply: infinite recursion / options(expressions=)? > Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 02/01/2010 3:16 PM, Laurent Gautier wrote: On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. Yes, I was referring to RecursiveRelease. Sorry if this was not clear. What are the chances for a patch to be accepted ? At first sight(*), making that tail recursion an iterative function is not a major undertaking, and reviewing the patch be fairly straightforward... but I can always use that time otherwise if the answer to the question is "nil". I don't think I would want to review such a patch (I don't know the memory manager well, I don't know that there is really a case where it matters enough to be worth doing), so I'd say if you don't get a message from a core member volunteering to do so, you should assume it won't be accepted. But that doesn't mean you shouldn't write the code for your own internal use and edification, and if you can put together a demo that shows it really makes a big difference in a realistic situation, you might get a different response. Duncan Murdoch L. Duncan Murdoch Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. Interesting idea, this would let one perform his/her own bookkeeping on the list/environment. How is R garbage collection checking contained objects ? (I am thinking performances here, and may be cyclic references). You don't really want to care because the GC is global for all objects (and cycles are supported by the GC in R) - so whether you keep it yourself or Preserve/Release is practically irrelevant (the protection stack is handled separately). As for keeping your own list -- if you really care about performance that much (to be honest in vast majority of cases it doesn't matter) you have to be careful how you implement it. Technically the fastest way is preallocated generic vector but it really depends on how you deal with the access so you can easily perform worse than Preserve/Release if you're not careful. As a side note - the best way (IMHO) to deal with all those issues is to use external pointers because a) you get very efficient C finalizers b) you can directly (and very efficiently) tie lifespan of other objects to the same SEXP and c) as guardians they can nicely track other objects that hold them. Cheers, Simon L. Duncan Murdoch HTH, L. Thanks, Romain [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ [2] http://r-forge.r-project.org/plugins/scmsvn/v
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 01/02/2010 11:12 PM, Duncan Murdoch wrote: On 02/01/2010 3:16 PM, Laurent Gautier wrote: On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. Yes, I was referring to RecursiveRelease. Sorry if this was not clear. What are the chances for a patch to be accepted ? At first sight(*), making that tail recursion an iterative function is not a major undertaking, and reviewing the patch be fairly straightforward... but I can always use that time otherwise if the answer to the question is "nil". I don't think I would want to review such a patch (I don't know the memory manager well, I don't know that there is really a case where it matters enough to be worth doing), so I'd say if you don't get a message from a core member volunteering to do so, you should assume it won't be accepted. But that doesn't mean you shouldn't write the code for your own internal use and edification, and if you can put together a demo that shows it really makes a big difference in a realistic situation, you might get a different response. Duncan Murdoch From what I understand, this has little to do with the memory manager, and resumes to the simple problem "how to remove an object from a list". Something like this, using the amazing inline and inspect packages: require( inline ) require( inspect ) remover <- cfunction(signature( list = "language", object = "environment" ), ' if( !isNull( list ) ){ SEXP x = list ; SEXP y ; while( CAR(x) != object && CADR(x) != R_NilValue ){ y = x ; x = CDR(x) ; } if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; } return list ; ', Rcpp=FALSE, verbose=FALSE ) e <- new.env() call <- call( "foo", e, e, 1:10, 3 ) call # inspect( call ) result <- remover( call ,e ) result # inspect( result ) gives this : foo(10, , 0, , 1) @0x9f4e0d0 06 LANGSXP [NAM(2)] @0x9f4e204 01 SYMSXP [] "foo" @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 01/02/2010 11:41 PM, Romain Francois wrote: On 01/02/2010 11:12 PM, Duncan Murdoch wrote: On 02/01/2010 3:16 PM, Laurent Gautier wrote: On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. Yes, I was referring to RecursiveRelease. Sorry if this was not clear. What are the chances for a patch to be accepted ? At first sight(*), making that tail recursion an iterative function is not a major undertaking, and reviewing the patch be fairly straightforward... but I can always use that time otherwise if the answer to the question is "nil". I don't think I would want to review such a patch (I don't know the memory manager well, I don't know that there is really a case where it matters enough to be worth doing), so I'd say if you don't get a message from a core member volunteering to do so, you should assume it won't be accepted. But that doesn't mean you shouldn't write the code for your own internal use and edification, and if you can put together a demo that shows it really makes a big difference in a realistic situation, you might get a different response. Duncan Murdoch From what I understand, this has little to do with the memory manager, and resumes to the simple problem "how to remove an object from a list". Something like this, using the amazing inline and inspect packages: require( inline ) require( inspect ) remover <- cfunction(signature( list = "language", object = "environment" ), ' if( !isNull( list ) ){ SEXP x = list ; SEXP y ; while( CAR(x) != object && CADR(x) != R_NilValue ){ y = x ; x = CDR(x) ; } if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; } return list ; ', Rcpp=FALSE, verbose=FALSE ) e <- new.env() call <- call( "foo", e, e, 1:10, 3 ) call # inspect( call ) result <- remover( call ,e ) result # inspect( result ) gives this : foo(10, , 0, , 1) @0x9f4e0d0 06 LANGSXP [NAM(2)] @0x9f4e204 01 SYMSXP [] "foo" @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709) NULL == foo(10, 0, , 1) @0x9f4e0d0 06 LANGSXP [NAM(2)] @0x9f4e204 01 SYMSXP [] "foo" @0xa907b7
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 1/2/10 11:41 PM, Romain Francois wrote: On 01/02/2010 11:12 PM, Duncan Murdoch wrote: On 02/01/2010 3:16 PM, Laurent Gautier wrote: On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) (...) I don't think I would want to review such a patch (I don't know the memory manager well, I don't know that there is really a case where it matters enough to be worth doing), so I'd say if you don't get a message from a core member volunteering to do so, you should assume it won't be accepted. But that doesn't mean you shouldn't write the code for your own internal use and edification, and if you can put together a demo that shows it really makes a big difference in a realistic situation, you might get a different response. Duncan Murdoch From what I understand, this has little to do with the memory manager, and resumes to the simple problem "how to remove an object from a list". I would generalize even further, up to: "how to make a tail recursion into an interation" (and that often boils down to writing a for loop - little edification to get from it, I suppose... one probably wants to write it to see it included). I see below that you are living up to the enthusiasm claimed, and had a shot at it (and even did benchmark to demonstrate the expected benefit - impressive). Let's see what happens with it... L. Something like this, using the amazing inline and inspect packages: require( inline ) require( inspect ) remover <- cfunction(signature( list = "language", object = "environment" ), ' if( !isNull( list ) ){ SEXP x = list ; SEXP y ; while( CAR(x) != object && CADR(x) != R_NilValue ){ y = x ; x = CDR(x) ; } if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; } return list ; ', Rcpp=FALSE, verbose=FALSE ) e <- new.env() call <- call( "foo", e, e, 1:10, 3 ) call # inspect( call ) result <- remover( call ,e ) result # inspect( result ) gives this : foo(10, , 0, , 1) @0x9f4e0d0 06 LANGSXP [NAM(2)] @0x9f4e204 01 SYMSXP [] "foo" @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709) NULL == foo(10, 0, , 1) @0x9f4e0d0 06 LANGSXP [NAM(2)] @0x9f4e204 01 SYMSXP [] "foo" @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056) @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344) @0x9f4d564 04 ENVSXP [NAM(2)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709) NULL so it boils down to this as a replacement to RecursiveRelease : /** * Removes the first instance of object with the list */ static SEXP RemoveFromList( SEXP object, SEXP list){ if( !isNull( list ) ){ SEXP x = list ; SEXP y ; while( CAR(x) != object && CADR(x) != R_NilValue ){ y = x ; x = CDR(x) ; } if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; } return list ; } Romain __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 83, Issue 2
On 02/01/2010 6:35 PM, Laurent Gautier wrote: On 1/2/10 11:41 PM, Romain Francois wrote: On 01/02/2010 11:12 PM, Duncan Murdoch wrote: On 02/01/2010 3:16 PM, Laurent Gautier wrote: On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) (...) I don't think I would want to review such a patch (I don't know the memory manager well, I don't know that there is really a case where it matters enough to be worth doing), so I'd say if you don't get a message from a core member volunteering to do so, you should assume it won't be accepted. But that doesn't mean you shouldn't write the code for your own internal use and edification, and if you can put together a demo that shows it really makes a big difference in a realistic situation, you might get a different response. Duncan Murdoch From what I understand, this has little to do with the memory manager, and resumes to the simple problem "how to remove an object from a list". I would generalize even further, up to: "how to make a tail recursion into an interation" (and that often boils down to writing a for loop - little edification to get from it, I suppose... one probably wants to write it to see it included). Just one followup: as Luke mentioned, I was wrong, it isn't really a tail recursion (which is why gcc didn't optimize it away). Duncan Murdoch I see below that you are living up to the enthusiasm claimed, and had a shot at it (and even did benchmark to demonstrate the expected benefit - impressive). Let's see what happens with it... L. Something like this, using the amazing inline and inspect packages: require( inline ) require( inspect ) remover <- cfunction(signature( list = "language", object = "environment" ), ' if( !isNull( list ) ){ SEXP x = list ; SEXP y ; while( CAR(x) != object && CADR(x) != R_NilValue ){ y = x ; x = CDR(x) ; } if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; } return list ; ', Rcpp=FALSE, verbose=FALSE ) e <- new.env() call <- call( "foo", e, e, 1:10, 3 ) call # inspect( call ) result <- remover( call ,e ) result # inspect( result ) gives this : foo(10, , 0, , 1) @0x9f4e0d0 06 LANGSXP [NAM(2)] @0x9f4e204 01 SYMSXP [] "foo" @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344) @0x9f4d564 04 ENVSXP [NAM(1)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709) NULL == foo(10, 0, , 1) @0x9f4e0d0 06 LANGSXP [NAM(2)] @0x9f4e204 01 SYMSXP [] "foo" @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056) @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344) @0x9f4d564 04 ENVSXP [NAM(2)] FRAME: @0x9e94a10 00 NILSXP [MARK,NAM(2)] ENCLOS: @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)] HASHTAB: @0x9e94a10 00 NILSXP [MARK,NAM(2)] @0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709) NULL so it boils down to this as a replacement to RecursiveRelease : /** * Removes the first instance of object with the list */ static SEXP RemoveFromList( SEXP object, SEXP list){ if( !isNull( list ) ){ SEXP x = list ; SEXP y ; while( CAR(x) != object && CADR(x) != R_NilValue ){ y = x ; x = CDR(x) ; } if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; } return list ; } Romain __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] R object protection [Was: R-devel Digest, Vol 83, Issue 2]
On Jan 2, 2010, at 4:08 PM, Laurent Gautier wrote: > On 1/2/10 8:28 PM, Simon Urbanek wrote: >> >> On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: >> >>> On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: > [Disclaimer: what is below reflects my understanding from > reading the R source, others will correct where deemed > necessary] > > On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: >>> > (...) > Another possibility is to maintain your own list or environment of objects, and just protect/preserve the list as a whole. >>> >>> Interesting idea, this would let one perform his/her own >>> bookkeeping on the list/environment. How is R garbage collection >>> checking contained objects ? (I am thinking performances here, and >>> may be cyclic references). >>> >> >> You don't really want to care because the GC is global for all >> objects (and cycles are supported by the GC in R) - so whether you >> keep it yourself or Preserve/Release is practically irrelevant (the >> protection stack is handled separately). > > I guess that I'll have to know in order to understand that I don't really > want to care. ;-) > The garbage collector must somehow know if an object is available for > collection (and will have to check whether an object is PROTECTed or not, or > Preserved or not). > I suppose that upon being called the garbage collector will first look into > the PROTECTed and Preserved objects, mark them as unavailble for collection, > then recursively mark objects contained in them. > The GC marks recursively from all known roots of which Preserved list is one of many and all elements of the protection stack are treated as such as well (FWIW the Preserved and protected list are in that order the last two). Since this involves (by definition) all live objects it doesn't matter to which other object you assign the node. The only detail is that protection stack does not change the generation (since there is no real node to assign to). >> As for keeping your own list -- if you really care about performance >> that much (to be honest in vast majority of cases it doesn't matter) >> you have to be careful how you implement it. Technically the fastest >> way is preallocated generic vector but it really depends on how you >> deal with the access so you can easily perform worse than >> Preserve/Release if you're not careful. > > Releasing being of linear complexity, having few thousands of Preserved > objects not being anticipated as an extraordinary situation, and > Preserve/Release cycles being quite frequent, I start minding a bit about the > performance. Keeping my own list would let me experiment with various > strategies (and eventually offer > Sure - what I meant is that you have to optimize for one thing or the other so you have to be careful what you do. >> As a side note - the best way (IMHO) to deal with all those issues is >> to use external pointers because a) you get very efficient C >> finalizers b) you can directly (and very efficiently) tie lifespan of >> other objects to the same SEXP and c) as guardians they can nicely >> track other objects that hold them. > > Thanks. I am not certain to follow everything. Are you suggesting that rather > than Preserve-ing/Release-ing a list/environment that would act as a guardian > for several objects, one should use an external pointer (to an arbitrary C > pointer) ? In that case, how does one indicate that an external pointer acts > as a container ? > > Or are you suggesting that rather than Preserve-in/Release-ing R objects one > should use an external pointer acting as a proxy for a SEXP (argument "prot" > in R_MakeExternalPtr(void *p, SEXP tag, SEXP prot) ) ? > (but in that case the external pointer will itself have to go through > Preserve/Release cycles...) > I was guessing that you use this in conjunction with some C++ magic not just plain R objects and thus you have to deal with two life spans. From the other messages I think you are dealing with the simple situation of wrapping an R object as reference in the other system with explicit memory management (i.e. in C++ you have explicit new/delete life cycle) in which case you really don't need anything more than Preserve. It is more interesting when you want to track the life of R objects without imposing the life span - i.e when you want to know when an object in R is collected such that you can delete it from the other system (i.e. you don't explicitly retain it by the reference). Cheers, Simon > > > HTH, > > > L. > > > > >> Thanks, >> >> Romain >> >> [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/ >> [2] >> http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup >> >> >> >> > -- Romain Francois Professional R
[Rd] inspect [Was: R-devel Digest, Vol 83, Issue 2]
On Jan 2, 2010, at 5:41 PM, Romain Francois wrote: > On 01/02/2010 11:12 PM, Duncan Murdoch wrote: >> >> On 02/01/2010 3:16 PM, Laurent Gautier wrote: >>> On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: > On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: > >> On 1/2/10 5:56 PM, Duncan Murdoch wrote: >>> On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: >> (...) >> > I'd also be interested if there is some ideas on the relative > efficiency > of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack >>> UNPROTECT_PTR is also possible, which does a linear search through >>> the >>> stack and unprotects something possibly deep within it. There is also >>> REPROTECT which allows you to replace an entry within the stack. >>> >>> I would guess that UNPROTECT_PTR is more efficient than >>> RecursiveRelease >>> because it doesn't use so much stack space when it needs to go deep >>> into >>> the stack to release, but it is possible the compiler recognizes the >>> tail recursion and RecursiveRelease is implemented efficiently. In >>> that >>> case it could be more efficient than UNPROTECT_PTR, which has to move >>> all the other entries down to fill the newly vacated space. Really >>> the >>> only reliable way to answer efficiency questions like this is to try >>> both ways and see which works better in your application. >> Thanks. I did not know about UNPROTECT_PTR. >> I had concerns over the stack usage, but so far it did not prove too >> much of a problem. Still, why isn't the tail recursion explicitly >> made an iteration then ? This would take the "may be the compiler >> figures it out, may be not" variable out of the equation. >> > Careful - the protection stack (bookkeeping by R) has nothing to do > with the C function call stack hence it has nothing to do with the > compiler. The protection stack is global so usually you don't run out > of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. >>> >>> Yes, I was referring to RecursiveRelease. Sorry if this was not clear. >>> >>> What are the chances for a patch to be accepted ? At first sight(*), >>> making that tail recursion an iterative function is not a major >>> undertaking, and reviewing the patch be fairly straightforward... but >>> I can always use that time otherwise if the answer to the question is >>> "nil". >> >> I don't think I would want to review such a patch (I don't know the >> memory manager well, I don't know that there is really a case where it >> matters enough to be worth doing), so I'd say if you don't get a message >> from a core member volunteering to do so, you should assume it won't be >> accepted. But that doesn't mean you shouldn't write the code for your >> own internal use and edification, and if you can put together a demo >> that shows it really makes a big difference in a realistic situation, >> you might get a different response. >> >> Duncan Murdoch > > From what I understand, this has little to do with the memory manager, and > resumes to the simple problem "how to remove an object from a list". > > Something like this, using the amazing inline and inspect packages: > > require( inline ) > require( inspect ) > FWIW inspect is now part of R itself but as an internal function so you can either use it directly via .Internal or for compatibility inspect <- function(...) .Internal(inspect(...)) the arguments are (x, max.depth=-1, max.elements=5) [the last one is only supported in R-devel]. Cheers, Simon > remover <- cfunction(signature( list = "language", object = "environment" ), ' > if( !isNull( list ) ){ > SEXP x = list ; > SEXP y ; > while( CAR(x) != object && CADR(x) != R_NilValue ){ > y = x ; > x = CDR(x) ; > } >
[Rd] iterative list rm [Was: R-devel Digest, Vol 83, Issue 2]
On Jan 2, 2010, at 5:41 PM, Romain Francois wrote: > On 01/02/2010 11:12 PM, Duncan Murdoch wrote: >> >> On 02/01/2010 3:16 PM, Laurent Gautier wrote: >>> On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: > On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: > >> On 1/2/10 5:56 PM, Duncan Murdoch wrote: >>> On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: >> (...) >> > I'd also be interested if there is some ideas on the relative > efficiency > of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack >>> UNPROTECT_PTR is also possible, which does a linear search through >>> the >>> stack and unprotects something possibly deep within it. There is also >>> REPROTECT which allows you to replace an entry within the stack. >>> >>> I would guess that UNPROTECT_PTR is more efficient than >>> RecursiveRelease >>> because it doesn't use so much stack space when it needs to go deep >>> into >>> the stack to release, but it is possible the compiler recognizes the >>> tail recursion and RecursiveRelease is implemented efficiently. In >>> that >>> case it could be more efficient than UNPROTECT_PTR, which has to move >>> all the other entries down to fill the newly vacated space. Really >>> the >>> only reliable way to answer efficiency questions like this is to try >>> both ways and see which works better in your application. >> Thanks. I did not know about UNPROTECT_PTR. >> I had concerns over the stack usage, but so far it did not prove too >> much of a problem. Still, why isn't the tail recursion explicitly >> made an iteration then ? This would take the "may be the compiler >> figures it out, may be not" variable out of the equation. >> > Careful - the protection stack (bookkeeping by R) has nothing to do > with the C function call stack hence it has nothing to do with the > compiler. The protection stack is global so usually you don't run out > of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. >>> >>> Yes, I was referring to RecursiveRelease. Sorry if this was not clear. >>> >>> What are the chances for a patch to be accepted ? At first sight(*), >>> making that tail recursion an iterative function is not a major >>> undertaking, and reviewing the patch be fairly straightforward... but >>> I can always use that time otherwise if the answer to the question is >>> "nil". >> >> I don't think I would want to review such a patch (I don't know the >> memory manager well, I don't know that there is really a case where it >> matters enough to be worth doing), so I'd say if you don't get a message >> from a core member volunteering to do so, you should assume it won't be >> accepted. But that doesn't mean you shouldn't write the code for your >> own internal use and edification, and if you can put together a demo >> that shows it really makes a big difference in a realistic situation, >> you might get a different response. >> >> Duncan Murdoch > > From what I understand, this has little to do with the memory manager, and > resumes to the simple problem "how to remove an object from a list". > > Something like this, using the amazing inline and inspect packages: > > require( inline ) > require( inspect ) > > remover <- cfunction(signature( list = "language", object = "environment" ), ' > if( !isNull( list ) ){ > SEXP x = list ; > SEXP y ; > while( CAR(x) != object && CADR(x) != R_NilValue ){ > y = x ; > x = CDR(x) ; > } > if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; > } > return list ; That blows up if the object is the first in the list BTW. You probably meant more something like static SEXP ReleaseObj(SEXP object, SEXP list) { if (!isNull(list)) { if (CAR(list) != object) {
Re: [Rd] iterative list rm [Was: R-devel Digest, Vol 83, Issue 2]
On 01/03/2010 02:04 AM, Simon Urbanek wrote: On Jan 2, 2010, at 5:41 PM, Romain Francois wrote: On 01/02/2010 11:12 PM, Duncan Murdoch wrote: On 02/01/2010 3:16 PM, Laurent Gautier wrote: On 1/2/10 8:53 PM, Duncan Murdoch wrote: Simon Urbanek wrote: On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote: On 1/2/10 5:56 PM, Duncan Murdoch wrote: On 02/01/2010 11:36 AM, Laurent Gautier wrote: [Disclaimer: what is below reflects my understanding from reading the R source, others will correct where deemed necessary] On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote: (...) I'd also be interested if there is some ideas on the relative efficiency of the preserve/release mechanism compared to PROTECT/UNPROTECT. PROTECT/UNPROTECT is trading granularity for speed. It is a stack with only tow operations possible: - push 1 object into the stack - pull (unprotect) N last objects from the stack UNPROTECT_PTR is also possible, which does a linear search through the stack and unprotects something possibly deep within it. There is also REPROTECT which allows you to replace an entry within the stack. I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease because it doesn't use so much stack space when it needs to go deep into the stack to release, but it is possible the compiler recognizes the tail recursion and RecursiveRelease is implemented efficiently. In that case it could be more efficient than UNPROTECT_PTR, which has to move all the other entries down to fill the newly vacated space. Really the only reliable way to answer efficiency questions like this is to try both ways and see which works better in your application. Thanks. I did not know about UNPROTECT_PTR. I had concerns over the stack usage, but so far it did not prove too much of a problem. Still, why isn't the tail recursion explicitly made an iteration then ? This would take the "may be the compiler figures it out, may be not" variable out of the equation. Careful - the protection stack (bookkeeping by R) has nothing to do with the C function call stack hence it has nothing to do with the compiler. The protection stack is global so usually you don't run out of it unless something goes horribly wrong (=infinite loop). I think Laurent was referring to RecursiveRelease, which could use a lot of C stack if you choose to release something that is deep in the list, since it checks the head, and if that doesn't match, calls itself again on the rest of the list. (I checked, and at least one version of gcc doesn't recognize the tail recursion: it really does generate a recursive call.) Laurent asked why it isn't optimized to avoid the recursion: I think the answer is simply because it is so rarely used that nobody has bothered. Yes, I was referring to RecursiveRelease. Sorry if this was not clear. What are the chances for a patch to be accepted ? At first sight(*), making that tail recursion an iterative function is not a major undertaking, and reviewing the patch be fairly straightforward... but I can always use that time otherwise if the answer to the question is "nil". I don't think I would want to review such a patch (I don't know the memory manager well, I don't know that there is really a case where it matters enough to be worth doing), so I'd say if you don't get a message from a core member volunteering to do so, you should assume it won't be accepted. But that doesn't mean you shouldn't write the code for your own internal use and edification, and if you can put together a demo that shows it really makes a big difference in a realistic situation, you might get a different response. Duncan Murdoch From what I understand, this has little to do with the memory manager, and resumes to the simple problem "how to remove an object from a list". Something like this, using the amazing inline and inspect packages: require( inline ) require( inspect ) remover<- cfunction(signature( list = "language", object = "environment" ), ' if( !isNull( list ) ){ SEXP x = list ; SEXP y ; while( CAR(x) != object&& CADR(x) != R_NilValue ){ y = x ; x = CDR(x) ; } if( CAR(x) == object ) SETCDR(y, CDR(x) ) ; } return list ; That blows up if the object is the first in the list BTW. You probably meant more something like static SEXP ReleaseObj(SEXP object, SEXP list) { if (!isNull(list)) { if (CAR(list) != object) { SEXP c = list; while (!isNull(CDR(c))) { if (CADR(c) == object) { CDR(c) = CDDR(c); break; } c = CDR(c); } } else return CDR(list); } return list; }