[Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Romain Francois

Hello,

We are currently making lots of changes to Rcpp (see the open Rcpp 
mailing list if interested [1] in the details).


We are now using [2] R_PreserveObject and R_ReleaseObject to manage 
garbage collection instead of the PROTECT/UNPROTECT dance. This seems to 
work well, but I was wondering if there was documentation about it.


In particular, if we preserve the same SEXP twice (or more), should we 
implement some sort of reference counting ?


Reading the source (below, from memory.c) I think not, but some 
confirmation would help.


void R_PreserveObject(SEXP object)
{
R_PreciousList = CONS(object, R_PreciousList);
}

static SEXP RecursiveRelease(SEXP object, SEXP list)
{
if (!isNull(list)) {
if (object == CAR(list))
return CDR(list);
else
CDR(list) = RecursiveRelease(object, CDR(list));
}
return list;
}

void R_ReleaseObject(SEXP object)
{
R_PreciousList =  RecursiveRelease(object, R_PreciousList);
}


I'd also be interested if there is some ideas on the relative efficiency 
of the preserve/release mechanism compared to PROTECT/UNPROTECT.


Thanks,

Romain

[1] http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2] 
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/IW9B : C++ exceptions at the R level
|- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] How x[, 'colname1'] is implemented?

2010-01-02 Thread Barry Rowlingson
On Fri, Jan 1, 2010 at 9:40 PM, Peng Yu  wrote:

> I'm not complaining that it is not documented.

 Yes but you didn't answer my question. When you ask a question on
these (or any mailing lists) you should always say what efforts you've
made to answer the question. I first looked in the help for '[' but
didn't find anything there. If you had already done that and told us
then I wouldn't have wasted my time. If you'd gone on to say you'd
also looked in the source code, and in which files, then I wouldn't
have wasted my time looking in those files. Note that this is all for
your benefit as well because we'll be able to help you quicker!

>>  As Obi-wan Kenobi may have said in Star Wars: "Use the source, Luke!":

> Could you explain what ns and nx represent?

 No, you need to take Obi-wan's advice (or Seth Falcon's, who got to
this before me) and 'Use the source'. I would hope that since you have
an understanding of what a hashing algorithm is that you should also
have some understanding of programming, and even if not in C it's not
too hard to figure out.

 You could make the effort to look in subscript.c and work it out for yourself.

 If you are very interested in the low-level working of R then you
should compile R with debugging turned on, then run R in a debugger
and set a breakpoint in the array subscript function. Then you can
inspect the C variables when you run it.

Barry

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] loess() crashes R on my system

2010-01-02 Thread Dennis Murphy
Greetings and happy new year!

I am in the process of converting some of the old S-PLUS scripts from
Visualizing Data (Cleveland, 1993)
into lattice. In fact, I did most of it several years ago, and at the time,
all of the scripts that contained
loess() worked fine. Tonight, I ran most of the scripts again, but every one
that I tried with a
loess() call crashed R. I tried it in two sessions, one with an existing
.Rdata and in another
directory without one. The call below, isolated from its script, is one of
the culprits:

with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2,
parametric = "C", drop.square = "C", family="s"))

In response to the Posting Guide, when I say 'crash' I mean that the system
abnormally terminated
R when this line or any other one that invokes loess() is run. I'm running
the binary version of 2.10.1
on a Dell Studio 15 with Windows 7 Home Premium.

This is the first time I've ever experienced something like this in the
decade that I've used R. What's
more surprising is that it occurred with such a common function. Since I
haven't seen any postings
of this type, it's probably something on my end, but I don't know what.
Since it shut down the
program, I felt compelled to report it.

I am subscribed to r-help but not r-devel, so please cc any response to me.
Thanks in advance for
any assistance.

Dennis Murphy

> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.1

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] loess() crashes R on my system

2010-01-02 Thread Uwe Ligges



Dennis Murphy wrote:

Greetings and happy new year!

I am in the process of converting some of the old S-PLUS scripts from
Visualizing Data (Cleveland, 1993)
into lattice. In fact, I did most of it several years ago, and at the time,
all of the scripts that contained
loess() worked fine. Tonight, I ran most of the scripts again, but every one
that I tried with a
loess() call crashed R. I tried it in two sessions, one with an existing
.Rdata and in another
directory without one. The call below, isolated from its script, is one of
the culprits:

with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2,
parametric = "C", drop.square = "C", family="s"))



I just did

replicate(1000, with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2,
  parametric = "C", drop.square = "C", family="s")))

under Windows XP SP3 with R-2.10.1 / lattice_0.17-26 and it worked fine: 
Sorry, I cannot reproduce it.


You forgot to report your lattice version below.

Best,
Uwe Ligges





In response to the Posting Guide, when I say 'crash' I mean that the system
abnormally terminated
R when this line or any other one that invokes loess() is run. I'm running
the binary version of 2.10.1
on a Dell Studio 15 with Windows 7 Home Premium.

This is the first time I've ever experienced something like this in the
decade that I've used R. What's
more surprising is that it occurred with such a common function. Since I
haven't seen any postings
of this type, it's probably something on my end, but I don't know what.
Since it shut down the
program, I felt compelled to report it.

I am subscribed to r-help but not r-devel, so please cc any response to me.
Thanks in advance for
any assistance.

Dennis Murphy


sessionInfo()

R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.1

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] loess() crashes R on my system

2010-01-02 Thread Peter Ehlers

Dennis,

I have no problems with this on my DELL laptop (still running
Vista) with R version 2.10.1 Patched (2009-12-21 r50796).

 -Peter Ehlers

Dennis Murphy wrote:

Greetings and happy new year!

I am in the process of converting some of the old S-PLUS scripts from
Visualizing Data (Cleveland, 1993)
into lattice. In fact, I did most of it several years ago, and at the time,
all of the scripts that contained
loess() worked fine. Tonight, I ran most of the scripts again, but every one
that I tried with a
loess() call crashed R. I tried it in two sessions, one with an existing
.Rdata and in another
directory without one. The call below, isolated from its script, is one of
the culprits:

with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2,
parametric = "C", drop.square = "C", family="s"))

In response to the Posting Guide, when I say 'crash' I mean that the system
abnormally terminated
R when this line or any other one that invokes loess() is run. I'm running
the binary version of 2.10.1
on a Dell Studio 15 with Windows 7 Home Premium.

This is the first time I've ever experienced something like this in the
decade that I've used R. What's
more surprising is that it occurred with such a common function. Since I
haven't seen any postings
of this type, it's probably something on my end, but I don't know what.
Since it shut down the
program, I felt compelled to report it.

I am subscribed to r-help but not r-devel, so please cc any response to me.
Thanks in advance for
any assistance.

Dennis Murphy


sessionInfo()

R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.1

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Peter Ehlers
University of Calgary
403.202.3921

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] loess() crashes R on my system

2010-01-02 Thread Prof Brian Ripley
The posting guide also applies to R-devel: please provide a complete 
reproducible example and don't send HTML, as requested there.


On Fri, 1 Jan 2010, Dennis Murphy wrote:


Greetings and happy new year!

I am in the process of converting some of the old S-PLUS scripts 
from Visualizing Data (Cleveland, 1993) into lattice. In fact, I did 
most of it several years ago, and at the time, all of the scripts 
that contained loess() worked fine. Tonight, I ran most of the 
scripts again, but every one that I tried with a loess() call 
crashed R. I tried it in two sessions, one with an existing .Rdata 
and in another directory without one. The call below, isolated from 
its script, is one of the culprits:


with(ethanol, loess(NOx ~ C * E, span = 1/3, degree = 2,
   parametric = "C", drop.square = "C", family="s"))


What is 'ethanol'?  Is it the data set in package lattice, for 
example?  (If so, this works for me everywhere, including Windows XP.)


In response to the Posting Guide, when I say 'crash' I mean that the 
system abnormally terminated R when this line or any other one that 
invokes loess() is run. I'm running the binary version of 2.10.1 on 
a Dell Studio 15 with Windows 7 Home Premium.


If you mean any call to loess(), the problem is your machine (or at 
least your precise version of the OS) since example(loess) has been 
run on the binary version of R before distribution.  The examples are 
always a good place to start testing.)


Problems specific to the Windows binary (and we don't know that as yet 
but it seems likely) should be reported to the R-windows address.


This is the first time I've ever experienced something like this in 
the decade that I've used R. What's more surprising is that it 
occurred with such a common function. Since I haven't seen any 
postings of this type, it's probably something on my end, but I 
don't know what. Since it shut down the program, I felt compelled to 
report it.


You should be able to get a problem report under those circumstances, 
usually as an option on the dialog box -- and you will need to report 
that with the reproducible example.



I am subscribed to r-help but not r-devel, so please cc any response to me.
Thanks in advance for
any assistance.

Dennis Murphy


sessionInfo()

R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.1

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Laurent Gautier
[Disclaimer: what is below reflects my understanding from reading the R 
source, others will correct where deemed necessary]


On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:


Hello,

We are currently making lots of changes to Rcpp (see the open Rcpp
mailing list if interested [1] in the details).

We are now using [2] R_PreserveObject and R_ReleaseObject to manage
garbage collection instead of the PROTECT/UNPROTECT dance. This seems to
work well, but I was wondering if there was documentation about it.


The most precise technical documentation is in memory.c
PROTECT is an alias for Rf_protect, itself an alias for
SEXP protect(SEXP s);
and uses a stack (R_PPStack) to store protected objects.


In particular, if we preserve the same SEXP twice (or more), should we
implement some sort of reference counting ?


This depends on the requirements for your system.

For example, in rpy2 I added a reference counting layer(*) because I 
wanted to allow several Python objects to share the same underlying R 
object, but that's not currently(*) counting how many times an object 
should be freed.
(*: imperfect, but currently doing a very decent job - details upon 
request).


That kind of feature could be provided by R's C-level API, since this 
could be seen of general use as well as give an opportunity to improve 
the performances of the R_PreservedObject/R_ReleaseObject duo whenever a 
lot of objects are protected and/or external code is 
protecting/releasing objects through a FIFO proxy.




Reading the source (below, from memory.c) I think not, but some
confirmation would help.


I understand the code in memory.c like an object preserved twice needs 
to be freed twice: R_PreciousList is just a (linked) list, and 
"R_PreserveObject(object)" adds the object to the beginning of the list 
while "R_ReleaseObject(object)" removes the first "object" found from 
the list.





void R_PreserveObject(SEXP object)
{
  R_PreciousList = CONS(object, R_PreciousList);
}

static SEXP RecursiveRelease(SEXP object, SEXP list)
{
  if (!isNull(list)) {
if (object == CAR(list))
return CDR(list);
else
CDR(list) = RecursiveRelease(object, CDR(list));
  }
  return list;
}

void R_ReleaseObject(SEXP object)
{
  R_PreciousList =  RecursiveRelease(object, R_PreciousList);
}


I'd also be interested if there is some ideas on the relative efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.


PROTECT/UNPROTECT is trading granularity for speed. It is a stack with 
only tow operations possible:

- push 1 object into the stack
- pull (unprotect) N last objects from the stack


HTH,


L.





Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup

-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions
at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Romain Francois

Thanks.

On 01/02/2010 05:36 PM, Laurent Gautier wrote:


[Disclaimer: what is below reflects my understanding from reading the R
source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:


Hello,

We are currently making lots of changes to Rcpp (see the open Rcpp
mailing list if interested [1] in the details).

We are now using [2] R_PreserveObject and R_ReleaseObject to manage
garbage collection instead of the PROTECT/UNPROTECT dance. This seems to
work well, but I was wondering if there was documentation about it.


The most precise technical documentation is in memory.c
PROTECT is an alias for Rf_protect, itself an alias for
SEXP protect(SEXP s);
and uses a stack (R_PPStack) to store protected objects.


In particular, if we preserve the same SEXP twice (or more), should we
implement some sort of reference counting ?


This depends on the requirements for your system.


We wrap up SEXP into a C++ class that in particular manages itself 
preserving and releasing the object to garbage collection.



For example, in rpy2 I added a reference counting layer(*) because I
wanted to allow several Python objects to share the same underlying R
object, but that's not currently(*) counting how many times an object
should be freed.
(*: imperfect, but currently doing a very decent job - details upon
request).

That kind of feature could be provided by R's C-level API, since this
could be seen of general use as well as give an opportunity to improve
the performances of the R_PreservedObject/R_ReleaseObject duo whenever a
lot of objects are protected and/or external code is
protecting/releasing objects through a FIFO proxy.



Reading the source (below, from memory.c) I think not, but some
confirmation would help.


I understand the code in memory.c like an object preserved twice needs
to be freed twice: R_PreciousList is just a (linked) list, and
"R_PreserveObject(object)" adds the object to the beginning of the list
while "R_ReleaseObject(object)" removes the first "object" found from
the list.


That's what I understand of it too. It fits nicely into C++ assignment 
operator and copy constructor.



void R_PreserveObject(SEXP object)
{
R_PreciousList = CONS(object, R_PreciousList);
}

static SEXP RecursiveRelease(SEXP object, SEXP list)
{
if (!isNull(list)) {
if (object == CAR(list))
return CDR(list);
else
CDR(list) = RecursiveRelease(object, CDR(list));
}
return list;
}

void R_ReleaseObject(SEXP object)
{
R_PreciousList = RecursiveRelease(object, R_PreciousList);
}


I'd also be interested if there is some ideas on the relative efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.


PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack


HTH,


L.





Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/IW9B : C++ exceptions at the R level
|- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Duncan Murdoch

On 02/01/2010 11:36 AM, Laurent Gautier wrote:
[Disclaimer: what is below reflects my understanding from reading the R 
source, others will correct where deemed necessary]


On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

Hello,

We are currently making lots of changes to Rcpp (see the open Rcpp
mailing list if interested [1] in the details).

We are now using [2] R_PreserveObject and R_ReleaseObject to manage
garbage collection instead of the PROTECT/UNPROTECT dance. This seems to
work well, but I was wondering if there was documentation about it.


The most precise technical documentation is in memory.c
PROTECT is an alias for Rf_protect, itself an alias for
SEXP protect(SEXP s);
and uses a stack (R_PPStack) to store protected objects.


In particular, if we preserve the same SEXP twice (or more), should we
implement some sort of reference counting ?


This depends on the requirements for your system.

For example, in rpy2 I added a reference counting layer(*) because I 
wanted to allow several Python objects to share the same underlying R 
object, but that's not currently(*) counting how many times an object 
should be freed.
(*: imperfect, but currently doing a very decent job - details upon 
request).


That kind of feature could be provided by R's C-level API, since this 
could be seen of general use as well as give an opportunity to improve 
the performances of the R_PreservedObject/R_ReleaseObject duo whenever a 
lot of objects are protected and/or external code is 
protecting/releasing objects through a FIFO proxy.




Reading the source (below, from memory.c) I think not, but some
confirmation would help.


I understand the code in memory.c like an object preserved twice needs 
to be freed twice: R_PreciousList is just a (linked) list, and 
"R_PreserveObject(object)" adds the object to the beginning of the list 
while "R_ReleaseObject(object)" removes the first "object" found from 
the list.





void R_PreserveObject(SEXP object)
{
  R_PreciousList = CONS(object, R_PreciousList);
}

static SEXP RecursiveRelease(SEXP object, SEXP list)
{
  if (!isNull(list)) {
if (object == CAR(list))
return CDR(list);
else
CDR(list) = RecursiveRelease(object, CDR(list));
  }
  return list;
}

void R_ReleaseObject(SEXP object)
{
  R_PreciousList =  RecursiveRelease(object, R_PreciousList);
}


I'd also be interested if there is some ideas on the relative efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.


PROTECT/UNPROTECT is trading granularity for speed. It is a stack with 
only tow operations possible:

- push 1 object into the stack
- pull (unprotect) N last objects from the stack


UNPROTECT_PTR is also possible, which does a linear search through the 
stack and unprotects something possibly deep within it.  There is also 
REPROTECT which allows you to replace an entry within the stack.


I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease 
because it doesn't use so much stack space when it needs to go deep into 
the stack to release, but it is possible the compiler recognizes the 
tail recursion and RecursiveRelease is implemented efficiently.  In that 
case it could be more efficient than UNPROTECT_PTR, which has to move 
all the other entries down to fill the newly vacated space.  Really the 
only reliable way to answer efficiency questions like this is to try

both ways and see which works better in your application.

Another possibility is to maintain your own list or environment of 
objects, and just protect/preserve the list as a whole.


Duncan Murdoch




HTH,


L.





Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup

-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions
at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Simon Urbanek

On Jan 2, 2010, at 5:07 AM, Romain Francois wrote:

> Hello,
> 
> We are currently making lots of changes to Rcpp (see the open Rcpp mailing 
> list if interested [1] in the details).
> 
> We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage 
> collection instead of the PROTECT/UNPROTECT dance. This seems to work well, 
> but I was wondering if there was documentation about it.
> 

I don't think so - the only documentation is the comment in the source.


> In particular, if we preserve the same SEXP twice (or more), should we 
> implement some sort of reference counting ?
> 

Preserve/Release are for managing objects that are supposed to survive past the 
call and are not tied to any other R object. PROTECT/UNPROTECT are for 
temporary preservation within a call.

Although you're right that Preserve/Release is effectively implemented as a 
stack at the moment it is not stated explicitly anywhere (this goes all the way 
back to R 0.64 so chances are that only Ross can comment..). However, for 
practical purposes it would be potentially dangerous to have it work like a 
flag because you can simply never know whether the same object was not already 
registered by some other code.


> Reading the source (below, from memory.c) I think not, but some confirmation 
> would help.
> 
> void R_PreserveObject(SEXP object)
> {
>R_PreciousList = CONS(object, R_PreciousList);
> }
> 
> static SEXP RecursiveRelease(SEXP object, SEXP list)
> {
>if (!isNull(list)) {
>   if (object == CAR(list))
>   return CDR(list);
>   else
>   CDR(list) = RecursiveRelease(object, CDR(list));
>}
>return list;
> }
> 
> void R_ReleaseObject(SEXP object)
> {
>R_PreciousList =  RecursiveRelease(object, R_PreciousList);
> }
> 
> 
> I'd also be interested if there is some ideas on the relative efficiency of 
> the preserve/release mechanism compared to PROTECT/UNPROTECT.
> 

PROTECT/UNPROTECT is more efficient because all it does is a pointer assignment 
-- Preserve has to allocate new node and fill it with all parts. On release the 
extra node is still floating in the GC pool etc.

Normally there is not really a question of choice - within a call you want to 
use PROTET/UNPROTECT and for anything else you simply cannot use it so you have 
to use Preserve/Release. As a side note Preserve/Release is merely a 
convenience call, it is often more efficient to simply assign the object to 
another object you have control of (which is all Preserve really does).

Cheers,
Simon

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Laurent Gautier

On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:


(...)



I'd also be interested if there is some ideas on the relative efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.


PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack


UNPROTECT_PTR is also possible, which does a linear search through the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.

>

I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease
because it doesn't use so much stack space when it needs to go deep into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.


Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too 
much of a problem. Still, why isn't the tail recursion explicitly made 
an iteration then ? This would take the "may be the compiler figures it 
out, may be not" variable out of the equation.



Another possibility is to maintain your own list or environment of
objects, and just protect/preserve the list as a whole.


Interesting idea, this would let one perform his/her own bookkeeping on 
the list/environment. How is R garbage collection checking contained 
objects ? (I am thinking performances here, and may be cyclic references).




L.



Duncan Murdoch




HTH,


L.





Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup


-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions
at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Romain Francois

On 01/02/2010 06:01 PM, Simon Urbanek wrote:



On Jan 2, 2010, at 5:07 AM, Romain Francois wrote:


Hello,

We are currently making lots of changes to Rcpp (see the open Rcpp mailing list 
if interested [1] in the details).

We are now using [2] R_PreserveObject and R_ReleaseObject to manage garbage 
collection instead of the PROTECT/UNPROTECT dance. This seems to work well, but 
I was wondering if there was documentation about it.



I don't think so - the only documentation is the comment in the source.



Fair enough. FWIW, some mention of it in the R-ints or R-exts could be 
valuable.




In particular, if we preserve the same SEXP twice (or more), should we 
implement some sort of reference counting ?



Preserve/Release are for managing objects that are supposed to survive past the 
call and are not tied to any other R object. PROTECT/UNPROTECT are for 
temporary preservation within a call.

Although you're right that Preserve/Release is effectively implemented as a 
stack at the moment it is not stated explicitly anywhere (this goes all the way 
back to R 0.64 so chances are that only Ross can comment..). However, for 
practical purposes it would be potentially dangerous to have it work like a 
flag because you can simply never know whether the same object was not already 
registered by some other code.



Reading the source (below, from memory.c) I think not, but some confirmation 
would help.

void R_PreserveObject(SEXP object)
{
R_PreciousList = CONS(object, R_PreciousList);
}

static SEXP RecursiveRelease(SEXP object, SEXP list)
{
if (!isNull(list)) {
if (object == CAR(list))
return CDR(list);
else
CDR(list) = RecursiveRelease(object, CDR(list));
}
return list;
}

void R_ReleaseObject(SEXP object)
{
R_PreciousList =  RecursiveRelease(object, R_PreciousList);
}


I'd also be interested if there is some ideas on the relative efficiency of the 
preserve/release mechanism compared to PROTECT/UNPROTECT.



PROTECT/UNPROTECT is more efficient because all it does is a pointer assignment 
-- Preserve has to allocate new node and fill it with all parts. On release the 
extra node is still floating in the GC pool etc.

Normally there is not really a question of choice - within a call you want to 
use PROTET/UNPROTECT and for anything else you simply cannot use it so you have 
to use Preserve/Release. As a side note Preserve/Release is merely a 
convenience call, it is often more efficient to simply assign the object to 
another object you have control of (which is all Preserve really does).

Cheers,
Simon


Thanks for this advice. This will make it easier to do the bookkeeping 
of how many objects we preserve, etc ...


Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/IW9B : C++ exceptions at the R level
|- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Romain Francois

On 01/02/2010 05:56 PM, Duncan Murdoch wrote:


On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

Hello,

We are currently making lots of changes to Rcpp (see the open Rcpp
mailing list if interested [1] in the details).

We are now using [2] R_PreserveObject and R_ReleaseObject to manage
garbage collection instead of the PROTECT/UNPROTECT dance. This seems to
work well, but I was wondering if there was documentation about it.


The most precise technical documentation is in memory.c
PROTECT is an alias for Rf_protect, itself an alias for
SEXP protect(SEXP s);
and uses a stack (R_PPStack) to store protected objects.


In particular, if we preserve the same SEXP twice (or more), should we
implement some sort of reference counting ?


This depends on the requirements for your system.

For example, in rpy2 I added a reference counting layer(*) because I
wanted to allow several Python objects to share the same underlying R
object, but that's not currently(*) counting how many times an object
should be freed.
(*: imperfect, but currently doing a very decent job - details upon
request).

That kind of feature could be provided by R's C-level API, since this
could be seen of general use as well as give an opportunity to improve
the performances of the R_PreservedObject/R_ReleaseObject duo whenever
a lot of objects are protected and/or external code is
protecting/releasing objects through a FIFO proxy.



Reading the source (below, from memory.c) I think not, but some
confirmation would help.


I understand the code in memory.c like an object preserved twice needs
to be freed twice: R_PreciousList is just a (linked) list, and
"R_PreserveObject(object)" adds the object to the beginning of the
list while "R_ReleaseObject(object)" removes the first "object" found
from the list.




void R_PreserveObject(SEXP object)
{
R_PreciousList = CONS(object, R_PreciousList);
}

static SEXP RecursiveRelease(SEXP object, SEXP list)
{
if (!isNull(list)) {
if (object == CAR(list))
return CDR(list);
else
CDR(list) = RecursiveRelease(object, CDR(list));
}
return list;
}

void R_ReleaseObject(SEXP object)
{
R_PreciousList = RecursiveRelease(object, R_PreciousList);
}


I'd also be interested if there is some ideas on the relative efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.


PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack


UNPROTECT_PTR is also possible, which does a linear search through the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.


Thanks; I've used those when I played with the parser's code (for 
highlight). It felt slightly harder to use than preserve/release since 
you have to maintain the protect index, etc ...


I think I'll go as you say below, just maintain my own precious list, 
the way Preserve/Release does it.


Romain


I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease
because it doesn't use so much stack space when it needs to go deep into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.

Another possibility is to maintain your own list or environment of
objects, and just protect/preserve the list as a whole.

Duncan Murdoch


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/IW9B : C++ exceptions at the R level
|- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Laurent Gautier

On 1/2/10 5:50 PM, Romain Francois wrote:

Thanks.

On 01/02/2010 05:36 PM, Laurent Gautier wrote:


[Disclaimer: what is below reflects my understanding from reading the R
source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:


(...)




In particular, if we preserve the same SEXP twice (or more), should we
implement some sort of reference counting ?


This depends on the requirements for your system.


We wrap up SEXP into a C++ class that in particular manages itself
preserving and releasing the object to garbage collection.


If you do not allow several C++ instances to share the same SEXP, you 
should not need reference counting. In the case of rpy2, this was made 
necessary for allowing inheritance (as nested calls to constructors can 
cause a transient sharing of the given SEXP across several Python objects).



(...)



L.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Simon Urbanek

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:

> On 1/2/10 5:56 PM, Duncan Murdoch wrote:
>> On 02/01/2010 11:36 AM, Laurent Gautier wrote:
>>> [Disclaimer: what is below reflects my understanding from reading the
>>> R source, others will correct where deemed necessary]
>>> 
>>> On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:
> 
> (...)
> 
 
 I'd also be interested if there is some ideas on the relative efficiency
 of the preserve/release mechanism compared to PROTECT/UNPROTECT.
>>> 
>>> PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
>>> only tow operations possible:
>>> - push 1 object into the stack
>>> - pull (unprotect) N last objects from the stack
>> 
>> UNPROTECT_PTR is also possible, which does a linear search through the
>> stack and unprotects something possibly deep within it. There is also
>> REPROTECT which allows you to replace an entry within the stack.
> >
>> I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease
>> because it doesn't use so much stack space when it needs to go deep into
>> the stack to release, but it is possible the compiler recognizes the
>> tail recursion and RecursiveRelease is implemented efficiently. In that
>> case it could be more efficient than UNPROTECT_PTR, which has to move
>> all the other entries down to fill the newly vacated space. Really the
>> only reliable way to answer efficiency questions like this is to try
>> both ways and see which works better in your application.
> 
> Thanks. I did not know about UNPROTECT_PTR.
> I had concerns over the stack usage, but so far it did not prove too much of 
> a problem. Still, why isn't the tail recursion explicitly made an iteration 
> then ? This would take the "may be the compiler figures it out, may be not" 
> variable out of the equation.
> 

Careful - the protection stack (bookkeeping by R) has nothing to do with the C 
function call stack hence it has nothing to do with the compiler. The 
protection stack is global so usually you don't run out of it unless something 
goes horribly wrong (=infinite loop).


>> Another possibility is to maintain your own list or environment of
>> objects, and just protect/preserve the list as a whole.
> 
> Interesting idea, this would let one perform his/her own bookkeeping on the 
> list/environment. How is R garbage collection checking contained objects ? (I 
> am thinking performances here, and may be cyclic references).
> 

You don't really want to care because the GC is global for all objects (and 
cycles are supported by the GC in R) - so whether you keep it yourself or 
Preserve/Release is practically irrelevant (the protection stack is handled 
separately).

As for keeping your own list -- if you really care about performance that much 
(to be honest in vast majority of cases it doesn't matter) you have to be 
careful how you implement it. Technically the fastest way is preallocated 
generic vector but it really depends on how you deal with the access so you can 
easily perform worse than Preserve/Release if you're not careful.

As a side note - the best way (IMHO) to deal with all those issues is to use 
external pointers because a) you get very efficient C finalizers b) you can 
directly (and very efficiently) tie lifespan of other objects to the same SEXP 
and c) as guardians they can nicely track other objects that hold them.

Cheers,
Simon



> 
> 
> L.
> 
> 
>> Duncan Murdoch
>> 
>>> 
>>> 
>>> HTH,
>>> 
>>> 
>>> L.
>>> 
>>> 
>>> 
>>> 
 Thanks,
 
 Romain
 
 [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
 [2]
 http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup
 
 
 -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
 http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions
 at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects
 `- http://tr.im/HlX9 : new package : bibtex
>>> 
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Duncan Murdoch

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:

  

On 1/2/10 5:56 PM, Duncan Murdoch wrote:


On 02/01/2010 11:36 AM, Laurent Gautier wrote:
  

[Disclaimer: what is below reflects my understanding from reading the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:


(...)



I'd also be interested if there is some ideas on the relative efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.
  

PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack


UNPROTECT_PTR is also possible, which does a linear search through the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.

I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease
because it doesn't use so much stack space when it needs to go deep into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.
  

Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too much of a problem. 
Still, why isn't the tail recursion explicitly made an iteration then ? This would take 
the "may be the compiler figures it out, may be not" variable out of the 
equation.




Careful - the protection stack (bookkeeping by R) has nothing to do with the C 
function call stack hence it has nothing to do with the compiler. The 
protection stack is global so usually you don't run out of it unless something 
goes horribly wrong (=infinite loop).
  


I think Laurent was referring to RecursiveRelease, which could use a lot 
of C stack if you choose to release something that is deep in the list, 
since it checks the head, and if that doesn't match, calls itself again 
on the rest of the list.  (I checked, and at least one version of gcc 
doesn't recognize the tail recursion:  it really does generate a 
recursive call.)


Laurent asked why it isn't optimized to avoid the recursion:  I think 
the answer is simply because it is so rarely used that nobody has bothered.


Duncan Murdoch


  

Another possibility is to maintain your own list or environment of
objects, and just protect/preserve the list as a whole.
  

Interesting idea, this would let one perform his/her own bookkeeping on the 
list/environment. How is R garbage collection checking contained objects ? (I 
am thinking performances here, and may be cyclic references).




You don't really want to care because the GC is global for all objects (and 
cycles are supported by the GC in R) - so whether you keep it yourself or 
Preserve/Release is practically irrelevant (the protection stack is handled 
separately).

As for keeping your own list -- if you really care about performance that much 
(to be honest in vast majority of cases it doesn't matter) you have to be 
careful how you implement it. Technically the fastest way is preallocated 
generic vector but it really depends on how you deal with the access so you can 
easily perform worse than Preserve/Release if you're not careful.

As a side note - the best way (IMHO) to deal with all those issues is to use 
external pointers because a) you get very efficient C finalizers b) you can 
directly (and very efficiently) tie lifespan of other objects to the same SEXP 
and c) as guardians they can nicely track other objects that hold them.

Cheers,
Simon



  

L.




Duncan Murdoch

  

HTH,


L.






Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup


-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ exceptions
at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex
  

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel








__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread luke

On Sat, 2 Jan 2010, Duncan Murdoch wrote:


Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:



On 1/2/10 5:56 PM, Duncan Murdoch wrote:


On 02/01/2010 11:36 AM, Laurent Gautier wrote:


[Disclaimer: what is below reflects my understanding from reading the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:


(...)


I'd also be interested if there is some ideas on the relative 
efficiency

of the preserve/release mechanism compared to PROTECT/UNPROTECT.


PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack


UNPROTECT_PTR is also possible, which does a linear search through the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.

I would guess that UNPROTECT_PTR is more efficient than RecursiveRelease
because it doesn't use so much stack space when it needs to go deep into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.


Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too much 
of a problem. Still, why isn't the tail recursion explicitly made an 
iteration then ? This would take the "may be the compiler figures it out, 
may be not" variable out of the equation.





Careful - the protection stack (bookkeeping by R) has nothing to do with 
the C function call stack hence it has nothing to do with the compiler. The 
protection stack is global so usually you don't run out of it unless 
something goes horribly wrong (=infinite loop).




I think Laurent was referring to RecursiveRelease, which could use a lot of C 
stack if you choose to release something that is deep in the list, since it 
checks the head, and if that doesn't match, calls itself again on the rest of 
the list.  (I checked, and at least one version of gcc doesn't recognize the 
tail recursion:  it really does generate a recursive call.)


I believe current gcc -O2 does optimize tail recursive calls (more
generally sibling calls) but RecursiveRelease isn't written as a tail
recursion -- the return vaule is used in an assignment.  In any case
it would probably make more sense to rewrite RecursiveRelease to not
use recursion at all, but I'm not motivated to do that at this point.

luke



Laurent asked why it isn't optimized to avoid the recursion:  I think the 
answer is simply because it is so rarely used that nobody has bothered.


Duncan Murdoch




Another possibility is to maintain your own list or environment of
objects, and just protect/preserve the list as a whole.

Interesting idea, this would let one perform his/her own bookkeeping on 
the list/environment. How is R garbage collection checking contained 
objects ? (I am thinking performances here, and may be cyclic references).





You don't really want to care because the GC is global for all objects (and 
cycles are supported by the GC in R) - so whether you keep it yourself or 
Preserve/Release is practically irrelevant (the protection stack is handled 
separately).


As for keeping your own list -- if you really care about performance that 
much (to be honest in vast majority of cases it doesn't matter) you have to 
be careful how you implement it. Technically the fastest way is 
preallocated generic vector but it really depends on how you deal with the 
access so you can easily perform worse than Preserve/Release if you're not 
careful.


As a side note - the best way (IMHO) to deal with all those issues is to 
use external pointers because a) you get very efficient C finalizers b) you 
can directly (and very efficiently) tie lifespan of other objects to the 
same SEXP and c) as guardians they can nicely track other objects that hold 
them.


Cheers,
Simon





L.




Duncan Murdoch



HTH,


L.






Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup


-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++ 
exceptions

at the R level |- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/m

Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Laurent Gautier

On 1/2/10 8:53 PM, Duncan Murdoch wrote:

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

(...)


I'd also be interested if there is some ideas on the relative
efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.

PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack

UNPROTECT_PTR is also possible, which does a linear search through the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.

I would guess that UNPROTECT_PTR is more efficient than
RecursiveRelease
because it doesn't use so much stack space when it needs to go deep
into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.

Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too
much of a problem. Still, why isn't the tail recursion explicitly
made an iteration then ? This would take the "may be the compiler
figures it out, may be not" variable out of the equation.



Careful - the protection stack (bookkeeping by R) has nothing to do
with the C function call stack hence it has nothing to do with the
compiler. The protection stack is global so usually you don't run out
of it unless something goes horribly wrong (=infinite loop).


I think Laurent was referring to RecursiveRelease, which could use a lot
of C stack if you choose to release something that is deep in the list,
since it checks the head, and if that doesn't match, calls itself again
on the rest of the list. (I checked, and at least one version of gcc
doesn't recognize the tail recursion: it really does generate a
recursive call.)

Laurent asked why it isn't optimized to avoid the recursion: I think the
answer is simply because it is so rarely used that nobody has bothered.


Yes, I was referring to RecursiveRelease. Sorry if this was not clear.

What are the chances for a patch to be accepted ? At first sight(*), 
making that tail recursion an iterative function is not a major 
undertaking, and reviewing the patch be fairly straightforward... but I 
can always use that time otherwise if the answer to the question is "nil".



L.




Duncan Murdoch



Another possibility is to maintain your own list or environment of
objects, and just protect/preserve the list as a whole.

Interesting idea, this would let one perform his/her own bookkeeping
on the list/environment. How is R garbage collection checking
contained objects ? (I am thinking performances here, and may be
cyclic references).



You don't really want to care because the GC is global for all objects
(and cycles are supported by the GC in R) - so whether you keep it
yourself or Preserve/Release is practically irrelevant (the protection
stack is handled separately).

As for keeping your own list -- if you really care about performance
that much (to be honest in vast majority of cases it doesn't matter)
you have to be careful how you implement it. Technically the fastest
way is preallocated generic vector but it really depends on how you
deal with the access so you can easily perform worse than
Preserve/Release if you're not careful.

As a side note - the best way (IMHO) to deal with all those issues is
to use external pointers because a) you get very efficient C
finalizers b) you can directly (and very efficiently) tie lifespan of
other objects to the same SEXP and c) as guardians they can nicely
track other objects that hold them.

Cheers,
Simon




L.



Duncan Murdoch


HTH,


L.





Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup



-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++
exceptions
at the R level |- http://tr.im/IlMh : CPP package: exposing C++
objects
`- http://tr.im/HlX9 : new package : bibtex

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel








__

Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Laurent Gautier

On 1/2/10 8:28 PM, Simon Urbanek wrote:


On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from
reading the R source, others will correct where deemed
necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:



(...)


Another possibility is to maintain your own list or environment
of objects, and just protect/preserve the list as a whole.


Interesting idea, this would let one perform his/her own
bookkeeping on the list/environment. How is R garbage collection
checking contained objects ? (I am thinking performances here, and
may be cyclic references).



You don't really want to care because the GC is global for all
objects (and cycles are supported by the GC in R) - so whether you
keep it yourself or Preserve/Release is practically irrelevant (the
protection stack is handled separately).


I guess that I'll have to know in order to understand that I don't 
really want to care. ;-)
The garbage collector must somehow know if an object is available for 
collection (and will have to check whether an object is PROTECTed or 
not, or Preserved or not).
I suppose that upon being called the garbage collector will first look 
into the PROTECTed and Preserved objects, mark them as unavailble for 
collection, then recursively mark objects contained in them.



As for keeping your own list -- if you really care about performance
that much (to be honest in vast majority of cases it doesn't matter)
you have to be careful how you implement it. Technically the fastest
way is preallocated generic vector but it really depends on how you
deal with the access so you can easily perform worse than
Preserve/Release if you're not careful.


Releasing being of linear complexity, having few thousands of Preserved 
objects not being anticipated as an extraordinary situation, and 
Preserve/Release cycles being quite frequent, I start minding a bit 
about the performance. Keeping my own list would let me experiment with 
various strategies (and eventually offer



As a side note - the best way (IMHO) to deal with all those issues is
to use external pointers because a) you get very efficient C
finalizers b) you can directly (and very efficiently) tie lifespan of
other objects to the same SEXP and c) as guardians they can nicely
track other objects that hold them.


Thanks. I am not certain to follow everything. Are you suggesting that 
rather than Preserve-ing/Release-ing a list/environment that would act 
as a guardian for several objects, one should use an external pointer 
(to an arbitrary C pointer) ? In that case, how does one indicate that 
an external pointer acts as a container ?


Or are you suggesting that rather than Preserve-in/Release-ing R objects 
one should use an external pointer acting as a proxy for a SEXP 
(argument "prot" in R_MakeExternalPtr(void *p, SEXP tag, SEXP prot) ) ?
(but in that case the external pointer will itself have to go through 
Preserve/Release cycles...)



Cheers,


Laurent


Cheers, Simon






L.



Duncan Murdoch




HTH,


L.





Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup





-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30

http://romainfrancois.blog.free.fr |- http://tr.im/IW9B : C++
exceptions at the R level |- http://tr.im/IlMh : CPP package:
exposing C++ objects `- http://tr.im/HlX9 : new package :
bibtex


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Whit Armstrong
Romain,

Is the use of UNPROTECT_PTR discouraged?  I wonder why you haven't
considered using it instead.

I have a similar project that uses a ref counting scheme and handles
the deletion of the shared object with UNPROTECT_PTR.  This method has
been working fine, but if there are reasons I should not be using it,
I would certainly like to know.

>From memory.c: /* "unprotect_ptr" remove pointer from somewhere in R_PPStack */

The code is very simple, it just walks backwards down the list until
it hits your object.  Unless you are adding an obscene number of items
to the protection stack in a given call, finding your object should be
very fast.

My example is here:
http://github.com/armstrtw/rabstraction/blob/master/R.backend.hpp

template
  Rbackend::~Rbackend() {
if(release_data_) {
  if(R_object_!=R_NilValue) {
UNPROTECT_PTR(R_object_);
  }
}
  }

I still think all these Rcpp projects should come together some day.
Dirk and I talked about combining about two years ago, but "real" work
got it the way.  I'll ping you off list for a follow up.

-Whit


On Sat, Jan 2, 2010 at 12:24 PM, Romain Francois
 wrote:
> On 01/02/2010 06:01 PM, Simon Urbanek wrote:
>>
>>
>> On Jan 2, 2010, at 5:07 AM, Romain Francois wrote:
>>
>>> Hello,
>>>
>>> We are currently making lots of changes to Rcpp (see the open Rcpp
>>> mailing list if interested [1] in the details).
>>>
>>> We are now using [2] R_PreserveObject and R_ReleaseObject to manage
>>> garbage collection instead of the PROTECT/UNPROTECT dance. This seems to
>>> work well, but I was wondering if there was documentation about it.
>>>
>>
>> I don't think so - the only documentation is the comment in the source.
>
>
> Fair enough. FWIW, some mention of it in the R-ints or R-exts could be
> valuable.
>
>
>>> In particular, if we preserve the same SEXP twice (or more), should we
>>> implement some sort of reference counting ?
>>>
>>
>> Preserve/Release are for managing objects that are supposed to survive
>> past the call and are not tied to any other R object. PROTECT/UNPROTECT are
>> for temporary preservation within a call.
>>
>> Although you're right that Preserve/Release is effectively implemented as
>> a stack at the moment it is not stated explicitly anywhere (this goes all
>> the way back to R 0.64 so chances are that only Ross can comment..).
>> However, for practical purposes it would be potentially dangerous to have it
>> work like a flag because you can simply never know whether the same object
>> was not already registered by some other code.
>>
>>
>>> Reading the source (below, from memory.c) I think not, but some
>>> confirmation would help.
>>>
>>> void R_PreserveObject(SEXP object)
>>> {
>>>    R_PreciousList = CONS(object, R_PreciousList);
>>> }
>>>
>>> static SEXP RecursiveRelease(SEXP object, SEXP list)
>>> {
>>>    if (!isNull(list)) {
>>>        if (object == CAR(list))
>>>            return CDR(list);
>>>        else
>>>            CDR(list) = RecursiveRelease(object, CDR(list));
>>>    }
>>>    return list;
>>> }
>>>
>>> void R_ReleaseObject(SEXP object)
>>> {
>>>    R_PreciousList =  RecursiveRelease(object, R_PreciousList);
>>> }
>>>
>>>
>>> I'd also be interested if there is some ideas on the relative efficiency
>>> of the preserve/release mechanism compared to PROTECT/UNPROTECT.
>>>
>>
>> PROTECT/UNPROTECT is more efficient because all it does is a pointer
>> assignment -- Preserve has to allocate new node and fill it with all parts.
>> On release the extra node is still floating in the GC pool etc.
>>
>> Normally there is not really a question of choice - within a call you want
>> to use PROTET/UNPROTECT and for anything else you simply cannot use it so
>> you have to use Preserve/Release. As a side note Preserve/Release is merely
>> a convenience call, it is often more efficient to simply assign the object
>> to another object you have control of (which is all Preserve really does).
>>
>> Cheers,
>> Simon
>
> Thanks for this advice. This will make it easier to do the bookkeeping of
> how many objects we preserve, etc ...
>
> Romain
>
> --
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> http://romainfrancois.blog.free.fr
> |- http://tr.im/IW9B : C++ exceptions at the R level
> |- http://tr.im/IlMh : CPP package: exposing C++ objects
> `- http://tr.im/HlX9 : new package : bibtex
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Dirk Eddelbuettel

On 2 January 2010 at 16:26, Whit Armstrong wrote:
| I still think all these Rcpp projects should come together some day.
| Dirk and I talked about combining about two years ago, but "real" work
| got it the way.  I'll ping you off list for a follow up.

Have a look at the SVN archive for Rcpp on R-Forge and at Rcpp 0.7.1 which
should be on CRAN "soon" (within days).  You may like what Romain added as
class RObject; there is more coming for environments etc, and external
pointers already got added.

That said, we are trying to maintain the fine balance of keeping the old
interface of Rcpp maintained (as it is being used in a number of places, and
as I feel quite strongly about maintaining published interfaces) yet add a
few 'interesting' things which Romain had been pushing real well for our use
in RProtoBuf and other places.  

There is also a mailing list for Rcpp-dev (with the usual subscribe-to-post)
tricks.

Dirk

-- 
Three out of two people have difficulties with fractions.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Romain Francois

On 01/02/2010 10:26 PM, Whit Armstrong wrote:


Romain,

Is the use of UNPROTECT_PTR discouraged?  I wonder why you haven't
considered using it instead.


that's not what I said. It just felt not as easy to use as just grab a 
SEXP and say R_PreserveObject( x ) and then later R_ReleaseObject.


I did not mean to discourage people from using it. It is probably very 
stable indeed since it is used all over the place in the parser, and 
since the parser is probably the most used code in the R source tree, we 
would know if it did not work properly.



I have a similar project that uses a ref counting scheme and handles
the deletion of the shared object with UNPROTECT_PTR.  This method has
been working fine, but if there are reasons I should not be using it,
I would certainly like to know.


From memory.c: /* "unprotect_ptr" remove pointer from somewhere in R_PPStack */


The code is very simple, it just walks backwards down the list until
it hits your object.  Unless you are adding an obscene number of items
to the protection stack in a given call, finding your object should be
very fast.

My example is here:
http://github.com/armstrtw/rabstraction/blob/master/R.backend.hpp

>

template
   Rbackend::~Rbackend() {
 if(release_data_) {
   if(R_object_!=R_NilValue) {
 UNPROTECT_PTR(R_object_);
   }
 }
   }


Thanks. I'll have a look.


I still think all these Rcpp projects should come together some day.
Dirk and I talked about combining about two years ago, but "real" work
got it the way.  I'll ping you off list for a follow up.


Dirk is proving very cooperative and the "Rcpp" package is under a lot 
of changes currently in its new Rcpp namespace, while keeping the 
current interface.


Our main class is Rcpp::RObject that wraps any SEXP into a C++ object 
that basically R_PreserveObject when it is created, and R_ReleaseObject 
when the object is destroyed.


deriving from "RObject", we have "Environment" to deal specifically with 
environments (ENVSXP) , "Symbol" to deal with symbols (SYMSXP), 
"Language" for calls (LANGSXP) and the template XPTr for external 
pointers. The rest (vectors, functions, ...) will follow.


I'd suggest using the Rcpp mailing list for a follow up. 
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel



-Whit


On Sat, Jan 2, 2010 at 12:24 PM, Romain Francois
  wrote:

On 01/02/2010 06:01 PM, Simon Urbanek wrote:



On Jan 2, 2010, at 5:07 AM, Romain Francois wrote:


Hello,

We are currently making lots of changes to Rcpp (see the open Rcpp
mailing list if interested [1] in the details).

We are now using [2] R_PreserveObject and R_ReleaseObject to manage
garbage collection instead of the PROTECT/UNPROTECT dance. This seems to
work well, but I was wondering if there was documentation about it.



I don't think so - the only documentation is the comment in the source.



Fair enough. FWIW, some mention of it in the R-ints or R-exts could be
valuable.



In particular, if we preserve the same SEXP twice (or more), should we
implement some sort of reference counting ?



Preserve/Release are for managing objects that are supposed to survive
past the call and are not tied to any other R object. PROTECT/UNPROTECT are
for temporary preservation within a call.

Although you're right that Preserve/Release is effectively implemented as
a stack at the moment it is not stated explicitly anywhere (this goes all
the way back to R 0.64 so chances are that only Ross can comment..).
However, for practical purposes it would be potentially dangerous to have it
work like a flag because you can simply never know whether the same object
was not already registered by some other code.



Reading the source (below, from memory.c) I think not, but some
confirmation would help.

void R_PreserveObject(SEXP object)
{
R_PreciousList = CONS(object, R_PreciousList);
}

static SEXP RecursiveRelease(SEXP object, SEXP list)
{
if (!isNull(list)) {
if (object == CAR(list))
return CDR(list);
else
CDR(list) = RecursiveRelease(object, CDR(list));
}
return list;
}

void R_ReleaseObject(SEXP object)
{
R_PreciousList =  RecursiveRelease(object, R_PreciousList);
}


I'd also be interested if there is some ideas on the relative efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.



PROTECT/UNPROTECT is more efficient because all it does is a pointer
assignment -- Preserve has to allocate new node and fill it with all parts.
On release the extra node is still floating in the GC pool etc.

Normally there is not really a question of choice - within a call you want
to use PROTET/UNPROTECT and for anything else you simply cannot use it so
you have to use Preserve/Release. As a side note Preserve/Release is merely
a convenience call, it is often more efficient to simply assign the object
to another object you have control of (which is all Preserve really does).

Cheers,
Simon


Thanks for this advice. Thi

Re: [Rd] R_PreserveObject, R_ReleaseObject : reference counting needed ?

2010-01-02 Thread Whit Armstrong
Romain,

> that's not what I said. It just felt not as easy to use as just grab a SEXP
> and say R_PreserveObject( x ) and then later R_ReleaseObject.

Didn't mean to suggest that you said that.  I was asking the list
whether it's discouraged.

Will follow up on the Rcpp list.

-Whit

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readCitationFile encoding

2010-01-02 Thread Jens Elkner
On Thu, Dec 31, 2009 at 02:50:00PM -0500, Simon Urbanek wrote:
> 
> On Dec 31, 2009, at 10:05 AM, Jens Elkner wrote:
> 
> > OT: Wondering, what does "recommended portable encoding names" in this
> > context mean (e.g. AFAIK 'latin1' is not a standardized name ...).
> 
> It is in R -- since no standard name for that encoding exists in general R 
> maps "latin1" to whatever the OS expects this to be -- see ?Encoding.
  
Ahh - ok. So since our staff, student, guests are using all kind of OS
(i.e. win, solaris, linux, macosx) it is better to recommend
ISO8859-15/UTF-8 instead of latin*, understood.
  
> > Ahh - good hint - help("check") wasn't really fruitful. However,
> > 
> > elkner.idev2 build/R-2.10.1/bin > ./check --help
> > Can't locate File/Copy/Recursive.pm in @INC (@INC contains: ...
> > 
> > So still need to do a little bit tweaking ...
> > 
> 
> .. rather reading the docs ;).

Hmm - think I already skimmed a lot of the onine docs. But at least wrt.
check everything is for my taste quite shallow ... ;-(

> It's R CMD check --help (since we're on R-devel knowledge of basics is 
> assumed).

Well - it doesn't tell the details as well...

BTW: the bad infinite recursion fails for 2.10.1 2009-12-14 patched (but
not in the unpatched version):

running code in 'ok-errors.R' ...*** Error code 1
The following command caused the error:
LANGUAGE=en LC_ALL=C SRCDIR=. R_DEFAULT_PACKAGES= ../bin/R --vanilla < 
ok-errors.R > ok-errors.Rout 2>&1 || (mv ok-errors.Rout ok-errors.Rout.fail && 
exit 1)
make: Fatal error: Command failed for target `ok-errors.Rout'
Current working directory /export/scratch/elkner/build/R-2.10.1/tests
*** Error code 1

Without optimization

> ## bad infinite recursion / on.exit / ... interactions
> bar <- function() 1+1
> foo <- function() { on.exit(bar()); foo() }
> foo() # now simple "infinite recursion"
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: C stack usage is too close to the limit
Error: segfault from C stack overflow
Execution halted

With optimization (-xO5 -xautopar)
--
> ## bad infinite recursion / on.exit / ... interactions
> bar <- function() 1+1
> foo <- function() { on.exit(bar()); foo() }
> foo() # now simple "infinite recursion"

 *** caught segfault ***
address fd7fff3fff90, cause 'memory not mapped'

Traceback:
 1: foo()
...
2489: foo()
aborting ...

With optimization (-xO5 -xautopar) but unpatched 2.10.1
---
> ## bad infinite recursion / on.exit / ... interactions
> bar <- function() 1+1
> foo <- function() { on.exit(bar()); foo() }
> foo() # now simple "infinite recursion"
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
>

Regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Duncan Murdoch

On 02/01/2010 3:16 PM, Laurent Gautier wrote:

On 1/2/10 8:53 PM, Duncan Murdoch wrote:

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

(...)


I'd also be interested if there is some ideas on the relative
efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.

PROTECT/UNPROTECT is trading granularity for speed. It is a stack with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack

UNPROTECT_PTR is also possible, which does a linear search through the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.

I would guess that UNPROTECT_PTR is more efficient than
RecursiveRelease
because it doesn't use so much stack space when it needs to go deep
into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.

Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too
much of a problem. Still, why isn't the tail recursion explicitly
made an iteration then ? This would take the "may be the compiler
figures it out, may be not" variable out of the equation.


Careful - the protection stack (bookkeeping by R) has nothing to do
with the C function call stack hence it has nothing to do with the
compiler. The protection stack is global so usually you don't run out
of it unless something goes horribly wrong (=infinite loop).

I think Laurent was referring to RecursiveRelease, which could use a lot
of C stack if you choose to release something that is deep in the list,
since it checks the head, and if that doesn't match, calls itself again
on the rest of the list. (I checked, and at least one version of gcc
doesn't recognize the tail recursion: it really does generate a
recursive call.)

Laurent asked why it isn't optimized to avoid the recursion: I think the
answer is simply because it is so rarely used that nobody has bothered.


Yes, I was referring to RecursiveRelease. Sorry if this was not clear.

What are the chances for a patch to be accepted ? At first sight(*), 
making that tail recursion an iterative function is not a major 
undertaking, and reviewing the patch be fairly straightforward... but I 
can always use that time otherwise if the answer to the question is "nil".


I don't think I would want to review such a patch (I don't know the 
memory manager well, I don't know that there is really a case where it 
matters enough to be worth doing), so I'd say if you don't get a message 
from a core member volunteering to do so, you should assume it won't be 
accepted.  But that doesn't mean you shouldn't write the code for your 
own internal use and edification, and if you can put together a demo 
that shows it really makes a big difference in a realistic situation, 
you might get a different response.


Duncan Murdoch




L.




Duncan Murdoch

Another possibility is to maintain your own list or environment of
objects, and just protect/preserve the list as a whole.

Interesting idea, this would let one perform his/her own bookkeeping
on the list/environment. How is R garbage collection checking
contained objects ? (I am thinking performances here, and may be
cyclic references).


You don't really want to care because the GC is global for all objects
(and cycles are supported by the GC in R) - so whether you keep it
yourself or Preserve/Release is practically irrelevant (the protection
stack is handled separately).

As for keeping your own list -- if you really care about performance
that much (to be honest in vast majority of cases it doesn't matter)
you have to be careful how you implement it. Technically the fastest
way is preallocated generic vector but it really depends on how you
deal with the access so you can easily perform worse than
Preserve/Release if you're not careful.

As a side note - the best way (IMHO) to deal with all those issues is
to use external pointers because a) you get very efficient C
finalizers b) you can directly (and very efficiently) tie lifespan of
other objects to the same SEXP and c) as guardians they can nicely
track other objects that hold them.

Cheers,
Simon




L.



Duncan Murdoch


HTH,


L.





Thanks,

Romain

[1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
[2]
http://r-forge.r-project.org/plugins/scmsvn/v

Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Romain Francois

On 01/02/2010 11:12 PM, Duncan Murdoch wrote:


On 02/01/2010 3:16 PM, Laurent Gautier wrote:

On 1/2/10 8:53 PM, Duncan Murdoch wrote:

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading
the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

(...)


I'd also be interested if there is some ideas on the relative
efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.

PROTECT/UNPROTECT is trading granularity for speed. It is a stack
with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack

UNPROTECT_PTR is also possible, which does a linear search through
the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.

I would guess that UNPROTECT_PTR is more efficient than
RecursiveRelease
because it doesn't use so much stack space when it needs to go deep
into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In
that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really
the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.

Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too
much of a problem. Still, why isn't the tail recursion explicitly
made an iteration then ? This would take the "may be the compiler
figures it out, may be not" variable out of the equation.


Careful - the protection stack (bookkeeping by R) has nothing to do
with the C function call stack hence it has nothing to do with the
compiler. The protection stack is global so usually you don't run out
of it unless something goes horribly wrong (=infinite loop).

I think Laurent was referring to RecursiveRelease, which could use a lot
of C stack if you choose to release something that is deep in the list,
since it checks the head, and if that doesn't match, calls itself again
on the rest of the list. (I checked, and at least one version of gcc
doesn't recognize the tail recursion: it really does generate a
recursive call.)

Laurent asked why it isn't optimized to avoid the recursion: I think the
answer is simply because it is so rarely used that nobody has bothered.


Yes, I was referring to RecursiveRelease. Sorry if this was not clear.

What are the chances for a patch to be accepted ? At first sight(*),
making that tail recursion an iterative function is not a major
undertaking, and reviewing the patch be fairly straightforward... but
I can always use that time otherwise if the answer to the question is
"nil".


I don't think I would want to review such a patch (I don't know the
memory manager well, I don't know that there is really a case where it
matters enough to be worth doing), so I'd say if you don't get a message
from a core member volunteering to do so, you should assume it won't be
accepted. But that doesn't mean you shouldn't write the code for your
own internal use and edification, and if you can put together a demo
that shows it really makes a big difference in a realistic situation,
you might get a different response.

Duncan Murdoch


From what I understand, this has little to do with the memory manager, 
and resumes to the simple problem "how to remove an object from a list".


Something like this, using the amazing inline and inspect packages:

require( inline )
require( inspect )

remover <- cfunction(signature( list = "language", object = 
"environment" ), '

if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object && CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;
', Rcpp=FALSE, verbose=FALSE )

e <- new.env()
call <- call( "foo", e, e, 1:10, 3 )
call
# inspect( call )
result <- remover( call ,e )
result
# inspect( result )

gives this :

foo(10, , 0, , 1) 

@0x9f4e0d0 06 LANGSXP [NAM(2)] 

  @0x9f4e204 01 SYMSXP [] "foo" 

  @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056) 

  @0x9f4d564 04 ENVSXP [NAM(1)] 

  FRAME: 

@0x9e94a10 00 NILSXP [MARK,NAM(2)] 


  ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
  HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
  @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
  @0x9f4d564 04 ENVSXP [NAM(1)]
  FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
  ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
  HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(

Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Romain Francois

On 01/02/2010 11:41 PM, Romain Francois wrote:


On 01/02/2010 11:12 PM, Duncan Murdoch wrote:


On 02/01/2010 3:16 PM, Laurent Gautier wrote:

On 1/2/10 8:53 PM, Duncan Murdoch wrote:

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading
the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

(...)


I'd also be interested if there is some ideas on the relative
efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.

PROTECT/UNPROTECT is trading granularity for speed. It is a stack
with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack

UNPROTECT_PTR is also possible, which does a linear search through
the
stack and unprotects something possibly deep within it. There is
also
REPROTECT which allows you to replace an entry within the stack.

I would guess that UNPROTECT_PTR is more efficient than
RecursiveRelease
because it doesn't use so much stack space when it needs to go deep
into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In
that
case it could be more efficient than UNPROTECT_PTR, which has to
move
all the other entries down to fill the newly vacated space. Really
the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.

Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too
much of a problem. Still, why isn't the tail recursion explicitly
made an iteration then ? This would take the "may be the compiler
figures it out, may be not" variable out of the equation.


Careful - the protection stack (bookkeeping by R) has nothing to do
with the C function call stack hence it has nothing to do with the
compiler. The protection stack is global so usually you don't run out
of it unless something goes horribly wrong (=infinite loop).

I think Laurent was referring to RecursiveRelease, which could use a
lot
of C stack if you choose to release something that is deep in the list,
since it checks the head, and if that doesn't match, calls itself again
on the rest of the list. (I checked, and at least one version of gcc
doesn't recognize the tail recursion: it really does generate a
recursive call.)

Laurent asked why it isn't optimized to avoid the recursion: I think
the
answer is simply because it is so rarely used that nobody has bothered.


Yes, I was referring to RecursiveRelease. Sorry if this was not clear.

What are the chances for a patch to be accepted ? At first sight(*),
making that tail recursion an iterative function is not a major
undertaking, and reviewing the patch be fairly straightforward... but
I can always use that time otherwise if the answer to the question is
"nil".


I don't think I would want to review such a patch (I don't know the
memory manager well, I don't know that there is really a case where it
matters enough to be worth doing), so I'd say if you don't get a message
from a core member volunteering to do so, you should assume it won't be
accepted. But that doesn't mean you shouldn't write the code for your
own internal use and edification, and if you can put together a demo
that shows it really makes a big difference in a realistic situation,
you might get a different response.

Duncan Murdoch


 From what I understand, this has little to do with the memory manager,
and resumes to the simple problem "how to remove an object from a list".

Something like this, using the amazing inline and inspect packages:

require( inline )
require( inspect )

remover <- cfunction(signature( list = "language", object =
"environment" ), '
if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object && CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;
', Rcpp=FALSE, verbose=FALSE )

e <- new.env()
call <- call( "foo", e, e, 1:10, 3 )
call
# inspect( call )
result <- remover( call ,e )
result
# inspect( result )

gives this :

foo(10, , 0, , 1)
@0x9f4e0d0 06 LANGSXP [NAM(2)]
@0x9f4e204 01 SYMSXP [] "foo"
@0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056)
@0x9f4d564 04 ENVSXP [NAM(1)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
@0x9f4d564 04 ENVSXP [NAM(1)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709)
NULL
==
foo(10, 0, , 1)
@0x9f4e0d0 06 LANGSXP [NAM(2)]
@0x9f4e204 01 SYMSXP [] "foo"
@0xa907b7

Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Laurent Gautier

On 1/2/10 11:41 PM, Romain Francois wrote:

On 01/02/2010 11:12 PM, Duncan Murdoch wrote:


On 02/01/2010 3:16 PM, Laurent Gautier wrote:

On 1/2/10 8:53 PM, Duncan Murdoch wrote:

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading
the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

(...)

(...)


I don't think I would want to review such a patch (I don't know the
memory manager well, I don't know that there is really a case where it
matters enough to be worth doing), so I'd say if you don't get a message
from a core member volunteering to do so, you should assume it won't be
accepted. But that doesn't mean you shouldn't write the code for your
own internal use and edification, and if you can put together a demo
that shows it really makes a big difference in a realistic situation,
you might get a different response.

Duncan Murdoch


 From what I understand, this has little to do with the memory manager,
and resumes to the simple problem "how to remove an object from a list".


I would generalize even further, up to: "how to make a tail recursion 
into an interation" (and that often boils down to writing a for loop - 
little edification to get from it, I suppose... one probably wants to 
write it to see it included).


I see below that you are living up to the enthusiasm claimed, and had a 
shot at it (and even did benchmark to demonstrate the expected benefit - 
impressive).


Let's see what happens with it...


L.


Something like this, using the amazing inline and inspect packages:

require( inline )
require( inspect )

remover <- cfunction(signature( list = "language", object =
"environment" ), '
if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object && CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;
', Rcpp=FALSE, verbose=FALSE )

e <- new.env()
call <- call( "foo", e, e, 1:10, 3 )
call
# inspect( call )
result <- remover( call ,e )
result
# inspect( result )

gives this :

foo(10, , 0, , 1)
@0x9f4e0d0 06 LANGSXP [NAM(2)]
@0x9f4e204 01 SYMSXP [] "foo"
@0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056)
@0x9f4d564 04 ENVSXP [NAM(1)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
@0x9f4d564 04 ENVSXP [NAM(1)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709)
NULL
==
foo(10, 0, , 1)
@0x9f4e0d0 06 LANGSXP [NAM(2)]
@0x9f4e204 01 SYMSXP [] "foo"
@0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056)
@0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
@0x9f4d564 04 ENVSXP [NAM(2)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709)
NULL



so it boils down to this as a replacement to RecursiveRelease :

/**
* Removes the first instance of object with the list
*/
static SEXP RemoveFromList( SEXP object, SEXP list){
if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object && CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;
}


Romain




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel Digest, Vol 83, Issue 2

2010-01-02 Thread Duncan Murdoch

On 02/01/2010 6:35 PM, Laurent Gautier wrote:

On 1/2/10 11:41 PM, Romain Francois wrote:

On 01/02/2010 11:12 PM, Duncan Murdoch wrote:

On 02/01/2010 3:16 PM, Laurent Gautier wrote:

On 1/2/10 8:53 PM, Duncan Murdoch wrote:

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading
the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

(...)

(...)

I don't think I would want to review such a patch (I don't know the
memory manager well, I don't know that there is really a case where it
matters enough to be worth doing), so I'd say if you don't get a message
from a core member volunteering to do so, you should assume it won't be
accepted. But that doesn't mean you shouldn't write the code for your
own internal use and edification, and if you can put together a demo
that shows it really makes a big difference in a realistic situation,
you might get a different response.

Duncan Murdoch

 From what I understand, this has little to do with the memory manager,
and resumes to the simple problem "how to remove an object from a list".


I would generalize even further, up to: "how to make a tail recursion 
into an interation" (and that often boils down to writing a for loop - 
little edification to get from it, I suppose... one probably wants to 
write it to see it included).


Just one followup:  as Luke mentioned, I was wrong, it isn't really a 
tail recursion (which is why gcc didn't optimize it away).


Duncan Murdoch



I see below that you are living up to the enthusiasm claimed, and had a 
shot at it (and even did benchmark to demonstrate the expected benefit - 
impressive).


Let's see what happens with it...


L.


Something like this, using the amazing inline and inspect packages:

require( inline )
require( inspect )

remover <- cfunction(signature( list = "language", object =
"environment" ), '
if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object && CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;
', Rcpp=FALSE, verbose=FALSE )

e <- new.env()
call <- call( "foo", e, e, 1:10, 3 )
call
# inspect( call )
result <- remover( call ,e )
result
# inspect( result )

gives this :

foo(10, , 0, , 1)
@0x9f4e0d0 06 LANGSXP [NAM(2)]
@0x9f4e204 01 SYMSXP [] "foo"
@0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056)
@0x9f4d564 04 ENVSXP [NAM(1)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
@0x9f4d564 04 ENVSXP [NAM(1)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709)
NULL
==
foo(10, 0, , 1)
@0x9f4e0d0 06 LANGSXP [NAM(2)]
@0x9f4e204 01 SYMSXP [] "foo"
@0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056)
@0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
@0x9f4d564 04 ENVSXP [NAM(2)]
FRAME:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
ENCLOS:
@0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
HASHTAB:
@0x9e94a10 00 NILSXP [MARK,NAM(2)]
@0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709)
NULL



so it boils down to this as a replacement to RecursiveRelease :

/**
* Removes the first instance of object with the list
*/
static SEXP RemoveFromList( SEXP object, SEXP list){
if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object && CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;
}


Romain




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R object protection [Was: R-devel Digest, Vol 83, Issue 2]

2010-01-02 Thread Simon Urbanek
On Jan 2, 2010, at 4:08 PM, Laurent Gautier wrote:

> On 1/2/10 8:28 PM, Simon Urbanek wrote:
>> 
>> On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:
>> 
>>> On 1/2/10 5:56 PM, Duncan Murdoch wrote:
 On 02/01/2010 11:36 AM, Laurent Gautier wrote:
> [Disclaimer: what is below reflects my understanding from
> reading the R source, others will correct where deemed
> necessary]
> 
> On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:
>>> 
> (...)
> 
 Another possibility is to maintain your own list or environment
 of objects, and just protect/preserve the list as a whole.
>>> 
>>> Interesting idea, this would let one perform his/her own
>>> bookkeeping on the list/environment. How is R garbage collection
>>> checking contained objects ? (I am thinking performances here, and
>>> may be cyclic references).
>>> 
>> 
>> You don't really want to care because the GC is global for all
>> objects (and cycles are supported by the GC in R) - so whether you
>> keep it yourself or Preserve/Release is practically irrelevant (the
>> protection stack is handled separately).
> 
> I guess that I'll have to know in order to understand that I don't really 
> want to care. ;-)
> The garbage collector must somehow know if an object is available for 
> collection (and will have to check whether an object is PROTECTed or not, or 
> Preserved or not).
> I suppose that upon being called the garbage collector will first look into 
> the PROTECTed and Preserved objects, mark them as unavailble for collection, 
> then recursively mark objects contained in them.
> 

The GC marks recursively from all known roots of which Preserved list is one of 
many and all elements of the protection stack are treated as such as well (FWIW 
the Preserved and protected list are in that order the last two). Since this 
involves (by definition) all live objects it doesn't matter to which other 
object you assign the node. The only detail is that protection stack does not 
change the generation (since there is no real node to assign to).


>> As for keeping your own list -- if you really care about performance
>> that much (to be honest in vast majority of cases it doesn't matter)
>> you have to be careful how you implement it. Technically the fastest
>> way is preallocated generic vector but it really depends on how you
>> deal with the access so you can easily perform worse than
>> Preserve/Release if you're not careful.
> 
> Releasing being of linear complexity, having few thousands of Preserved 
> objects not being anticipated as an extraordinary situation, and 
> Preserve/Release cycles being quite frequent, I start minding a bit about the 
> performance. Keeping my own list would let me experiment with various 
> strategies (and eventually offer
> 

Sure - what I meant is that you have to optimize for one thing or the other so 
you have to be careful what you do.


>> As a side note - the best way (IMHO) to deal with all those issues is
>> to use external pointers because a) you get very efficient C
>> finalizers b) you can directly (and very efficiently) tie lifespan of
>> other objects to the same SEXP and c) as guardians they can nicely
>> track other objects that hold them.
> 
> Thanks. I am not certain to follow everything. Are you suggesting that rather 
> than Preserve-ing/Release-ing a list/environment that would act as a guardian 
> for several objects, one should use an external pointer (to an arbitrary C 
> pointer) ? In that case, how does one indicate that an external pointer acts 
> as a container ?
> 
> Or are you suggesting that rather than Preserve-in/Release-ing R objects one 
> should use an external pointer acting as a proxy for a SEXP (argument "prot" 
> in R_MakeExternalPtr(void *p, SEXP tag, SEXP prot) ) ?
> (but in that case the external pointer will itself have to go through 
> Preserve/Release cycles...)
> 

I was guessing that you use this in conjunction with some C++ magic not just 
plain R objects and thus you have to deal with two life spans. From the other 
messages I think you are dealing with the simple situation of wrapping an R 
object as reference in the other system with explicit memory management (i.e. 
in C++ you have explicit new/delete life cycle) in which case you really don't 
need anything more than Preserve. It is more interesting when you want to track 
the life of R objects without imposing the life span - i.e when you want to 
know when an object in R is collected such that you can delete it from the 
other system (i.e. you don't explicitly retain it by the reference).

Cheers,
Simon



 
> 
> 
> HTH,
> 
> 
> L.
> 
> 
> 
> 
>> Thanks,
>> 
>> Romain
>> 
>> [1]http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
>> [2]
>> http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/src/RObject.cpp?rev=255&root=rcpp&view=markup
>> 
>> 
>> 
>> 
> -- Romain Francois Professional R 

[Rd] inspect [Was: R-devel Digest, Vol 83, Issue 2]

2010-01-02 Thread Simon Urbanek

On Jan 2, 2010, at 5:41 PM, Romain Francois wrote:

> On 01/02/2010 11:12 PM, Duncan Murdoch wrote:
>> 
>> On 02/01/2010 3:16 PM, Laurent Gautier wrote:
>>> On 1/2/10 8:53 PM, Duncan Murdoch wrote:
 Simon Urbanek wrote:
> On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:
> 
>> On 1/2/10 5:56 PM, Duncan Murdoch wrote:
>>> On 02/01/2010 11:36 AM, Laurent Gautier wrote:
 [Disclaimer: what is below reflects my understanding from reading
 the
 R source, others will correct where deemed necessary]
 
 On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:
>> (...)
>> 
> I'd also be interested if there is some ideas on the relative
> efficiency
> of the preserve/release mechanism compared to PROTECT/UNPROTECT.
 PROTECT/UNPROTECT is trading granularity for speed. It is a stack
 with
 only tow operations possible:
 - push 1 object into the stack
 - pull (unprotect) N last objects from the stack
>>> UNPROTECT_PTR is also possible, which does a linear search through
>>> the
>>> stack and unprotects something possibly deep within it. There is also
>>> REPROTECT which allows you to replace an entry within the stack.
>>> 
>>> I would guess that UNPROTECT_PTR is more efficient than
>>> RecursiveRelease
>>> because it doesn't use so much stack space when it needs to go deep
>>> into
>>> the stack to release, but it is possible the compiler recognizes the
>>> tail recursion and RecursiveRelease is implemented efficiently. In
>>> that
>>> case it could be more efficient than UNPROTECT_PTR, which has to move
>>> all the other entries down to fill the newly vacated space. Really
>>> the
>>> only reliable way to answer efficiency questions like this is to try
>>> both ways and see which works better in your application.
>> Thanks. I did not know about UNPROTECT_PTR.
>> I had concerns over the stack usage, but so far it did not prove too
>> much of a problem. Still, why isn't the tail recursion explicitly
>> made an iteration then ? This would take the "may be the compiler
>> figures it out, may be not" variable out of the equation.
>> 
> Careful - the protection stack (bookkeeping by R) has nothing to do
> with the C function call stack hence it has nothing to do with the
> compiler. The protection stack is global so usually you don't run out
> of it unless something goes horribly wrong (=infinite loop).
 I think Laurent was referring to RecursiveRelease, which could use a lot
 of C stack if you choose to release something that is deep in the list,
 since it checks the head, and if that doesn't match, calls itself again
 on the rest of the list. (I checked, and at least one version of gcc
 doesn't recognize the tail recursion: it really does generate a
 recursive call.)
 
 Laurent asked why it isn't optimized to avoid the recursion: I think the
 answer is simply because it is so rarely used that nobody has bothered.
>>> 
>>> Yes, I was referring to RecursiveRelease. Sorry if this was not clear.
>>> 
>>> What are the chances for a patch to be accepted ? At first sight(*),
>>> making that tail recursion an iterative function is not a major
>>> undertaking, and reviewing the patch be fairly straightforward... but
>>> I can always use that time otherwise if the answer to the question is
>>> "nil".
>> 
>> I don't think I would want to review such a patch (I don't know the
>> memory manager well, I don't know that there is really a case where it
>> matters enough to be worth doing), so I'd say if you don't get a message
>> from a core member volunteering to do so, you should assume it won't be
>> accepted. But that doesn't mean you shouldn't write the code for your
>> own internal use and edification, and if you can put together a demo
>> that shows it really makes a big difference in a realistic situation,
>> you might get a different response.
>> 
>> Duncan Murdoch
> 
> From what I understand, this has little to do with the memory manager, and 
> resumes to the simple problem "how to remove an object from a list".
> 
> Something like this, using the amazing inline and inspect packages:
> 
> require( inline )
> require( inspect )
> 

FWIW inspect is now part of R itself but as an internal function so you can 
either use it directly via .Internal or for compatibility
inspect <- function(...) .Internal(inspect(...))
the arguments are (x, max.depth=-1, max.elements=5) [the last one is only 
supported in R-devel].

Cheers,
Simon


> remover <- cfunction(signature( list = "language", object = "environment" ), '
>   if( !isNull( list ) ){
>   SEXP x = list ;
>   SEXP y ;
>   while( CAR(x) != object && CADR(x) != R_NilValue ){
>   y = x ;
>   x = CDR(x) ;
>   }
>   

[Rd] iterative list rm [Was: R-devel Digest, Vol 83, Issue 2]

2010-01-02 Thread Simon Urbanek

On Jan 2, 2010, at 5:41 PM, Romain Francois wrote:

> On 01/02/2010 11:12 PM, Duncan Murdoch wrote:
>> 
>> On 02/01/2010 3:16 PM, Laurent Gautier wrote:
>>> On 1/2/10 8:53 PM, Duncan Murdoch wrote:
 Simon Urbanek wrote:
> On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:
> 
>> On 1/2/10 5:56 PM, Duncan Murdoch wrote:
>>> On 02/01/2010 11:36 AM, Laurent Gautier wrote:
 [Disclaimer: what is below reflects my understanding from reading
 the
 R source, others will correct where deemed necessary]
 
 On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:
>> (...)
>> 
> I'd also be interested if there is some ideas on the relative
> efficiency
> of the preserve/release mechanism compared to PROTECT/UNPROTECT.
 PROTECT/UNPROTECT is trading granularity for speed. It is a stack
 with
 only tow operations possible:
 - push 1 object into the stack
 - pull (unprotect) N last objects from the stack
>>> UNPROTECT_PTR is also possible, which does a linear search through
>>> the
>>> stack and unprotects something possibly deep within it. There is also
>>> REPROTECT which allows you to replace an entry within the stack.
>>> 
>>> I would guess that UNPROTECT_PTR is more efficient than
>>> RecursiveRelease
>>> because it doesn't use so much stack space when it needs to go deep
>>> into
>>> the stack to release, but it is possible the compiler recognizes the
>>> tail recursion and RecursiveRelease is implemented efficiently. In
>>> that
>>> case it could be more efficient than UNPROTECT_PTR, which has to move
>>> all the other entries down to fill the newly vacated space. Really
>>> the
>>> only reliable way to answer efficiency questions like this is to try
>>> both ways and see which works better in your application.
>> Thanks. I did not know about UNPROTECT_PTR.
>> I had concerns over the stack usage, but so far it did not prove too
>> much of a problem. Still, why isn't the tail recursion explicitly
>> made an iteration then ? This would take the "may be the compiler
>> figures it out, may be not" variable out of the equation.
>> 
> Careful - the protection stack (bookkeeping by R) has nothing to do
> with the C function call stack hence it has nothing to do with the
> compiler. The protection stack is global so usually you don't run out
> of it unless something goes horribly wrong (=infinite loop).
 I think Laurent was referring to RecursiveRelease, which could use a lot
 of C stack if you choose to release something that is deep in the list,
 since it checks the head, and if that doesn't match, calls itself again
 on the rest of the list. (I checked, and at least one version of gcc
 doesn't recognize the tail recursion: it really does generate a
 recursive call.)
 
 Laurent asked why it isn't optimized to avoid the recursion: I think the
 answer is simply because it is so rarely used that nobody has bothered.
>>> 
>>> Yes, I was referring to RecursiveRelease. Sorry if this was not clear.
>>> 
>>> What are the chances for a patch to be accepted ? At first sight(*),
>>> making that tail recursion an iterative function is not a major
>>> undertaking, and reviewing the patch be fairly straightforward... but
>>> I can always use that time otherwise if the answer to the question is
>>> "nil".
>> 
>> I don't think I would want to review such a patch (I don't know the
>> memory manager well, I don't know that there is really a case where it
>> matters enough to be worth doing), so I'd say if you don't get a message
>> from a core member volunteering to do so, you should assume it won't be
>> accepted. But that doesn't mean you shouldn't write the code for your
>> own internal use and edification, and if you can put together a demo
>> that shows it really makes a big difference in a realistic situation,
>> you might get a different response.
>> 
>> Duncan Murdoch
> 
> From what I understand, this has little to do with the memory manager, and 
> resumes to the simple problem "how to remove an object from a list".
> 
> Something like this, using the amazing inline and inspect packages:
> 
> require( inline )
> require( inspect )
> 
> remover <- cfunction(signature( list = "language", object = "environment" ), '
>   if( !isNull( list ) ){
>   SEXP x = list ;
>   SEXP y ;
>   while( CAR(x) != object && CADR(x) != R_NilValue ){
>   y = x ;
>   x = CDR(x) ;
>   }
>   if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
>   }
>   return list ;

That blows up if the object is the first in the list BTW. You probably meant 
more something like

static SEXP ReleaseObj(SEXP object, SEXP list) {
if (!isNull(list)) {
if (CAR(list) != object) {

Re: [Rd] iterative list rm [Was: R-devel Digest, Vol 83, Issue 2]

2010-01-02 Thread Romain Francois

On 01/03/2010 02:04 AM, Simon Urbanek wrote:



On Jan 2, 2010, at 5:41 PM, Romain Francois wrote:


On 01/02/2010 11:12 PM, Duncan Murdoch wrote:


On 02/01/2010 3:16 PM, Laurent Gautier wrote:

On 1/2/10 8:53 PM, Duncan Murdoch wrote:

Simon Urbanek wrote:

On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:


On 1/2/10 5:56 PM, Duncan Murdoch wrote:

On 02/01/2010 11:36 AM, Laurent Gautier wrote:

[Disclaimer: what is below reflects my understanding from reading
the
R source, others will correct where deemed necessary]

On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:

(...)


I'd also be interested if there is some ideas on the relative
efficiency
of the preserve/release mechanism compared to PROTECT/UNPROTECT.

PROTECT/UNPROTECT is trading granularity for speed. It is a stack
with
only tow operations possible:
- push 1 object into the stack
- pull (unprotect) N last objects from the stack

UNPROTECT_PTR is also possible, which does a linear search through
the
stack and unprotects something possibly deep within it. There is also
REPROTECT which allows you to replace an entry within the stack.

I would guess that UNPROTECT_PTR is more efficient than
RecursiveRelease
because it doesn't use so much stack space when it needs to go deep
into
the stack to release, but it is possible the compiler recognizes the
tail recursion and RecursiveRelease is implemented efficiently. In
that
case it could be more efficient than UNPROTECT_PTR, which has to move
all the other entries down to fill the newly vacated space. Really
the
only reliable way to answer efficiency questions like this is to try
both ways and see which works better in your application.

Thanks. I did not know about UNPROTECT_PTR.
I had concerns over the stack usage, but so far it did not prove too
much of a problem. Still, why isn't the tail recursion explicitly
made an iteration then ? This would take the "may be the compiler
figures it out, may be not" variable out of the equation.


Careful - the protection stack (bookkeeping by R) has nothing to do
with the C function call stack hence it has nothing to do with the
compiler. The protection stack is global so usually you don't run out
of it unless something goes horribly wrong (=infinite loop).

I think Laurent was referring to RecursiveRelease, which could use a lot
of C stack if you choose to release something that is deep in the list,
since it checks the head, and if that doesn't match, calls itself again
on the rest of the list. (I checked, and at least one version of gcc
doesn't recognize the tail recursion: it really does generate a
recursive call.)

Laurent asked why it isn't optimized to avoid the recursion: I think the
answer is simply because it is so rarely used that nobody has bothered.


Yes, I was referring to RecursiveRelease. Sorry if this was not clear.

What are the chances for a patch to be accepted ? At first sight(*),
making that tail recursion an iterative function is not a major
undertaking, and reviewing the patch be fairly straightforward... but
I can always use that time otherwise if the answer to the question is
"nil".


I don't think I would want to review such a patch (I don't know the
memory manager well, I don't know that there is really a case where it
matters enough to be worth doing), so I'd say if you don't get a message
from a core member volunteering to do so, you should assume it won't be
accepted. But that doesn't mean you shouldn't write the code for your
own internal use and edification, and if you can put together a demo
that shows it really makes a big difference in a realistic situation,
you might get a different response.

Duncan Murdoch


 From what I understand, this has little to do with the memory manager, and resumes to 
the simple problem "how to remove an object from a list".

Something like this, using the amazing inline and inspect packages:

require( inline )
require( inspect )

remover<- cfunction(signature( list = "language", object = "environment" ), '
if( !isNull( list ) ){
SEXP x = list ;
SEXP y ;
while( CAR(x) != object&&  CADR(x) != R_NilValue ){
y = x ;
x = CDR(x) ;
}
if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
}
return list ;


That blows up if the object is the first in the list BTW. You probably meant 
more something like

static SEXP ReleaseObj(SEXP object, SEXP list) {
if (!isNull(list)) {
if (CAR(list) != object) {
SEXP c = list;
while (!isNull(CDR(c))) {
if (CADR(c) == object) {
CDR(c) = CDDR(c);
break;
}
c = CDR(c);
}
} else return CDR(list);
}
return list;
}