On 03/27/2018 11:53 AM, Iñaki Úcar wrote:
2018-03-27 11:11 GMT+02:00 Tomas Kalibera <tomas.kalib...@gmail.com>:
On 03/27/2018 09:51 AM, Iñaki Úcar wrote:
2018-03-27 6:02 GMT+02:00  <luke-tier...@uiowa.edu>:
This has nothing to do with printing or dispatch per se. It is the
result of an internal register (R_ReturnedValue) being protected. It
gets rewritten whenever there is a jump, e.g. by an explicit return
call. So a simplified example is

new_foo <- function() {
    e <- new.env()
    reg.finalizer(e, function(e) message("Finalizer called"))
    e
}

bar <- function(x) return(x)

bar(new_foo())
gc() # still in .Last.value
gc() # nothing

UseMethod essentially does a return call so you see the effect there.
Understood. Thanks for the explanation, Luke.

The R_ReturnedValue register could probably be safely cleared in more
places but it isn't clear exactly where. As things stand it will be
cleared on the next use of a non-local transfer of control, and those
happen frequently enough that I'm not convinced this is worth
addressing, at least not at this point in the release cycle.
I barely know the R internals, and I'm sure there's a good reason
behind this change (R 3.2.3 does not show this behaviour), but IMHO
it's, at the very least, confusing. When .Last.value is cleared, that
object loses the last reference, and I'd expect it to be eligible for
gc.

In my case, I was using an object that internally generates a bunch of
data. I discovered this because I was benchmarking the execution, and
I was running out of memory because the memory wasn't being freed as it
was supposed to. So I spent half of the day on this because I thought
I had a memory leak. :-\ (Not blaming anyone here, of course; just
making a case to show that this may be worth addressing at some
point). :-)
From the perspective of the R user/programmer/package developer, please do
not make any assumptions on when finalizers will be run, only that they
indeed won't be run when the object is still alive. Similarly, it is not
good to make any assumptions that "gc()" will actually run a collection (or
a particular type of collection, or that it will run immediately, etc.). Such
guarantees would restrict the design space and potential optimizations on
the R internals side too much - and for this reason are typically
not given in other managed languages, either. I've seen R examples where
most of the time was wasted tracing live objects because an explicit "gc()"
was run in a tight loop. Note that in Java, for instance, an explicit call to
gc() was eventually turned into a hint only.

Once you start debugging when objects are collected, you are debugging R
internals - and surprises/changes between svn versions/etc should be
expected as well as changes in behavior caused very indirectly by code
changes somewhere else. I work on R internals and spend most of my time
debugging - that is unfortunately normal when you work on a language
runtime. Indeed, the runtime should try not to keep references to objects
for too long, but it remains to be seen whether and for what cost this could
be fixed with R_ReturnedValue.
To be precise, I was not debugging *when* objects were collected, I
was debugging *whether* objects were collected. And for that, I
necessarily need some hint about the *when*.
They would be collected eventually if you were running a non-trivial program (because there would be a jump inside).
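
A minimal sketch of that point, assuming the register behaves as Luke describes above (it is rewritten on any jump, such as an explicit return call). The helper name clear_reg is hypothetical, and the outcome may differ between R versions, since none of this is documented behavior:

```r
new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  e
}

bar <- function(x) return(x)

# Hypothetical helper: its explicit return() is a jump, which (per the
# explanation above) should rewrite R_ReturnedValue with NULL.
clear_reg <- function() return(NULL)

bar(new_foo())
clear_reg()  # replaces .Last.value and, presumably, the internal register
gc()         # the environment may now be collected, here or on a later gc()
```

Any later call that performs a jump would play the same role as clear_reg() here, which is why a non-trivial program would not hold on to the object indefinitely.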
But I think that's another discussion. My point is that, as an R user
and package developer, I expect consistency, and currently

new_foo <- function() {
   e <- new.env()
   reg.finalizer(e, function(e) message("Finalizer called"))
   e
}

bar <- function(x) return(x)

bar(new_foo())
gc() # still in .Last.value
gc() # nothing

behaves differently than

new_foo <- function() {
   e <- new.env()
   reg.finalizer(e, function(e) message("Finalizer called"))
   e
}

bar <- function(x) x

bar(new_foo())
gc() # still in .Last.value
gc() # Finalizer called!

And such a difference is not explained (AFAIK) in the documentation.
At least the help page for 'return' does not make me think that I
should not expect exactly the same behaviour if I write (or not) an
explicit 'return'.
As an R user and package developer, you should have consistency in _documented_ behavior. If not, it is a bug and has to be fixed either in the documentation or in the code. You should never depend on undocumented behavior, because that can change at any time. You cannot expect different versions of R to behave exactly the same, not even the svn versions. That is not possible, and would not be possible even if we did not change any code in the R implementation, because even the OS, C compiler, hardware, and third-party libraries have their specified and unspecified behavior.

Best
Tomas

Regards,
Iñaki

Best
Tomas


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
