> -----Original Message-----
> From: William Dunlap
> Sent: Wednesday, June 10, 2009 2:35 PM
> To: '[email protected]'
> Cc: [email protected]
> Subject: RE: [Rd] reference counting bug related to break and
> next in loops
>
> Thanks Luke.
>
> I used codetools::walkCode to look for functions that returned the
> value of a loop and found a surprising number in base R.
> However, it looks like non of these functions were written to return
> anything useful (the side effects were important), so the change
> to loop-returns-NULL should let such functions use less
> memory (and time) and make them correspond better to their
> documentation.
I also use walkCode to look for assignents whose right hand sides
were loops. The only *.R file in a directory containing a current R
build that had such a thing was
./library/base/R-ex/print.R
(from help(print)) and related files in ./tests. It had
rr <- for(i in 1:3) print(1:i)
rr
That last line used to print
[1] 1 2 3
and now prints
NULL
I didn't notice a corresponding *.Rout.save file so the test won't
notice the change in behavior.
I've attached the code I used. It is pretty crude.
functionReturnsLoopValue("file")
and functionAssignsLoopValue("file") parse "file" and then return a list
of functions
that return the value of a loop and a list of assignments involving the
value of a loop,
respectively. functionAssignsLoopValue looks for ordinary assignments
(x<-while(...))
and for default argument values (function(x=while(...)){}). It does not
yet look for
things passed as arguments in function calls, c(1, while(x>0)x<-x-1,
3). The last
might pick up some false positives involving functions that don't
evaluate their
arguments.
>
> E.g., fixup.libraries.URLs is not documented to return anything
> and it returns the value of for loop:
>
> $`src/library/utils/R/windows/linkhtml.R`[[2]]
> function(lib.loc = .libPaths())
> {
> for (lib in lib.loc) {
> if(lib == .Library) next
> pg <- sort(.packages(all.available = TRUE, lib.loc = lib))
> for(pkg in pg)
> try(fixup.package.URLs(file.path(lib,pkg)), silent = TRUE)
> }
> }
>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]
> > Sent: Wednesday, June 10, 2009 1:05 PM
> > To: William Dunlap
> > Cc: [email protected]
> > Subject: Re: [Rd] reference counting bug related to break and
> > next in loops
> >
> > Thanks for the report.
> >
> > It turns out that a similar issue arises in while() loops without
> > break/next being involved because the test expression is evaluated
> > after the final body evaluation. After some discussion we
> decided it
> > was simplest both for implementation and documentation to have the
> > value of a loop expression always be NULL. This is now
> implemented in
> > R-devel.
> >
> > luke
> >
> > On Tue, 2 Jun 2009, William Dunlap wrote:
> >
> > > One of our R users here just showed me the following problem while
> > > investigating the return value of a while loop. I added some
> > > information
> > > on a similar bug in for loops. I think he was using 2.9.0
> > > but I see the same problem on today's development version
> of 2.10.0
> > > (svn 48703).
> > >
> > > Should the semantics of while and for loops be changed
> > slightly to avoid
> > > the memory
> > > buildup that fixing this to reflect the current docs would
> > entail? S+'s
> > > loops return nothing useful - that change was made long
> ago to avoid
> > > memory buildup resulting from semantics akin the R's
> > present semantics.
> > >
> > > Bill Dunlap
> > > TIBCO Software Inc - Spotfire Division
> > > wdunlap tibco.com
> > >
> > > --------------------Forwarded (and edited) message
> > >
> > below---------------------------------------------------------
> > ----------
> > > ----------
> > >
> > > I think I have found another reference counting bug.
> > >
> > > If you type in the following in R you get what I think is
> the wrong
> > > result.
> > >
> > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) {
> > break }; i =
> > > i + 1; y}; q
> > > [1] 42 42 42 42 42 42 42 42 9 10
> > >
> > > I had expected [1] 42 42 42 42 42 42 42 8 9 10 which is
> > what you get
> > > if you add 0 to y in the last statement in the while loop:
> > >
> > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) {
> > break }; i =
> > > i + 1; y + 0}; q
> > > [1] 42 42 42 42 42 42 42 8 9 10
> > >
> > > Also,
> > >
> > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break };
> > > i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration",
> > i, "\n"); y};
> > > q
> > > Completing iteration 2
> > > Completing iteration 3
> > > [1] 42 42 42 42 42 42 42 42 9 10
> > >
> > > but if the last statement in the while loop is y+0 instead
> > of y I get
> > > the
> > > expected result:
> > >
> > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break };
> > > i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration",
> i, "\n");
> > > y+0L}; q
> > > Completing iteration 2
> > > Completing iteration 3
> > > [1] 42 42 3 4 5 6 7 8 9 10
> > >
> > > A background to the problem is that in R a while-loop
> > returns the value
> > > of the last iteration. However there is an exception if an
> > iteration is
> > > terminated by a break or a next. Then the value is the
> value of the
> > > previously completed iteration that did not execute a
> break or next.
> > > Thus in an extreme case the value of the while may be the
> > value of the
> > > very first iteration even though it executed a million iterations.
> > >
> > > Thus to implement that correctly one needs to keep a
> > reference to the
> > > value of the last non-terminated iteration. It seems as if
> > the current R
> > > implementation does that but does not increase the
> reference counter
> > > which explains the odd behavior.
> > >
> > > The for loop example is
> > >
> > >> z<-{ tmp<-rep(pi,10);for(i in 1:10){
> tmp[i]<-i^2;if(i==9)break ; if
> > > (i<9&&i>3)next ; tmp } }
> > >> z
> > > [1] 1.000000 4.000000 9.000000 16.000000 25.000000 36.000000
> > > 49.000000
> > > [8] 64.000000 81.000000 3.141593
> > >> z<-{ tmp<-rep(pi,10);for(i in 1:10){
> tmp[i]<-i^2;if(i==9)break ; if
> > > (i<9&&i>3)next ; tmp+0 } }
> > >> z
> > > [1] 1.000000 4.000000 9.000000 3.141593 3.141593 3.141593 3.141593
> > > 3.141593
> > > [9] 3.141593 3.141593
> > >
> > > I can think of a couple of ways to solve this.
> > >
> > > 1. Increment the reference counter. This solves the
> > bug but may
> > > have serious performance implications. In the while
> example above it
> > > needs to copy y in every iteration.
> > >
> > > 2. Change the semantics of while loops by getting rid of the
> > > exception described above. When a loop is terminated with a
> > break the
> > > value of the loop would be NULL. Thus there is no need to keep a
> > > reference to the value of the last non-terminated iteration.
> > >
> > > Any opinions?
> > >
> > > ______________________________________________
> > > [email protected] mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> > --
> > Luke Tierney
> > Chair, Statistics and Actuarial Science
> > Ralph E. Wareham Professor of Mathematical Sciences
> > University of Iowa Phone: 319-335-3386
> > Department of Statistics and Fax: 319-335-3017
> > Actuarial Science
> > 241 Schaeffer Hall email: [email protected]
> > Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
> >
# returnsLoopValue takes an expression and returns TRUE
# if its value could be the value of a for/while/repeat loop.
returnsLoopValue <- function(e) UseMethod("returnsLoopValue")
`returnsLoopValue.default` <-
function(e)is.element(class(e), c("for","while","repeat"))
`returnsLoopValue.{` <- function(e)returnsLoopValue(e[[length(e)]]) # match }
`returnsLoopValue.(` <- function(e)returnsLoopValue(e[[length(e)]]) # match )
# Note: parse(test="function(x)x+1") returns call to `function`, not
# a function itself, so following won't be used while processing
# parse's output.
`returnsLoopValue.function` <- function(e)returnsLoopValue(functionBody(e))
# Note: Quote(if(x)"yes" else "no") is of class "if", not a call to "if".
`returnsLoopValue.if` <-
function(e)returnsLoopValue(e[[3]])||((length(e)>3)&&returnsLoopValue(e[[4]]))
functionReturnsLoopValue <- function(file, expr = parse(file)) {
require(codetools)
returnLoopValueOffenders <- list()
w <- makeCodeWalker(
call=function(e,w){
if(e[[1]]==as.name("function")){
if (returnsLoopValue(e[[3]]))
returnLoopValueOffenders[[length(returnLoopValueOffenders)+1]]
<<- e
};
for(ee in as.list(e))
if (!missing(ee))
walkCode(ee,w)},
leaf=function(e,w)NULL
)
lapply(expr, walkCode, w)
attr(returnLoopValueOffenders, "srcfile") <- attr(expr, "srcfile")
returnLoopValueOffenders
}
functionAssignsLoopValue <- function(file, expr = parse(file)) {
require(codetools)
assignLoopValueOffenders <- list()
assignmentStack <- character()
handlerEnv <- new.env()
handlerEnv$`<-` <- function(e, w) {
assignmentStack[length(assignmentStack)+1] <<- deparse(e[[2]])[1]
if (returnsLoopValue(e[[3]])) {
assignLoopValueOffenders[[length(assignLoopValueOffenders)+1]] <<- e
names(assignLoopValueOffenders)[length(assignLoopValueOffenders)] <<-
paste(assignmentStack, collapse=":")
}
walkCode(e[[3]], w)
length(assignmentStack) <<- length(assignmentStack) - 1
function(e,w)NULL
}
handlerEnv$`=` <- handlerEnv$`<<-` <- handlerEnv$`<-`
w <- makeCodeWalker(
handler=function(v, w) {
handlerEnv[[v]]
},
call=function(e,w){
# this handles any language object that handler doesn't supply a
function for
# cat("call: ", deparse(e)[1], "\n")
for(ee in as.list(e))
if (!missing(ee))
walkCode(ee,w)
},
leaf=function(e,w) {
if (typeof(e)=="pairlist") {
for(name in names(e)) {
if (!(is.name(e[[name]]) && nchar(as.character(e[[name]]))==0)) {
assignmentStack[length(assignmentStack)+1] <<- name
if (returnsLoopValue(e[[name]])) {
assignLoopValueOffenders[[length(assignLoopValueOffenders)+1]] <<- e[[name]]
names(assignLoopValueOffenders)[length(assignLoopValueOffenders)] <<-
paste(assignmentStack, collapse=":")
}
walkCode(e[[name]], w)
length(assignmentStack) <<- length(assignmentStack) - 1
}
}
}
NULL # default prints the leaves of the parse tree
}
)
lapply(expr, walkCode, w)
attr(assignLoopValueOffenders, "srcfile") <- attr(expr, "srcfile")
assignLoopValueOffenders
}
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel