> -----Original Message----- > From: William Dunlap > Sent: Wednesday, June 10, 2009 2:35 PM > To: 'l...@stat.uiowa.edu' > Cc: r-devel@r-project.org > Subject: RE: [Rd] reference counting bug related to break and > next in loops > > Thanks Luke. > > I used codetools::walkCode to look for functions that returned the > value of a loop and found a surprising number in base R. > However, it looks like non of these functions were written to return > anything useful (the side effects were important), so the change > to loop-returns-NULL should let such functions use less > memory (and time) and make them correspond better to their > documentation.
I also use walkCode to look for assignents whose right hand sides were loops. The only *.R file in a directory containing a current R build that had such a thing was ./library/base/R-ex/print.R (from help(print)) and related files in ./tests. It had rr <- for(i in 1:3) print(1:i) rr That last line used to print [1] 1 2 3 and now prints NULL I didn't notice a corresponding *.Rout.save file so the test won't notice the change in behavior. I've attached the code I used. It is pretty crude. functionReturnsLoopValue("file") and functionAssignsLoopValue("file") parse "file" and then return a list of functions that return the value of a loop and a list of assignments involving the value of a loop, respectively. functionAssignsLoopValue looks for ordinary assignments (x<-while(...)) and for default argument values (function(x=while(...)){}). It does not yet look for things passed as arguments in function calls, c(1, while(x>0)x<-x-1, 3). The last might pick up some false positives involving functions that don't evaluate their arguments. > > E.g., fixup.libraries.URLs is not documented to return anything > and it returns the value of for loop: > > $`src/library/utils/R/windows/linkhtml.R`[[2]] > function(lib.loc = .libPaths()) > { > for (lib in lib.loc) { > if(lib == .Library) next > pg <- sort(.packages(all.available = TRUE, lib.loc = lib)) > for(pkg in pg) > try(fixup.package.URLs(file.path(lib,pkg)), silent = TRUE) > } > } > > Bill Dunlap > TIBCO Software Inc - Spotfire Division > wdunlap tibco.com > > > -----Original Message----- > > From: l...@stat.uiowa.edu [mailto:l...@stat.uiowa.edu] > > Sent: Wednesday, June 10, 2009 1:05 PM > > To: William Dunlap > > Cc: r-devel@r-project.org > > Subject: Re: [Rd] reference counting bug related to break and > > next in loops > > > > Thanks for the report. > > > > It turns out that a similar issue arises in while() loops without > > break/next being involved because the test expression is evaluated > > after the final body evaluation. After some discussion we > decided it > > was simplest both for implementation and documentation to have the > > value of a loop expression always be NULL. This is now > implemented in > > R-devel. > > > > luke > > > > On Tue, 2 Jun 2009, William Dunlap wrote: > > > > > One of our R users here just showed me the following problem while > > > investigating the return value of a while loop. I added some > > > information > > > on a similar bug in for loops. I think he was using 2.9.0 > > > but I see the same problem on today's development version > of 2.10.0 > > > (svn 48703). > > > > > > Should the semantics of while and for loops be changed > > slightly to avoid > > > the memory > > > buildup that fixing this to reflect the current docs would > > entail? S+'s > > > loops return nothing useful - that change was made long > ago to avoid > > > memory buildup resulting from semantics akin the R's > > present semantics. > > > > > > Bill Dunlap > > > TIBCO Software Inc - Spotfire Division > > > wdunlap tibco.com > > > > > > --------------------Forwarded (and edited) message > > > > > below--------------------------------------------------------- > > ---------- > > > ---------- > > > > > > I think I have found another reference counting bug. > > > > > > If you type in the following in R you get what I think is > the wrong > > > result. > > > > > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { > > break }; i = > > > i + 1; y}; q > > > [1] 42 42 42 42 42 42 42 42 9 10 > > > > > > I had expected [1] 42 42 42 42 42 42 42 8 9 10 which is > > what you get > > > if you add 0 to y in the last statement in the while loop: > > > > > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { > > break }; i = > > > i + 1; y + 0}; q > > > [1] 42 42 42 42 42 42 42 8 9 10 > > > > > > Also, > > > > > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; > > > i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration", > > i, "\n"); y}; > > > q > > > Completing iteration 2 > > > Completing iteration 3 > > > [1] 42 42 42 42 42 42 42 42 9 10 > > > > > > but if the last statement in the while loop is y+0 instead > > of y I get > > > the > > > expected result: > > > > > >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; > > > i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration", > i, "\n"); > > > y+0L}; q > > > Completing iteration 2 > > > Completing iteration 3 > > > [1] 42 42 3 4 5 6 7 8 9 10 > > > > > > A background to the problem is that in R a while-loop > > returns the value > > > of the last iteration. However there is an exception if an > > iteration is > > > terminated by a break or a next. Then the value is the > value of the > > > previously completed iteration that did not execute a > break or next. > > > Thus in an extreme case the value of the while may be the > > value of the > > > very first iteration even though it executed a million iterations. > > > > > > Thus to implement that correctly one needs to keep a > > reference to the > > > value of the last non-terminated iteration. It seems as if > > the current R > > > implementation does that but does not increase the > reference counter > > > which explains the odd behavior. > > > > > > The for loop example is > > > > > >> z<-{ tmp<-rep(pi,10);for(i in 1:10){ > tmp[i]<-i^2;if(i==9)break ; if > > > (i<9&&i>3)next ; tmp } } > > >> z > > > [1] 1.000000 4.000000 9.000000 16.000000 25.000000 36.000000 > > > 49.000000 > > > [8] 64.000000 81.000000 3.141593 > > >> z<-{ tmp<-rep(pi,10);for(i in 1:10){ > tmp[i]<-i^2;if(i==9)break ; if > > > (i<9&&i>3)next ; tmp+0 } } > > >> z > > > [1] 1.000000 4.000000 9.000000 3.141593 3.141593 3.141593 3.141593 > > > 3.141593 > > > [9] 3.141593 3.141593 > > > > > > I can think of a couple of ways to solve this. > > > > > > 1. Increment the reference counter. This solves the > > bug but may > > > have serious performance implications. In the while > example above it > > > needs to copy y in every iteration. > > > > > > 2. Change the semantics of while loops by getting rid of the > > > exception described above. When a loop is terminated with a > > break the > > > value of the loop would be NULL. Thus there is no need to keep a > > > reference to the value of the last non-terminated iteration. > > > > > > Any opinions? > > > > > > ______________________________________________ > > > R-devel@r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > > > > > -- > > Luke Tierney > > Chair, Statistics and Actuarial Science > > Ralph E. Wareham Professor of Mathematical Sciences > > University of Iowa Phone: 319-335-3386 > > Department of Statistics and Fax: 319-335-3017 > > Actuarial Science > > 241 Schaeffer Hall email: l...@stat.uiowa.edu > > Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu > >
# returnsLoopValue takes an expression and returns TRUE # if its value could be the value of a for/while/repeat loop. returnsLoopValue <- function(e) UseMethod("returnsLoopValue") `returnsLoopValue.default` <- function(e)is.element(class(e), c("for","while","repeat")) `returnsLoopValue.{` <- function(e)returnsLoopValue(e[[length(e)]]) # match } `returnsLoopValue.(` <- function(e)returnsLoopValue(e[[length(e)]]) # match ) # Note: parse(test="function(x)x+1") returns call to `function`, not # a function itself, so following won't be used while processing # parse's output. `returnsLoopValue.function` <- function(e)returnsLoopValue(functionBody(e)) # Note: Quote(if(x)"yes" else "no") is of class "if", not a call to "if". `returnsLoopValue.if` <- function(e)returnsLoopValue(e[[3]])||((length(e)>3)&&returnsLoopValue(e[[4]])) functionReturnsLoopValue <- function(file, expr = parse(file)) { require(codetools) returnLoopValueOffenders <- list() w <- makeCodeWalker( call=function(e,w){ if(e[[1]]==as.name("function")){ if (returnsLoopValue(e[[3]])) returnLoopValueOffenders[[length(returnLoopValueOffenders)+1]] <<- e }; for(ee in as.list(e)) if (!missing(ee)) walkCode(ee,w)}, leaf=function(e,w)NULL ) lapply(expr, walkCode, w) attr(returnLoopValueOffenders, "srcfile") <- attr(expr, "srcfile") returnLoopValueOffenders } functionAssignsLoopValue <- function(file, expr = parse(file)) { require(codetools) assignLoopValueOffenders <- list() assignmentStack <- character() handlerEnv <- new.env() handlerEnv$`<-` <- function(e, w) { assignmentStack[length(assignmentStack)+1] <<- deparse(e[[2]])[1] if (returnsLoopValue(e[[3]])) { assignLoopValueOffenders[[length(assignLoopValueOffenders)+1]] <<- e names(assignLoopValueOffenders)[length(assignLoopValueOffenders)] <<- paste(assignmentStack, collapse=":") } walkCode(e[[3]], w) length(assignmentStack) <<- length(assignmentStack) - 1 function(e,w)NULL } handlerEnv$`=` <- handlerEnv$`<<-` <- handlerEnv$`<-` w <- makeCodeWalker( handler=function(v, w) { handlerEnv[[v]] }, call=function(e,w){ # this handles any language object that handler doesn't supply a function for # cat("call: ", deparse(e)[1], "\n") for(ee in as.list(e)) if (!missing(ee)) walkCode(ee,w) }, leaf=function(e,w) { if (typeof(e)=="pairlist") { for(name in names(e)) { if (!(is.name(e[[name]]) && nchar(as.character(e[[name]]))==0)) { assignmentStack[length(assignmentStack)+1] <<- name if (returnsLoopValue(e[[name]])) { assignLoopValueOffenders[[length(assignLoopValueOffenders)+1]] <<- e[[name]] names(assignLoopValueOffenders)[length(assignLoopValueOffenders)] <<- paste(assignmentStack, collapse=":") } walkCode(e[[name]], w) length(assignmentStack) <<- length(assignmentStack) - 1 } } } NULL # default prints the leaves of the parse tree } ) lapply(expr, walkCode, w) attr(assignLoopValueOffenders, "srcfile") <- attr(expr, "srcfile") assignLoopValueOffenders }
______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel