[Rd] Language definition question - order of argument side effects
I'm interested in the difference between these two intuitively equivalent sequences that produce different results (in R version 2.13.1 (2011-07-08), 32-bit). I think R's reference counting optimization is causing this difference in behavior.

> a <- 1
> a+{a[1] <- 20}
[1] 21

> a <- 1
> a[1] <- 1
> a+{a[1] <- 20}
[1] 40

Is one of these the "correct" answer, or is the order of side effects undefined in these statements? Section 4.3.3 of the R Language Definition just says that doing assignment in an argument to a function is "bad style", but doesn't say anything about evaluation order. In general, for primitive and internal functions, is a particular evaluation order for the arguments guaranteed?

Thanks,
Justin Talbot
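For comparison, a rewrite that pulls the side effect out of the argument list gives a result that no longer depends on argument evaluation order; this is a minimal sketch along the lines of the "bad style" remark in section 4.3.3, not something from the original post:

## Side-effect-free rewrite (illustrative only): capture the old value
## explicitly, then update `a`, so the result does not depend on the
## order in which `+` evaluates its arguments.
a <- 1
old_a <- a
a[1] <- 20
old_a + a
#> [1] 21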
[Rd] pmatch inconsistency
The pmatch help page (see also section 4.3.2 of the R Language Definition) claims that pmatch with duplicates.ok=FALSE provides the same functionality as R's argument matching algorithm, modulo how empty strings are matched. Here's an undocumented inconsistency between pmatch and R's argument matching algorithm:

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

> f <- function(abc, ax) 1
> f(ab=1,a=10)
Error in f(ab = 1, a = 10) : formal argument "abc" matched by multiple actual arguments
> pmatch(c('ab','a'), c('abc', 'ax'), duplicates.ok=FALSE)
[1] 1 2

That is, pmatch doesn't consider ambiguous partial matches to be an error if the ambiguity is resolved by an earlier partial match. This leads to an order dependency in pmatch that doesn't happen with argument matching:

> pmatch(c('ab','a'), c('abc', 'ax'), duplicates.ok=FALSE)
[1] 1 2
> pmatch(c('a','ab'), c('abc', 'ax'), duplicates.ok=FALSE)
[1] NA 1

It would be great if this were documented. At a higher level, is pmatch intended to be the same as the argument matching algorithm, or is it just supposed to be "close"?

Justin
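One way to see where the order dependency comes from is to match each query against the full table on its own; this is a rough sketch for illustration, not a proposed patch, and it ignores the exclusion of already-matched table entries that duplicates.ok=FALSE performs:

## Hypothetical helper: match each query independently, so a query that
## is ambiguous on its own returns NA regardless of the order of the
## other queries. (Does not reproduce pmatch's duplicate-exclusion
## behaviour with duplicates.ok=FALSE.)
order_independent_pmatch <- function(queries, table) {
  vapply(queries, function(q) pmatch(q, table), integer(1L))
}

order_independent_pmatch(c("ab", "a"), c("abc", "ax"))
#> ab  a
#>  1 NA
order_independent_pmatch(c("a", "ab"), c("abc", "ax"))
#>  a ab
#> NA  1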
[Rd] Multiple return values / bug in rpart?
In the recommended package rpart (version 4.1-1), the file rpartpl.R contains the following line:

    return(x = x[!erase], y = y[!erase])

AFAIK, returning multiple values like this is not valid R. Is that correct? I can't seem to make it work in my own code.

It doesn't appear that rpartpl.R is used anywhere, so this may have never caused an issue. But it's tripping up my R compiler.

Thanks,
Justin Talbot
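In R, return() takes a single value, so multiple results are conventionally wrapped in a named list. A minimal self-contained sketch, where x, y, and erase are stand-ins for the variables in rpartpl.R:

## Returning several values via a named list (illustrative stand-in for
## the rpartpl.R fragment above).
f <- function(x, y, erase) {
  list(x = x[!erase], y = y[!erase])
}

res <- f(x = 1:3, y = 4:6, erase = c(FALSE, TRUE, FALSE))
res$x
#> [1] 1 3
res$y
#> [1] 4 6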
Re: [Rd] Status of reserved keywords and builtins
> It would have those benefits, but it would be harder to prototype
> changes by actually replacing the `if` function. Implementations that
> want to optimize the calls have other ways to do it, e.g. the sorts of
> things the compiler does.

Does anyone actually prototype changes to the `if` function? Allowing users to replace the definitions of reserved keywords and builtins is horribly expensive performance-wise, with or without compilation.

If you look at the compiler package, the way it optimizes these function calls is by breaking the language spec. See the beginnings of sections 5 and 6 of Luke's write-up (http://homepage.stat.uiowa.edu/~luke/R/compiler/compiler.pdf), noting that the *default* optimization level is 2, at which level, "In addition to the inlining permitted by Level 1, functions that are syntactically special or are considered core language functions and are found via the global environment at compile time may be inlined."

This is an area where a small change to the language spec would impact essentially no users and would result in a language that could be executed much more efficiently.

Justin Talbot
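For concreteness, replacing `if` is possible today, which is precisely what makes these calls hard to optimize without the inlining shortcut described above; a toy sketch, not from the original thread:

## Shadow the builtin `if` with an ordinary closure in the global
## environment. The AST interpreter looks the symbol up normally, so the
## replacement takes effect; bytecode that inlines `if` (default
## optimization level 2) may ignore it, which is the spec-breaking
## behaviour discussed above.
`if` <- function(cond, yes, no) "surprise"
if (TRUE) 1 else 2
#> [1] "surprise"
rm(`if`)          # remove the shadowing definition, restoring the builtin
if (TRUE) 1 else 2
#> [1] 1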
[Rd] On R performance
I've been working on an R performance academic project for the last couple years, which has involved writing an interpreter for R from scratch and a JIT for R vector operations. With the recent comments on Julia, I thought I'd share some thoughts from my experience, since they differ substantially from the common speculation on R performance.

I went into the project thinking that R would be slow for the commonly cited reasons: NAs, call-by-value, immutable values, the ability to dynamically add/remove variables from environments, etc. But this is largely *not* true. It does require being somewhat clever, but most of the cost of these features can be either eliminated or moved to uncommon cases that won't affect most code. And there's plenty of room for innovation here. The history of Javascript runtimes over the last decade has shown that dramatic performance improvements are possible even for difficult languages. This is good news. I think we can keep essentially everything that people like about R and still achieve great performance.

So why is R performance poor now? I think the fundamental reason is related to software engineering: R is nearly impossible to experiment with, so no one tries out new performance techniques on it. There are two main issues here:

1) The R Language Definition doesn't get enough love. I could point out plenty of specific problems, omissions, etc., but I think the high-level problem is that the Language Definition currently conflates three things: 1) the actual language definition, 2) the definition of what is more properly the standard library, and 3) the implementation. This conflation hides how simple the R/S language actually is and, by assuming that the current implementation is the only implementation, obscures performance improvements that could be made by changing the implementation.

2) The R core implementation (e.g. everything in src/main) is too big. There are ~900 functions listed in names.c. This has got to be simply unmanageable. If one were to change the SEXP representation, how many internal functions would have to be checked and updated? This is a severe hindrance to improving performance.

I see little value in debating changes to the language semantics until we've addressed this low hanging fruit and at least tried to make the current R/S semantics run fast.

Justin
Re: [Rd] On R performance
> On 8 March 2012 at 11:06, Justin Talbot wrote:
> | I've been working on an R performance academic project for the last
> | couple years which has involved writing an interpreter for R from
> | scratch and a JIT for R vector operations.
>
> Cool. I think John mentioned that once or twice and I promptly forgot.
>
> Can you share some numbers?

Sure, I'll give a quick summary. We're writing a paper on it right now, which will have more details.

We currently execute scalar (non-vectorized) R code through an interpreter we wrote from scratch. We haven't put a whole lot of time into it; it supports most of the important R semantics, but does not yet implement most of the functions in src/names.c, which limits the scope of real-world code we can run. On a set of microbenchmarks (careful what you conclude from microbenchmarks!) it runs about 4-5x faster than Luke's bytecode interpreter.

The interpreter is still about 3-4x slower than LuaJIT's interpreter, probably the fastest dynamic language interpreter out there, so there is room for further improvement, but not a lot. (Lua is a much cleaner language from the performance standpoint, and LuaJIT's interpreter is written in assembly. We don't anticipate doing that anytime soon.)

We execute vectorized code through a JIT that generates SSE code and can parallelize across multiple cores. Performance here depends greatly on the vector size and the number of vectors, since our performance gain primarily comes from eliminating memory accesses. For long vectors (1M+ elements) we've seen gains of about 5x-50x on a single core for plausible workloads. We don't have good numbers on the parallelization yet, but we have seen linear scalability out to 32 cores for a couple of our workloads. Scalability is clearly very task dependent and we don't expect to get large numbers across the board.

One implication of having a JIT is that we now implement a lot of functionality at the R level rather than in C functions. For example, we implement matrix-vector multiplication as:

r <- 0
for(i in 1L:ncol(m)) {
  r <- r + m[,i]*v[[i]]
}

This beats R's built-in matrix-vector multiplication by a factor of 2 for "large" matrices (at least one dimension larger than 1000 or so) and will parallelize without any more work from the programmer. With more work to squeeze out our JIT overheads, this could be effective even for much smaller matrices.
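For reference, a self-contained version of that loop, with a correctness check against the built-in operator; illustrative only, since under the stock interpreter this loop is normally much slower than %*% and the 2x figure quoted above comes from the vector JIT described in the post:

## Stand-alone sketch of the matrix-vector loop above.
mv <- function(m, v) {
  r <- 0
  for (i in 1L:ncol(m)) {
    r <- r + m[, i] * v[[i]]    # accumulate column i scaled by v[i]
  }
  r
}

m <- matrix(runif(9), 3, 3)
v <- runif(3)
all.equal(as.numeric(m %*% v), mv(m, v))
#> [1] TRUE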
> | So why is R performance poor now? I think the fundamental reason is
> | related to software engineering: R is nearly impossible to experiment
> | with, so no one tries out new performance techniques on it. There are
>
> Did you compare notes with the CXXR project by Andrew Runnalls and his
> student(s)? See http://www.cs.kent.ac.uk/projects/cxxr/

I haven't talked to them, but I should! Looking at their slides, it looks like their approach will be effective at making the R core more extensible, but it's somewhat antagonistic to pure interpreter performance. I didn't see any performance numbers, but I would guess that CXXR runs somewhat slower than the current interpreter. The key to being able to experiment with performance is for the core code to be small and well defined, not necessarily extensible.

> | I see little value in debating changes to the language semantics until
> | we've addressed this low hanging fruit and at least tried to make the
> | current R/S semantics run fast.
>
> Fully agree.
>
> I'd add that helping expand R via the FFI also works, though it is of course
> not as easy on the end user as making the core faster.

FFI is extremely important and Rcpp is a great step forward. I'll just note that FFI and performance interact. An FFI like .Call/.External exposes too much of R's internal implementation detail to users, making it difficult to improve performance in the core while maintaining backwards compatibility. It would be much better if R's high-performance FFI were something like Rcpp itself, hiding almost all implementation details from the user.

Just one example on FFIs: .Call/.External lets users get raw pointers to vector data (e.g. NUMERIC_POINTER). This is fine and dandy as long as all implementations store vectors contiguously in memory. But some implementations may not want to. For example, Clojure gets high-performance updates to its immutable arrays by storing them in a tree data structure instead of flat in memory. This would be a nice technique to port to R, but it breaks .Call packages. A better FFI choice would have been something like NUMERIC_ELEMENT(x, i), hiding the details of how element i is looked up in vector x. This would have been just as fast for current packages while leaving a forward path for more performance improvements.

Justin
Re: [Rd] On R performance
> Isn't R much like Lisp under the covers? After all, it evolved from Scheme.
> Hasn't there been a great deal of work done on optimizing Lisp over the
> last 30 years? This suggests that instead of dropping the R/S semantics
> and moving to another language like Julia, the proposals of Ross Ihaka
> and Duncan Temple Lang could be followed to provide the familiar
> R/S syntax on top of an optimized Lisp engine.

I think R started off as a Lisp-like language, but since adopting S semantics it has diverged quite a ways. It's better to think of R as a combination of two languages: a dynamically typed high-level language, much like Javascript or Lua, and an array language, like APL. I think those are the right places to be looking to see how to make R fast. Fortunately, all three of those languages have had a lot of performance work done already that R could just steal from wholesale.

> Another possibility is to implement R/S on top of an optimized virtual
> machine like the JVM, LLVM, etc.

I like this in theory, but in practice I'm not sure how well it would work for R. JVM implementations of dynamic languages, like JRuby and Jython, run only marginally faster (30-40%) than their C interpreters. You do get the Java ecosystem, which is nice, but the performance improvements probably aren't enough to make it worthwhile. And, of course, R already has a pretty good Java connection story.

LLVM is a better option; I know there's another group out there looking at R on LLVM. But I'll just note that the really high-performance dynamic languages (e.g. Google's V8 implementation of Javascript and Mike Pall's LuaJIT) are hand-rolled JITs. LLVM-based implementations of dynamic languages, like Unladen Swallow, have not been particularly successful. It remains to be seen how well R would map to LLVM.

Justin
Re: [Rd] On R performance
> Thanks for the clarification Justin. What about the S4 classes
> and methods? The design resembles CLOS, and currently this
> is interpreted R code. Have you addressed performance issues
> associated with this? What relative impact does this have compared
> with other optimizations like vectorization?

Sorry for the delay in my response. My posts keep getting stuck in moderation.

I'll be honest: I haven't looked at S4 performance yet. That's the big part of R's semantics that I haven't implemented. I chose to delay this part largely because the language still feels very unstable around the object systems.

I know R takes a lot of flak for having multiple incompatible object systems. It's frustrating that no one object system has come to dominate; they all have their pluses and minuses. However, one response is to point out that Javascript is in the same situation but has still been very successful. There are lots of different OO libraries for Javascript, each one taking a slightly different tack (see this review from my office mate, with a caution for some strong language: https://github.com/njoubert/inheritance.js/blob/master/INHERITANCE.md). I think part of the reason this has worked out for Javascript is that none of the object systems is considered part of the core language, leaving both the language implementers and the OO library designers flexibility to experiment without blocking each other.

R has taken the opposite approach, incorporating multiple object systems into the core language, with the associated maintenance load on R-Core, and I'm not sure that it's been as profitable. Perhaps there are good reasons for this, though. I'll admit that I haven't thought through this area much.

Justin
Re: [Rd] There is pmin and pmax each taking na.rm, how about psum?
> Because that's inconsistent with pmin and pmax when two NAs are summed.
>
> x = c(1,3,NA,NA,5)
> y = c(2,NA,4,NA,1)
> colSums(rbind(x, y), na.rm = TRUE)
> [1] 3 3 4 0 6   # actual
> [1] 3 3 4 NA 6  # desired

But your desired result would be inconsistent with sum:

sum(NA,NA,na.rm=TRUE)
[1] 0

From a language definition perspective, I think having psum return 0 here is the right choice. R consistently distinguishes between operators that have a sensible identity (+: 0, *: 1, &: TRUE, |: FALSE), which return the identity if removing NAs leaves no items, and those that kind of don't (pmin, pmax), which return NA. Let's not break that. (I would argue that pmin and pmax should return their actual identities too, Inf and -Inf respectively, but I can understand the current behavior.)

My 2 cents on psum: R has a natural set of associative & commutative operators:

+, *, &, |, pmin, pmax

These correspond directly to the reduction functions:

sum, prod, all, any, min, max

The current problem is that pmin and pmax are more powerful than +, *, &, and |. The right fix is to extend the rest of the associative & commutative operators to have the same power as pmin and pmax. Thus, + should have the signature `+`(..., na.rm=FALSE), which would allow you to do things like:

`+`(c(1,2), c(1,2), c(1,2), NA, na.rm=TRUE) = c(3,6)

If you don't like typing `+`, you could always alias psum to `+`.

Additionally, R currently has two simple reduction functions that don't have corresponding operators: range and length. Having a prange operator and a plength operator would nicely round out the language.

Justin
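A minimal sketch of a psum with the identity-returning behaviour argued for above; this is a hypothetical helper rather than an existing base function, and recycling of unequal-length arguments is simply left to rbind:

## Hypothetical psum: element-wise sum of its arguments which, like
## sum(), returns the identity 0 wherever na.rm removes every value.
psum <- function(..., na.rm = FALSE) {
  colSums(rbind(...), na.rm = na.rm)
}

x <- c(1, 3, NA, NA, 5)
y <- c(2, NA, 4, NA, 1)
psum(x, y, na.rm = TRUE)
#> [1] 3 3 4 0 6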
Re: [Rd] There is pmin and pmax each taking na.rm, how about psum?
> Then the case for psum is more for convenience and speed -vs-
> colSums(rbind(x,y), na.rm=TRUE), since rbind will copy x and y into a new
> matrix. The case for pprod is similar, plus colProds doesn't exist.

Right, and consistency; for what that's worth.

>> Thus, + should have the signature: `+`(..., na.rm=FALSE), which would
>> allow you to do things like:
>>
>> `+`(c(1,2),c(1,2),c(1,2),NA, na.rm=TRUE) = c(3,6)
>>
>> If you don't like typing `+`, you could always alias psum to `+`.
>
> But there would be a cost, wouldn't there? `+` is a dyadic .Primitive.
> Changing that to take `...` and `na.rm` could slow it down (iiuc), and any
> changes to the existing language are risky. For example:
> `+`(1,2,3)
> is currently an error. Changing that to do something might have
> implications for some of the 4,000 packages (some might rely on that being
> an error), with a possible speed cost too.

There would be a very slight performance cost for the current interpreter. For the new bytecode compiler, though, there would be no performance cost, since the common binary form can be detected at compile time and an optimized bytecode can be emitted for it.

Taking what's currently an error and making it legal is a pretty safe change, unless someone is currently relying on `+`(1,2,3) to return an error, which I doubt. I think the bigger question in making this change work would be the S3 dispatch logic. I don't understand the intricacies of S3 well enough to know whether this change is plausible or not.

> In contrast, adding two functions that didn't exist before: psum and pprod,
> seems to be a safer and simpler proposition.

Definitely easier. It leaves the language a bit more complicated, but that might be the right trade-off. I would strongly suggest adding pany and pall as well; I find myself wishing for them all the time. prange would be nice too.

Justin
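Rough sketches of the pany/pall idea suggested above; these are hypothetical helpers, not base functions, written in terms of the existing any()/all() so that they return the respective identities (FALSE and TRUE) when na.rm removes every value:

## Hypothetical element-wise counterparts of any()/all().
pany <- function(..., na.rm = FALSE) {
  apply(rbind(...), 2, any, na.rm = na.rm)
}
pall <- function(..., na.rm = FALSE) {
  apply(rbind(...), 2, all, na.rm = na.rm)
}

pany(c(TRUE, NA, FALSE), c(FALSE, NA, TRUE), na.rm = TRUE)
#> [1]  TRUE FALSE  TRUE
pall(c(TRUE, NA, TRUE), c(TRUE, NA, FALSE), na.rm = TRUE)
#> [1]  TRUE  TRUE FALSE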
[Rd] Bug in list subset assignment due to NAMED optimization
In R version 2.15.2 (2012-10-26), i386-apple-darwin9.8.0/i386 (32-bit), I get the following:

> a <- list(1)
> (a[[1]] <- a)
[[1]]
[[1]][[1]]
[1] 1

but

> a <- list(1)
> b <- a
> (a[[1]] <- a)
[[1]]
[1] 1

And similarly:

> a <- list(x=1)
> (a$x <- a)
$x
$x$x
[1] 1

but

> a <- list(x=1)
> b <- a
> (a$x <- a)
$x
[1] 1

In both cases the result of the first sequence is wrong. It's returning the updated `a` rather than the RHS of the assignment. The second sequence in both cases is correct; the assignment to `b` increments the NAMED value, causing the necessary copy to be made, so the RHS is returned from the assignment.

Would it be sufficient to add a check to do_subassign2_dflt and do_subassign3_dflt that creates a duplicate of the LHS if the LHS & RHS are the same object?

Justin