Re: [Rd] Proposed speedup of ifelse
> I propose a patch to ifelse that leverages anyNA(test) to achieve an > improvement in performance. For a test vector of length 10, the change > nearly halves the time taken and for a test of length 1 million, there > is a tenfold increase in speed. Even for small vectors, the > distributions of timings between the old and the proposed ifelse do > not intersect. For smaller vectors, your results are significantly affected by your invoking the old version via base::ifelse. You could try defining your new version as new_ifelse, and invoking the old version as just ifelse. There might still be some issues with the two versions having different context w.r.t environments, and hence looking up functions in different ways. You could copy the code of the old version and define it in the global environment just like new_ifelse. When using ifelse rather than base::ifelse, it seems the new version is slower for vectors of length 10, but faster for long vectors. Also, I'd use system.time rather than microbenchmark. The latter will mix invocations of the two functions in a way where it is unclear that garbage collection time will be fairly attributed. Also, it's a bit silly to plot the distributions of times, which will mostly reflect variations in when garbage collections at various levels occur - just the mean is what is relevant. Regards, Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?
Regarding the discussion of getting length(unclass(x)) without an unclassed version of x being created... There are already no copies done for length(unclass(x)) in pqR (current version of 2017-06-09 at pqR-project.org, as well as the soon-to-be-release new version). This is part of a more general facility for avoiding copies from unclass in other circumstances as well - eg, unclass(a)+unclass(b). It's implemented using pqR's internal "variant result" mechanism. Primitives such as "length" and "+" can ask for their arguments to be evaluated in such a way that an "unclassed" result is possibly returned with its class attribute still there, but with a flag set (not in the object) to indicate that it should be ignored. The variant result mechanism is also central to many other pqR improvements, including deferred evaluation to enable automatic use of multiple cores, and optimizations that allow fast evaluation of things like any(x<0), any(is.na(x)), or all(is.na(x)) without creation of intermediate results and with early termination when the result is determined. It is much better to use such a general mechanism that speeds up existing code than to implement more and more special-case functions like anyNA or some special function to allow length(unclass(x)) to be done quickly. The variant result mechanism has extremely low overhead, and is not hard to implement. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Bias in R's random integers?
> From: Duncan Murdoch > Let's try it: > > > m <- (2/5)*2^32 > > m > 2^31 > [1] FALSE > > x <- sample(m, 100, replace = TRUE) > > table(x %% 2) > > 0 1 > 399850 600150 > > Since m is an even number, the true proportions of evens and odds should > be exactly 0.5. That's some pretty strong evidence of the bug in the > generator. It seems to be a recently-introduced bug. Here's output with R-2.15.1: R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows" Copyright (C) 2012 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: x86_64-unknown-linux-gnu (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > set.seed(123) > m <- (2/5)*2^32 > m > 2^31 [1] FALSE > x <- sample(m, 100, replace = TRUE) > table(x %% 2) 0 1 499412 500588 So I doubt that this has anything to do with bias from using 32-bit random values. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Bias in R's random integers?
> Duncan Murdoch: > > and you can see it in the original m with > >x <- sample(m, 100, replace = TRUE) >plot(density(x[x %% 2 == 0])) OK. Thanks. I see there is a real problem. One option to fix it while mostly retaining backwards-compatibility would be to add extra bits from a second RNG call only when m is large - eg, larger than 2^27. That would retain reproducibility for most analyses of small to moderate size data sets. Of course, there would still be some small, detectable error for values a bit less than 2^27, but perhaps that's tolerable. (The 2^27 threshold could obviously be debated.) R Core made a similar decision in the case of sampling with replacement when implementing a new hashing algorithm that produces different results. It is enabled by default only when m > 1e7 and no more than half the values are to be sampled, as was noted: > Note that it wouldn't be the first time that sample() changes behavior > in a non-backward compatible way: > > https://stat.ethz.ch/pipermail/r-devel/2012-October/065049.html > > Cheers, > H. That incompatibility could have been avoided. A year ago I posted a fast hashing algorithm that produces the same results as the simple algorithm, here: https://stat.ethz.ch/pipermail/r-devel/2017-October/075012.html The latest version of this will be in the soon-to-be new release of pqR, and will of course enabled automatically whenever it seems desirable, for a considerable speed gain in many cases. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Subsetting row in single column matrix drops names in resulting vector
Dmitriy Selivanov (selivanov.dmit...@gmail.com) wrote: > Consider following example: > > a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), c("col1"))) > a[1, ] > # 1 > > It returns *unnamed* vector `1` where I would expect named vector. In fact > it returns named vector when number of columns is > 1. > Same issue applicable to single row matrix. Is it a bug? looks very > counterintuitive. This and related issues are addressed in pqR, in the new release of 2018-11-18. (See pqR-project.org, and my blog post at radfordneal.wordpress.com) The behaviour of a[1,] is unchanged, for backwards compatibility reasons. But in pqR one can explicitly mark an argument as missing using "_". When an array subscript is missing in this way, the names will not be dropped in this context even if there is only one of them. So a[1,_] will do what you want: > a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), c("col1"))) > a[1, ] [1] 1 > a[1,_] col1 1 Furthermore, pqR will not drop names when the subscript is a 1D array (ie, has a length-1 dim attribute) even if it is only one long. In pqR, sequences that are 1D arrays are easily created using the .. operator. So the following works as intended when .. is used, but not when the old : operator is used: > a = matrix(1:4, nrow=2, dimnames=list(c("row1","row2"),c("col1","col2"))) > n = 2 > a[1,1:n] col1 col2 13 > a[1,1..n] col1 col2 13 > n = 1 > a[1,1:n] [1] 1 > a[1,1..n] col1 1 You can read more about this in my blog post at https://radfordneal.wordpress.com/2016/06/25/fixing-rs-design-flaws-in-a-new-version-of-pqr/ That was written when most of these features where introduced, though getting your specific example right relies on another change introduced in the most recent version. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Subsetting row in single column matrix drops names in resulting vector
> > The behaviour of a[1,] is unchanged, for backwards compatibility > > reasons. But in pqR one can explicitly mark an argument as > > missing using "_". When an array subscript is missing in this way, > > the names will not be dropped in this context even if there is > > only one of them. So a[1,_] will do what you want: > > > >> a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), > > c("col1"))) > >> a[1, ] > >[1] 1 > >> a[1,_] > >col1 > > 1 > To my mind, it's rather counterintuitive as > > > a[2,_] > col1 > 1 > so a[1,_] and a[2,_] have the same name. To make it intuitive (at least > for me ;) ) it should rather return names "row1" and "row2" respectively. > > Best, > Serguei. The aim in designing these features should be to make it easier to write reliable software, which doesn't unexpectedly fail in edge cases. Here, the fact that a is a matrix presumably means that the program is designed to work for more than one column - in fact, it's likely that the programmer was mostly thinking of the case where there is more than one column, and perhaps only testing that case. But of course there is usually no reason why one column (or even zero columns) is impossible. We want the program to still work in such cases. When there is more than one column, a[1,] and a[1,_] both produce a vector with the _column_ names attached, and this is certainly not going to change (nor should it, unless one wants to change the whole semantics of matrices so that rows and columns are treated non-symmetrically, and even then attaching the same row name to all the elements would be rather strange...). After v <- a[1,_], the program may well have an expression like v[nc] where nc is a column name. We want this to still work if there happens to be only one column. That will happen only if a[1,_] attaches a column name, not a row name, when a has only one column. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Inefficiency in df$col
While doing some performance testing with the new version of pqR (see pqR-project.org), I've encountered an extreme, and quite unnecessary, inefficiency in the current R Core implementation of R, which I think you might want to correct. The inefficiency is in access to columns of a data frame, as in expressions such as df$col[i], which I think are very common (the alternatives of df[i,"col"] and df[["col"]][i] are, I think, less common). Here is the setup for an example showing the issue: L <- list (abc=1:9, xyz=11:19) Lc <- L; class(Lc) <- "glub" df <- data.frame(L) And here are some times for R-3.5.2 (r-devel of 2019-02-01 is much the same): > system.time (for (i in 1:100) r <- L$xyz) user system elapsed 0.086 0.004 0.089 > system.time (for (i in 1:100) r <- Lc$xyz) user system elapsed 0.494 0.000 0.495 > system.time (for (i in 1:100) r <- df$xyz) user system elapsed 3.425 0.000 3.426 So accessing a column of a data frame is 38 times slower than accessing a list element (which is what happens in the underlying implementation of a data frame), and 7 times slower than accessing an element of a list with a class attribute (for which it's necessary to check whether there is a $.glub method, which there isn't here). For comparison, here are the times for pqR-2019-01-25: > system.time (for (i in 1:100) r <- L$xyz) user system elapsed 0.057 0.000 0.058 > system.time (for (i in 1:100) r <- Lc$xyz) user system elapsed 0.251 0.000 0.251 > system.time (for (i in 1:100) r <- df$xyz) user system elapsed 0.247 0.000 0.247 So when accessing df$xyz, R-3.5.2 is 14 times slower than pqR-2019-01-25. (For a partial match, like df$xy, R-3.5.2 is 34 times slower.) I wasn't surprised that pqR was faster, but I didn't expect this big a difference. Then I remembered having seen a NEWS item from R-3.1.0: * Partial matching when using the $ operator _on data frames_ now throws a warning and may become defunct in the future. If partial matching is intended, replace foo$bar by foo[["bar", exact = FALSE]]. and having looked at the code then: `$.data.frame` <- function(x,name) { a <- x[[name]] if (!is.null(a)) return(a) a <- x[[name, exact=FALSE]] if (!is.null(a)) warning("Name partially matched in data frame") return(a) } I recall thinking at the time that this involved a pretty big performance hit, compared to letting the primitive $ operator do it, just to produce a warning. But it wasn't until now that I noticed this NEWS in R-3.1.1: * The warning when using partial matching with the $ operator on data frames is now only given when options("warnPartialMatchDollar") is TRUE. for which the code was changed to: `$.data.frame` <- function(x,name) { a <- x[[name]] if (!is.null(a)) return(a) a <- x[[name, exact=FALSE]] if (!is.null(a) && getOption("warnPartialMatchDollar", default=FALSE)) { names <- names(x) warning(gettextf("Partial match of '%s' to '%s' in data frame", name, names[pmatch(name, names)])) } return(a) } One can see the effect now when warnPartialMatchDollar is enabled: > options(warnPartialMatchDollar=TRUE) > Lc$xy [1] 11 12 13 14 15 16 17 18 19 Warning message: In Lc$xy : partial match of 'xy' to 'xyz' > df$xy [1] 11 12 13 14 15 16 17 18 19 Warning message: In `$.data.frame`(df, xy) : Partial match of 'xy' to 'xyz' in data frame So the only thing that slowing down acesses like df$xyz by a factor of seven achieves now is to add the words "in data frame" to the warning message (while making the earlier part of the message less intelligible). I think you might want to just delete the definition of $.data.frame, reverting to the situation before R-3.1.0. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Inefficiency in df$col
> > I think you might want to just delete the definition of $.data.frame, > > reverting to the situation before R-3.1.0. > > I imagine the cause is that the list version is done in C code rather > than R code (i.e. there's no R function `$.list`). So an alternative > solution would be to also implement `$.data.frame` in the underlying C > code. This won't be quite as fast (it needs that test for NULL), but > should be close in the full match case. I maybe wasn't completely clear. The $ operator for data frames was previously done in C - since it was done by the same primitive as for lists. In R-3.1.0, this was changed - producing a massive slowdown - for the purpose of giving a warning on partial matches even if the user had not set the warnPartialMatchDollar option to TRUE. In R-3.1.1, this was changed to not warn unless warnPartialMatchDollar was TRUE which was the PREVIOUS behaviour. In other words, this change reverted the change made in R-3.1.0. But instead of simply deleting the definition of $.data.frame, R-3.1.1 added extra code to it, the only effect of which is to slightly change the wording of the warning message from what is produced for any other list, while still retaining the massive slowdown. There is no need for you to write $.data.frame in C. You just need to delete the version written in R. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Bug: time complexity of substring is quadratic
> From: Tomas Kalibera > > Thanks for the report, I am working on a patch that will address this. > > I confirm there is a lot of potential for speedup. On my system, > > 'N=20; x <- substring(paste(rep("A", N), collapse=""), 1:N, 1:N)' > > spends 96% time in checking if the string is ascii and 3% in strlen(); > if we take advantage of the pre-computed value in the ASCII bit, the > speed up is about 40x. The latest version of pqR (at pqR-project.org) has changes that considerably speed up both this and other string operations. Here's a test (both compiled with gcc 8.2.0 with -O3 on a Skylake processor). R-3.5.2: > N=20; system.time(for (i in 1:100) r<-paste(rep("A",N),collapse="")) user system elapsed 1.548 0.023 1.572 > system.time(for (i in 1:10) x<-substring(r,1:N,1:N)) user system elapsed 4.462 0.016 4.478 pqR-2019-02-19: > N=20; system.time(for (i in 1:100) r<-paste(rep("A",N),collapse="")) user system elapsed 0.318 0.071 0.388 > system.time(for (i in 1:10) x<-substring(r,1:N,1:N)) user system elapsed 0.041 0.000 0.041 Some of this may be due to pqR's faster garbage collector - R Core implementatons have a particular GC problem with strings, as explained at https://radfordneal.wordpress.com/2018/11/29/faster-garbage-collection-in-pqr/ But there are also some specific improvements to string operations that you might want to have a look at. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Speed improvement to evalList
I've been inspired to look at the R source code by some strange timing results that I wrote about on my blog at radfordneal.wordpress.com (see the posts on "Speeding up parentheses..." and "Two surprising things...". I discovered that the strange speed advantage of curly brackets over parentheses is partially explained by an inefficiency in the evalList and evalListKeepMissing procedures in eval.c, in directory src/main, which are on the critical path for many operations. These procedures unnecessarily allocate an extra CONS node. I rewrote them to avoid this, which seems to speed up a typical program by about 5% (assuming it doesn't spend most of its time in things like matrix multiplies). I think it would be well worthwhile to put this minor change into the next R release. I'll be looking at some other places where R can also be sped up, and expect that an average improvement of maybe 15% is possible, with some programs probably speeding up by a factor of two. For now, though, I'll just give the revised versions of evalList and evalListKeepMissing, below. Radford Neal - /* Used in eval and applyMethod (object.c) for builtin primitives, do_internal (names.c) for builtin .Internals and in evalArgs. 'n' is the number of arguments already evaluated and hence not passed to evalArgs and hence to here. */ SEXP attribute_hidden evalList(SEXP el, SEXP rho, SEXP call, int n) { SEXP head, tail, ev, h; int mode; /* mode==0 is 0 args, mode==1 is 1 arg, mode==2 is >1 arg */ head = R_NilValue; mode = 0; while (el != R_NilValue) { n++; if (CAR(el) == R_DotsSymbol) { /* If we have a ... symbol, we look to see what it is bound to. * If its binding is Null (i.e. zero length) * we just ignore it and return the cdr with all its expressions evaluated; * if it is bound to a ... list of promises, * we force all the promises and then splice * the list of resulting values into the return value. * Anything else bound to a ... symbol is an error */ h = findVar(CAR(el), rho); if (TYPEOF(h) == DOTSXP || h == R_NilValue) { while (h != R_NilValue) { if (mode==1) { PROTECT(head); mode = 2; } ev = CONS(eval(CAR(h), rho), R_NilValue); COPY_TAG(ev, h); if (mode==0) { head = ev; mode = 1; } else { SETCDR(tail, ev); } tail = ev; h = CDR(h); } } else if (h != R_MissingArg) error(_("'...' used in an incorrect context")); } else if (CAR(el) == R_MissingArg) { /* It was an empty element: most likely get here from evalArgs which may have been called on part of the args. */ errorcall(call, _("argument %d is empty"), n); } else if (isSymbol(CAR(el)) && R_isMissing(CAR(el), rho)) { /* It was missing */ errorcall(call, _("'%s' is missing"), CHAR(PRINTNAME(CAR(el; } else { if (mode==1) { PROTECT(head); mode = 2; } ev = CONS(eval(CAR(el), rho), R_NilValue); COPY_TAG(ev, el); if (mode==0) { head = ev; mode = 1; } else { SETCDR(tail, ev); } tail = ev; } el = CDR(el); } if (mode==2) UNPROTECT(1); return head; } /* evalList() */ /* A slight variation of evaluating each expression in "el" in "rho". */ /* used in evalArgs, arithmetic.c, seq.c */ SEXP attribute_hidden evalListKeepMissing(SEXP el, SEXP rho) { SEXP head, tail, ev, h; int mode; /* mode==0 is 0 args, mode==1 is 1 arg, mode==2 is >1 arg */ head = R_NilValue; mode = 0; while (el != R_NilValue) { /* If we have a ... symbol, we look to see what it is bound to. * If its binding is Null (i.e. zero length) * we just ignore it and return the cdr with all its expressions evaluated; * if it is bound to a ... list of promises, * we force all the promises and then splice * the list of resulting values into the return value. * Anything else bound to a ... symbol is an error */ if (CAR(el) == R_DotsSymbol) { h = findVar(CAR(el), rho); if (TYPEOF(h) == DOTSXP
[Rd] Speed improvement to PROTECT, UNPROTECT, etc.
As I mentioned in my previous message about speeding up evalList, I've been looking at ways to speed up the R interpreter. One sees in the code many, many calls of PROTECT, UNPROTECT, and related functions, so that seems like an obvious target for optimization. Indeed, I've found that one can speed up the interpreter by about 10% by just changing these. The functions are actually macros defined in Rinternals.h, but end up just calling functions defined in memory.c (apparently as protect, etc., but really as Rf_protect, etc.). So there is function call overhead every time they are used. To get rid of the function call overhead, without generating lots of extra code, one can redefine the macros to handled to the common case inline, and call the function in memory.c for the uncommon cases (eg, error on stack underflow). Here are my versions that do this: #define PROTECT(s) do { \ SEXP tmp_prot_sexp = (s); \ if (R_PPStackTop < R_PPStackSize) \ R_PPStack[R_PPStackTop++] = tmp_prot_sexp; \ else \ Rf_protect(tmp_prot_sexp); \ } while (0) #define UNPROTECT(n) (R_PPStackTop >= (n) ? \ (void) (R_PPStackTop -= (n)) : Rf_unprotect(n)) #define PROTECT_WITH_INDEX(x,i) do { \ PROTECT(x); \ *(i) = R_PPStackTop - 1; \ } while (0) #define REPROTECT(x,i) ( (void) (R_PPStack[i] = x) ) Unfortunately, one can't just change the definitions in Rinternals.h. Some uses of PROTECT are in places where R_PPStack, etc. aren't visible. Instead, one can redefine them at the end of Defn.h, where these variables are declared. That alone also doesn't work, however, because some references don't work at link time. So instead I redefine them in Defn.h only if USE_FAST_PROTECT_MACROS is defined. I define this before including Defn.h in all the .c files in the main directory. Another complication is that the PROTECT macro no longer returns its argument. One can avoid this by writing it another way, but this then results in its argument being referenced in two places (though only evaluated once), which seems to slow things down, presumably because the larger amount of code generated affects cache behaviour. Instead, I just changed the relatively few occurrences of v = PROTECT(...) to PROTECT(v = ...), which is the dominant idiom in the code anyway. (In some places slightly more change is needed, as when this is in an initializer.) This works fine, speeding up R programs that aren't dominated by large operations like multiplying big matrices by about 10%. The effect is cumulative with the change I mentioned in my previous message about avoiding extra CONS operations in evalList, for a total speedup of about 15%. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Speed improvement to evalList
Regarding my suggesting speed improvement to evalList, Martin Morgan has commented by email to me that at one point an object is left unprotected when COPY_TAG is called, and has wondered whether that is safe. I think it is safe, but the code can be changed to protect this as well, which actually simplifies things, and could be more robust to changes to the garbage collector. The cost is that sometimes there is one more call of PROTECT and UNPROTECT, but with the speed improvement to these that I just posted, this is a minor issue. Martin has also pointed me to where you can get R sources via subversion, but while I figure that out, and how to post up "diffs" for changes, I'll put the revised evalList code below for anyone interested... Radford Neal -- /* Used in eval and applyMethod (object.c) for builtin primitives, do_internal (names.c) for builtin .Internals and in evalArgs. 'n' is the number of arguments already evaluated and hence not passed to evalArgs and hence to here. */ SEXP attribute_hidden evalList(SEXP el, SEXP rho, SEXP call, int n) { SEXP head, tail, ev, h; head = R_NilValue; while (el != R_NilValue) { n++; if (CAR(el) == R_DotsSymbol) { /* If we have a ... symbol, we look to see what it is bound to. * If its binding is Null (i.e. zero length), * we just ignore it and return the cdr with all its expressions * evaluated. * If it is bound to a ... list of promises, * we force all the promises and then splice * the list of resulting values into the return value. * Anything else bound to a ... symbol is an error. */ h = findVar(CAR(el), rho); if (TYPEOF(h) == DOTSXP || h == R_NilValue) { while (h != R_NilValue) { ev = CONS(eval(CAR(h), rho), R_NilValue); if (head==R_NilValue) PROTECT(head = ev); else SETCDR(tail, ev); COPY_TAG(ev, h); tail = ev; h = CDR(h); } } else if (h != R_MissingArg) error(_("'...' used in an incorrect context")); } else if (CAR(el) == R_MissingArg) { /* It was an empty element: most likely get here from evalArgs which may have been called on part of the args. */ errorcall(call, _("argument %d is empty"), n); } else if (isSymbol(CAR(el)) && R_isMissing(CAR(el), rho)) { /* It was missing */ errorcall(call, _("'%s' is missing"), CHAR(PRINTNAME(CAR(el; } else { ev = CONS(eval(CAR(el), rho), R_NilValue); if (head==R_NilValue) PROTECT(head = ev); else SETCDR(tail, ev); COPY_TAG(ev, el); tail = ev; } el = CDR(el); } if (head!=R_NilValue) UNPROTECT(1); return head; } /* evalList() */ /* A slight variation of evaluating each expression in "el" in "rho". */ /* used in evalArgs, arithmetic.c, seq.c */ SEXP attribute_hidden evalListKeepMissing(SEXP el, SEXP rho) { SEXP head, tail, ev, h; head = R_NilValue; while (el != R_NilValue) { /* If we have a ... symbol, we look to see what it is bound to. * If its binding is Null (i.e. zero length) * we just ignore it and return the cdr with all its expressions evaluated; * if it is bound to a ... list of promises, * we force all the promises and then splice * the list of resulting values into the return value. * Anything else bound to a ... symbol is an error */ if (CAR(el) == R_DotsSymbol) { h = findVar(CAR(el), rho); if (TYPEOF(h) == DOTSXP || h == R_NilValue) { while (h != R_NilValue) { if (CAR(h) == R_MissingArg) ev = CONS(R_MissingArg, R_NilValue); else ev = CONS(eval(CAR(h), rho), R_NilValue); if (head==R_NilValue) PROTECT(head = ev); else SETCDR(tail, ev); COPY_TAG(ev, h); tail = ev; h = CDR(h); } } else if(h != R_MissingArg) error(_("'...' used in an incorrect context")); } else { if (CAR(el) == R_MissingArg || (isSymbol(CAR(el)) && R_isMissing(CAR(el), rho))) ev = CONS(R_Mi
[Rd] Speeding up sum and prod
Looking for more ways to speed up R, I've found that large improvements are possible in the speed of "sum" and "prod" for long real vectors. Here is a little test with R version 2.11.1 on an Intel Linux system > a <- seq(0,1,length=1000) > system.time({for (i in 1:100) b <- sum(a)}) user system elapsed 4.800 0.010 4.817 > system.time({for (i in 1:100) b <- sum(a,na.rm=TRUE)}) user system elapsed 8.240 0.030 8.269 and here is the same with "sum" and "prod" modified as described below: > a <- seq(0,1,length=1000) > system.time({for (i in 1:100) b <- sum(a)}) user system elapsed 1.810.001.81 > system.time({for (i in 1:100) b <- sum(a,na.rm=TRUE)}) user system elapsed 7.250 0.010 7.259 That's an improvement by a factor of 2.65 for real vectors of length 1000 with na.rm=FALSE (the default), and an improvement of 12% when na.rm=TRUE. Of course, the improvement is smaller for very short vectors. The biggest reason for the improvement is that the current code (in 2.11.1 and in the development release of 2010-08-19) makes a costly call of ISNAN even when the option is na.rm=FALSE. The inner loop can also be sped up a bit in other respects. Here is the old procedure, in src/main/summary.c: static Rboolean rsum(double *x, int n, double *value, Rboolean narm) { LDOUBLE s = 0.0; int i; Rboolean updated = FALSE; for (i = 0; i < n; i++) { if (!ISNAN(x[i]) || !narm) { if(!updated) updated = TRUE; s += x[i]; } } *value = s; return(updated); } and here is my modified version: static Rboolean rsum(double *x, int n, double *value, Rboolean narm) { LDOUBLE s = 0.0; int i; Rboolean updated = FALSE; if (narm) { for (i = 0; i < n; i++) { if (!ISNAN(x[i])) { s += x[i]; updated = TRUE; break; } } for (i = i+1; i < n; i++) { if (!ISNAN(x[i])) s += x[i]; } } else { for (i = 0; i < n; i++) s += x[i]; if (n>0) updated = TRUE; } *value = s; return(updated); } An entirely analogous improvement can be made to the "prod" function. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Speeding up matrix multiplies
I've looked at the code for matrix multiplies in R, and found that it can speeded up quite a bit, for multiplies that are really vector dot products, and for other multiplies in which the result matrix is small. Here's my test program: u <- seq(0,1,length=1000) v <- seq(0,2,length=1000) A2 <- matrix(2.1,2,1000) A5 <- matrix(2.1,5,1000) B3 <- matrix(3.2,1000,3) A4 <- matrix(2.1,4,1000) B4 <- matrix(3.2,1000,4) print(system.time ({for (i in 1:10) w <- u %*% v})) print(system.time ({for (i in 1:10) w <- A5 %*% v})) print(system.time ({for (i in 1:10) R <- A2 %*% B3})) print(system.time ({for (i in 1:10) R <- A5 %*% B3})) print(system.time ({for (i in 1:10) R <- A4 %*% B4})) Here are the five system.time results above using R 2.11.1: user system elapsed 1.680 0.000 1.683 user system elapsed 4.350 0.000 4.349 user system elapsed 4.820 0.000 4.818 user system elapsed 7.370 0.020 7.383 user system elapsed 7.990 0.000 7.991 and here are the results with my modified version of R 2.11.1: user system elapsed 0.270 0.000 0.271 user system elapsed 1.290 0.000 1.287 user system elapsed 1.080 0.000 1.086 user system elapsed 3.640 0.000 3.642 user system elapsed 7.950 0.000 7.953 The main issue here is again ISNAN. The R code doesn't trust the Fortran routine it calls to handle NA and NaN correctly, so it checks whether the arguments have any NAs or NaNs, and if so does the matrix multiply itself, with simple nested loops. For multiplies in which the number of elements in the result matrix is large, this check will take a negligible amount of time. But otherwise it can drastically slow things down. This is true in particular for dot products, of a 1xN times an Nx1 matrix, giving a 1x1 result. I changed the code to handle dot product specially, with no need for ISNAN checks, and otherwise to go straight to the nested loop code (bypassing Fortran and the ISNAN checks) if the result matrix has no more than 15 elements. One can see that the times are much improved above, except of course for the multiply that produces a 4x4 result, which takes the same amount of time as before, since nothing is different for that. From the results, it seems that a threshold a bit above 15 might be even better, though this will of course depend on the architecture, compiler, etc. Simon Urbaneck points out in reply to my message about speeding up "sum" and "prod" that the effects of changes like this depend on the machine architecure, compiler, etc. This is of course true. The magnitude of variation in his results (not the absolute times, but the relative times of different versions) is rather larger than one would expect, though. Perhaps his time measurements were affected by random factors like system load, or perhaps they are due to non-random but essentially arbitrary factors like the exact alignment of code generated by the C compiler, which could change with even unrelated modifications to the program. The changes that I made seem generally desirable for any architecture, or will at least make things no worse (the code expansion is trivial), apart from the possibility of such arbitrary factors. I'm not doing anything like manually unrolling loops, or replacing array indexing with pointer arithmetic, which might be seen as undesirable attempts at out-smarting the optimizer. The times above are for an Intel Linux machine with gcc version below: Using built-in specs. Target: i486-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu Thread model: posix gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4) I didn't change any R configuration options regarding compilation. Below is my modified version of the matprod function in src/main/array.c. Radford Neal static void matprod(double *x, int nrx, int ncx, double *y, int nry, int ncy, double *z) { char *transa = "N", *transb = "N"; int i, j, k; double one = 1.0, zero = 0.0; LDOUBLE sum; Rboolean do_it_here; /* NOTE: ncx must be equal to nry. */ if (nrx==1 && ncy==1) { /* Do dot product quickly. */ sum = 0.0; for (i = 0; i 0 && ncx > 0 && nry > 0 && ncy > 0) { /* Faster to just do it here if the result is small, especi
Re: [Rd] Speeding up matrix multiplies
Regarding my previous message on speeding up matrix multiplies, I've realized that the size of the result matrix is not really the right criterion for deciding to do the computation without using the Fortran routine. A better criterion would be based on the ratio of the time for the matrix multiply to the time for the ISNAN check. So I've replaced my previous check by /* Faster to just do it here if the time to do the matrix multiply is not much greater than the time for the NA/NaN check below. */ do_it_here = nrx*ncy <= 5*(nrx+ncy); In particular, all matrix x vector and vector x matrix products will in this version be done in the matprod routine, not the Fortran routine. And this is the right thing to do, since the time for the ISNAN check before calling the Fortan routine will be comparable to the time for the matrix multiply. So avoiding it will be desirable unless the Fortran routine is really, really faster than the C code in matprod. Of course, the real solution is to change the Fortran routine (which seems to be in src/extra/blas/blas.f, and isn't all that complicated) to handle NAs and NaNs as desired. Or to write a carefully optimized C matrix multiply routine that does the right thing. For this, knowing what the problem is supposed to be would help. My quick attempt at using the bug reporting system didn't manage to find the PR#4582 that is mentioned in the comment in the code, but I don't really know how the bug report system is supposed to work. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Speeding up matrix multiplies
> On Aug 23, 2010, at 7:39 PM, Radford Neal wrote: > > In particular, all matrix x vector and vector x matrix products will > > in this version be done in the matprod routine, not the Fortran routine. > > And this is the right thing to do, since the time for the ISNAN check > > before calling the Fortan routine will be comparable to the time for > > the matrix multiply. So avoiding it will be desirable unless the Fortran > > routine is really, really faster than the C code in matprod. > > It is, many times in fact, if you use threaded BLAS on a multi-core machine > and large matrices. Well, it can't get any faster than taking zero time. And even in that case, using the C code in matprod will slow the program down (compared to the existing version of R) by only about a factor of two. > > Of course, the real solution is to change the Fortran routine (which > > seems to be in src/extra/blas/blas.f, and isn't all that complicated) > > Nope - it can be external and BLAS standard doesn't handle NAs. OK. I see the problem if people can link in their own version of BLAS. I wonder if there is any way of telling whether they did that? Presumably many people will use the BLAS routine supplied with R, which could be made safe for NAs. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Correction to section 1.1.2 of R Internals doc, on NAMED
I think the explanation of the NAMED field in the R Internals document is incorrect. In Section 1.1.2, it says: The named field is set and accessed by the SET_NAMED and NAMED macros, and take values 0, 1 and 2. R has a `call by value' illusion, so an assignment like b <- a appears to make a copy of a and refer to it as b. However, if neither a nor b are subsequently altered there is no need to copy. What really happens is that a new symbol b is bound to the same value as a and the named field on the value object is set (in this case to 2). When an object is about to be altered, the named field is consulted. A value of 2 means that the object must be duplicated before being changed. (Note that this does not say that it is necessary to duplicate, only that it should be duplicated whether necessary or not.) A value of 0 means that it is known that no other SEXP shares data with this object, and so it may safely be altered. A value of 1 is used for situations like dim(a) <- c(7, 2) where in principle two copies of a exist for the duration of the computation as (in principle) a <- `dim<-`(a, c(7, 2)) but for no longer, and so some primitive functions can be optimized to avoid a copy in this case. The implication of this somewhat confusing explanation is that values of variables may have NAMED of 0, and that NAMED will be 1 only briefly, during a few operations like dim(a) <- c(7,2). But from my reading of the R source, this is wrong. It seems to me that NAMED will quite often be 1 for extended periods of time. For instance, after a <- c(7,2), the value stored in a will have NAMED of 1. If at this point a[2] <- 0 is executed, no copy is made, because NAMED is 1. If b <- a is then executed, the same value will be in both a and b, and to reflect this, NAMED is incremented to 2. If a[2] <- 0 is executed at this point, a copy is made, since NAMED is 2. Essentially, NAMED is a count of how many variables reference a value, except it's not necessarily accurate. First, once NAMED reaches 2, it doesn't get incremented any higher. Second, no attempt is made to decrement NAMED when a variable ceases to refer to a value. So the end result is that a copy needs to be made when changing a variable whose value has NAMED of 2, since it's possible that some other variable references the same value. There seems to be some confusion in the R source on this. In the do_for procedure, the value for the for loop variable is set up with NAMED being 0, though according to my explanation above, it ought to be set up with NAMED of 1. A bug is avoided here only because the procedures for getting values from variables check if NAMED is 0, and if so fix it up to being 1, which is the minimum that it ought to be for a value that's stored in a variable. Is my understanding of this correct? Or have I missed something? Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Speeding up transpose
I've looked at how to speed up the transpose function in R (ie, t(X)). The existing code does the work with loops like the following: for (i = 0; i < len; i++) REAL(r)[i] = REAL(a)[(i / ncol) + (i % ncol) * nrow]; It seems a bit optimistic to expect a compiler to produce good code from this. I've re-written these loops as follows: for (i = 0, j = 0; i=len) j -= (len-1); REAL(r)[i] = REAL(a)[j]; } The resulting improvement is sometimes dramatic. Here's a test program: M <- matrix(seq(0,1,12000),200,60) print(system.time({for (i in 1:1) S <- t(M)})) print(system.time({for (i in 1:1) R <- t(S)})) v <- seq(0,2,12000) print(system.time({for (i in 1:10) u <- t(v)})) print(system.time({for (i in 1:10) w <- t(u)})) Here are the times on an Intel Linux system: R version 2.11.1:Modified version: user system elapsed user system elapsed 1.190 0.040 1.232 0.610 0.010 0.619 user system elapsed user system elapsed 1.200 0.020 1.226 0.610 0.000 0.616 user system elapsed user system elapsed 0.800 0.010 0.813 0.750 0.000 0.752 user system elapsed user system elapsed 0.910 0.010 0.921 0.860 0.000 0.864 Here are the times on a SPARC Solaris system: R version 2.11.1:Modified version: user system elapsed user system elapsed 18.643 0.041 18.685 2.994 0.039 3.033 user system elapsed user system elapsed 18.574 0.041 18.616 3.123 0.039 3.163 user system elapsed user system elapsed 3.803 0.271 4.075 3.868 0.296 4.163 user system elapsed user system elapsed 4.184 0.273 4.457 4.238 0.302 4.540 So with the modification, transpose for a 60x200 or 200x60 matrix is about a factor of two faster on the Intel system, and a factor of six faster on the SPARC system. There is little or no gain from the modification when transposing a row or column vector, however. (I think it must be that on these machines multiplies and divides do not take constant time, but are faster when, for instance, dividing by 1.) I've appended below the new version of the modified part of the do_transpose function in src/main/array.c. Radford Neal -- PROTECT(r = allocVector(TYPEOF(a), len)); switch (TYPEOF(a)) { case LGLSXP: case INTSXP: for (i = 0, j = 0; i=len) j -= (len-1); INTEGER(r)[i] = INTEGER(a)[j]; } case REALSXP: for (i = 0, j = 0; i=len) j -= (len-1); REAL(r)[i] = REAL(a)[j]; } break; case CPLXSXP: for (i = 0, j = 0; i=len) j -= (len-1); COMPLEX(r)[i] = COMPLEX(a)[j]; } break; case STRSXP: for (i = 0, j = 0; i=len) j -= (len-1); SET_STRING_ELT(r, i, STRING_ELT(a,j)); } break; case VECSXP: for (i = 0, j = 0; i=len) j -= (len-1); SET_VECTOR_ELT(r, i, VECTOR_ELT(a,j)); } break; case RAWSXP: for (i = 0, j = 0; i=len) j -= (len-1); RAW(r)[i] = RAW(a)[j]; } break; default: UNPROTECT(1); goto not_matrix; } __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Speeding up transpose
> I've appended below the new version of the modified part of the > do_transpose function in src/main/array.c. A quick correction... A "break" went missing from the case INTSXP: section of the code I posted. Corrected version below. Radford Neal -- PROTECT(r = allocVector(TYPEOF(a), len)); switch (TYPEOF(a)) { case LGLSXP: case INTSXP: for (i = 0, j = 0; i=len) j -= (len-1); INTEGER(r)[i] = INTEGER(a)[j]; } break; case REALSXP: for (i = 0, j = 0; i=len) j -= (len-1); REAL(r)[i] = REAL(a)[j]; } break; case CPLXSXP: for (i = 0, j = 0; i=len) j -= (len-1); COMPLEX(r)[i] = COMPLEX(a)[j]; } break; case STRSXP: for (i = 0, j = 0; i=len) j -= (len-1); SET_STRING_ELT(r, i, STRING_ELT(a,j)); } break; case VECSXP: for (i = 0, j = 0; i=len) j -= (len-1); SET_VECTOR_ELT(r, i, VECTOR_ELT(a,j)); } break; case RAWSXP: for (i = 0, j = 0; i=len) j -= (len-1); RAW(r)[i] = RAW(a)[j]; } break; default: UNPROTECT(1); goto not_matrix; } __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Pointer to fourteen patches to speed up R
I've continued to work on speeding up R, and now have a collection of fourteen patches, some of which speed up particular functions, and some of which reduce general interpretive overhead. The total speed improvement from these patches is substantial. It varies a lot from one R program to the next, of course, and probably from one machine to the next, but speedups of 25% can be expected in many programs, and sometimes much more (though sometimes less as well). The fourteen patches work for revision r52822 of the development version of R (I haven't check against any changes in the last few days), and also for release 2.11.1. I also wrote a number of timing test programs. I tried posting this message with the patches and test programs attached, but it got held for moderation because it's over 128K. I'll leave it uncancelled, since maybe it's good to archive them here, but in the interim, you can get both the patches and the test programs from http://www.cs.toronto.edu/~radford/R-mods.html I've included below the documentation on what each patch does, which is also in "doc" in speed-patches.tar. Note that I fixed a few minor bugs along the way. There looks to be scope for more improvements in various parts of the R interpreter that I didn't get to. I'll have to put this on hold for now, however, to spend my time preparing for the coming teaching term. I'd be happy to hear of any comments on these patches, though, including information on how much they speed up typical programs, on various machines. Radford Neal --- These patches to the R source for improving speed were written by Radford M. Neal, Sept. 2010. See the README file for how to install them. Below, I describe these patches (in alphabetical order), indicate what improvement they produce, and also mention any potential issues with using the patch, and bugs that the patches incidently fix. The timing improvements discussed below are what is obtained by applying each patch individually, on an Intel system running Ubuntu Linux with Gcc version 4.2.4. The total improvement from all patches is much bigger, though in a few instances a patch can diminish the effect of another patch, by reducing the magnitude of the inefficiencies that the other patch eliminates. Note though, that the percentage improvement for a given absolute improvement gets bigger as when other patches reduce overall time. For r52822, the total time for all tests in the accompanying speed test suite is 674 seconds. This is reduced to 487 seconds with all patches applied, a reduction of 28%. Particular R programs will, of course, see widely varying reductions depending on what operations they mostly do. patch-dollar Speeds up access to lists, pairlists, and environments using the $ operator. The speedup comes mainly from avoiding the overhead of calling DispatchOrEval if there are no complexities, from passing on the field to extract as a symbol, or a name, or both, as available, and then converting only as necessary, from simplifying and inlining the pstrmatch procedure, and from not translating string multiple times. Relevant timing test script: test-dollar.r This test shows about a 40% decrease in the time needed to extract elements of lists and environments. Changes unrelated to speed improvement: A small error-reporting bug is fixed, illustrated by the following output with r52822: > options(warnPartialMatchDollar=TRUE) > pl <- pairlist(abc=1,def=2) > pl$ab [1] 1 Warning message: In pl$ab : partial match of 'ab' to '' Some code is changed at the end of R_subset3_dflt because it seems to be more correct, as discussed in code comments. patch-evalList Speeds up a large number of operations by avoiding allocation of an extra CONS cell in the procedures for evaluating argument lists. Relevant timing test scripts: all of them, but will look at test-em.r On test-em.r, the speedup from this patch is about 5%. patch-fast-base Speeds up lookup of symbols defined in the base environment, by flagging symbols that have a base environment definition recorded in the global cache. This allows the definition to be retrieved quickly without looking in the hash table. Relevant timing test scripts: all of them, but will look at test-em.r On test-em.r, the speedup from this patch is about 3%. Issue: This patch uses the "spare" bit for the flag. This bit is misnamed, since it is already used elsewhere (for closures). It is possible that one of the "gp" bits should be used instead. The "gp" bits should really be divided up for faster access, and so that their present use is apparent in the code. In case this us
[Rd] Correction to vec-subset speed patch
I found a bug in one of the fourteen speed patches I posted, namely in patch-vec-subset. I've fixed this (I now see one does need to duplicate index vectors sometimes, though one can avoid it most of the time). I also split this patch in two, since it really has two different and independent parts. The patch-vec-subset patch now has only some straightforward (locally-checkable) speedups for copies. The new patch-subscript patch has the speedups for creation of index vectors, which is where the bug was, and which generally have more global interactions. I made some other changes in the patch-subscript part along with fixing the bug. I've attached the new versions of the patches. Here is the documentation for the two revised patches: patch-vec-subset Speeds up extraction of subsets of vectors or matrices (eg, v[10:20] or M[1:10,101:110]). This is done with detailed code improvements. Relevant test script: test-vec-subset.r There are lots of tests in this script. The most dramatic improvement is for extracting many rows and columns of a large array, where the improvement is by about a factor of four. Extracting many rows from one column of a matrix is sped up by about 30%. Changes unrelated to speed improvement: Fixes two latent bugs where the code incorrectly refers to NA_LOGICAL when NA_INTEGER is appropriate and where LOGICAL and INTEGER types are treated as interchangeable. These cause no problems at the moment, but would if representations were changed. patch-subscript (Formerly part of patch-vec-subset) This patch also speeds up extraction, and also replacement, of subsets of vectors or matrices, but focuses on the creation of the indexes rather than the copy operations. Often avoids a duplication (see below) and eliminates a second scan of the subscript vector for zero subscripts, folding it into a previous scan at no additional cost. Relevant test script: test-vec-subset.r Speeds up some operations with scalar or short vector indexes by about 10%. Speeds up subscripting with a longer vector of positive indexes by about 20%. Issues: The current code duplicates a vector of indexes when it seems unnecessary. Duplication is for two reasons: to handle the situation where the index vector is itself being modified in a replace operation, and so that any attributes can be removed, which is helpful only for string subscripts, given how the routine to handle them returns information via an attribute. Duplication for the second reasons can easily be avoided, so I avoided it. The first reason for duplication is sometimes valid, but can usually be avoided by first only doing it if the subscript is to be used for replacement rather than extraction, and second only doing it if the NAMED field for the subscript isn't zero. I also removed two layers of procedure call overhead (passing seven arguments, so not trivial) that seemed to be doing nothing. Probably it used to do something, but no longer does, but if instead it is preparation for some future use, then removing it might be a mistake. Index: src/main/subset.c === --- src/main/subset.c (revision 52822) +++ src/main/subset.c (working copy) @@ -59,73 +59,77 @@ if (x == R_NilValue) return x; -for (i = 0; i < n; i++) { - ii = INTEGER(indx)[i]; - if (ii != NA_INTEGER) - ii--; - switch (mode) { - case LGLSXP: - if (0 <= ii && ii < nx && ii != NA_LOGICAL) - LOGICAL(result)[i] = LOGICAL(x)[ii]; - else - LOGICAL(result)[i] = NA_INTEGER; - break; - case INTSXP: - if (0 <= ii && ii < nx && ii != NA_INTEGER) - INTEGER(result)[i] = INTEGER(x)[ii]; - else - INTEGER(result)[i] = NA_INTEGER; - break; - case REALSXP: - if (0 <= ii && ii < nx && ii != NA_INTEGER) - REAL(result)[i] = REAL(x)[ii]; - else - REAL(result)[i] = NA_REAL; - break; - case CPLXSXP: - if (0 <= ii && ii < nx && ii != NA_INTEGER) { - COMPLEX(result)[i] = COMPLEX(x)[ii]; - } - else { - COMPLEX(result)[i].r = NA_REAL; - COMPLEX(result)[i].i = NA_REAL; - } - break; - case STRSXP: - if (0 <= ii && ii < nx && ii != NA_INTEGER) - SET_STRING_ELT(result, i, STRING_ELT(x, ii)); - else - SET_STRING_ELT(result, i, NA_STRING); - break; - case VECSXP: - case EXPRSXP: - if (0 <= ii && ii < nx && ii != NA_INTEGER) - SET_VECTOR_ELT(result, i, VECTOR_ELT(x, ii)); - else - SET_VECTOR_ELT(result, i, R_NilValue); - br
[Rd] Speeding up squaring of vectors
I see that some of the speed patches that I posted have been incorporated into the current development version (eg, my patch-for, patch-evalList, and patch-vec-arith). My patch for speeding up x^2 has been addressed in an inadvisable way, however. This was a simple addition of four lines of code that speeds up squaring of real vectors by a factor of about six (for vectors of length 1), by just converting x^2 to x*x. Modifications in the current development version (r52936 and r52937) attempt to address this issue in a different way, but they produce a smaller speedup. My modification is about 2.6 faster than the current development version (on an Intel/Linux system). Similarly, in the current development version, x*x is still about 2.6 times faster than x^2. Furthermore, the modification in the current development version slows down exponentiation by 0.5 (square roots) by about 4%. I think there's no reason not to just use my patch. One could also put in a similar modification to speed up squaring of integer vectors, but I think that is a much less common operation than squaring of real vectors, which arises all the time when computing squared residuals, squared Euclidean distances, etc. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Fourteen patches to speed up R
Luke - Thanks for your comments on the speed patches I submitted. I'm glad you like patch-transpose, patch-for, patch-evalList, and patch-vec-arith. I'll be interested to hear what you or other people think about patch-dollar and patch-vec-subset after you've had more time to consider them. (Recall I later posted a split of patch-vecsubset into patch-vec-subset and a new patch-subscript, fixing in patch-subscript a bug in my original combined patch.) Regarding patch-square and patch-sum-prod, you make different changes that address the largest inefficiencies, but I'd like to do some runs comparing your version with mine to see how big the remaining differences are. This is complicated by the fact that your changes to sum and prod seem to encounter a bug in the C compiler I'm using (gcc 4.2.4 for i486-linux-gnu) at optimization level -O2 (the default for the R configuration), the effect of which is for sum to deliver the right answer, but just as slowly as before. This doesn't happen with -O3. I'll investigate this further and report the conclusion. Similarly, I'll do some more timing tests regarding patch-protect, patch-fast-base, patch-fast-spec, and patch-save-alloc, and then comment further on the gains that they produce. Regarding patch-matprod, the issue of what BLAS routines do with NaN and NA seems like it's one that needs to be resolved, preferably in a way that doesn't slow down vector dot products by a factor of six. However, I don't know what actual problem reports motivated the current costly check for NAs. This all interacts with the extreme slowness on some machines of arithmetic on LDOUBLEs, which also seems like it needs some resolution. It's not clear to me what the expectations regarding accuracy of functions like sum should be. One could certainly argue that users would expect the same accuracy as adding them up with "+", and no huge slowdown from trying to get better accuracy. But maybe there's some history here, or packages that depend on increased accuracy (though of course there's no guarantee that a C "long double" will actually be bigger than a "double".) Regarding patch-parens, I don't understand your reluctance to incorporate this two-line code change. According to my timing tests (medians of five runs), it speeds up the test-parens.r script by 4.5%. (Recall this is "for (i in 1:n) d <- (a+1)/(a*(b+c))".) This is not a huge amount, but of course the speedup for the actual parenthesis operator is greater, since there is other overhead in this test. It would be even better to make all BUILTIN operators faster (which my patch-evalList does), but there are limits to how much is possible. The fact that "(" is conceptually a BUILTIN rather than a SPECIAL doesn't seem relevant to me. "{" is also conceptually a BUILTIN. Both are "special" from the point of view of the users, few of whom will even imagine that one could call "(" and "{" as ordinary functions. Even if they can imagine this, it makes no difference, since the patch has no user-visible effects. If one is worried that someone reading the code in src/main/eval.c might become confused about the semantics of parentheses, a two-line comment saying "Parens are conceptually a BUILTIN but are implemented as a SPECIAL to bypass the overhead of creating an evaluated argument list" would avoid any problem. As an analogy, consider a C compiler generating code for "2*x". One could argue that multiplication is conceptually repeated addition, so converting this to "x+x" is OK, but converting it to "x<<1" would be wrong. But I think no one would argue this. Rather, everyone would expect the compiler to choose between "x+x" and "x<<1" purely on the basis of which one is faster. Radford __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] History of seq and seq.int
I wonder what is the history of "seq" and "seq.int"? >From "help(seq)", one reads that "'seq.int' is an internal generic which can be much faster but has a few restrictions". And indeed, "seq.int(1,99,by=2)" is over 40 times faster than "seq(1,99,by=2)" in a quick test I just did. This is not surprising given that "seq" is written in R whereas "seq.int" is a primitive written in C. The only documented restriction on "seq.int" that I can see is that it won't produce complex-valued sequences. It would be nice if the 40-times-faster version could replace the slow version, if it can be made to do the same thing. But is there some reason I don't see that this makes this hard? Was the C version the original, and then someone decided that complex number support was essential, and most easily obtained by translating it to R? (The R code and C code are very parallel, so there was certainly a translation rather a rewrite one way or the other.) Or was the original in R, and someone rewrote it in C for speed, but stopped just short of being able to replace the R version with the C version because they didn't implement complex-valued sequences? Or is there some other feature that's missing from seq.int? By the way, I've submitted a bug report for seq.int, about it evaluating all it's arguments twice, but that isn't really related to the efficiency issue of seq versus seq.int. It is perhaps related to some strange code (and comments) in seq.int regarding special treatment for the along.with argument, which maybe also has some unusual history that isn't obvious... __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] make error for R 2.13.0 (and 2.12.0)
Regarding Tengfei Yin's post about an error trying to install "cluster" in 2.13.0, I have gotten an error with this package when trying to install the released version of 2.12.0. Here is the output on an Ubuntu Linux system: begin installing recommended package cluster * installing *source* package 'cluster' ... ** libs make[3]: Entering directory `/tmp/RtmpWX7ecF/R.INSTALL2aaf76ad/cluster/src' gcc -std=gnu99 -I/u/radford/R/R-2.12.0-rel2/include -I/usr/local/include -fpic -g -O2 -c clara.c -o clara.o g77 -fpic -g -O2 -c daisy.f -o daisy.o g77 -fpic -g -O2 -c dysta.f -o dysta.o gcc -std=gnu99 -I/u/radford/R/R-2.12.0-rel2/include -I/usr/local/include -fpic -g -O2 -c fanny.c -o fanny.o gcc -std=gnu99 -I/u/radford/R/R-2.12.0-rel2/include -I/usr/local/include -fpic -g -O2 -c init.c -o init.o g77 -fpic -g -O2 -c meet.f -o meet.o g77 -fpic -g -O2 -c mona.f -o mona.o gcc -std=gnu99 -I/u/radford/R/R-2.12.0-rel2/include -I/usr/local/include -fpic -g -O2 -c pam.c -o pam.o gcc -std=gnu99 -I/u/radford/R/R-2.12.0-rel2/include -I/usr/local/include -fpic -g -O2 -c sildist.c -o sildist.o gcc -std=gnu99 -I/u/radford/R/R-2.12.0-rel2/include -I/usr/local/include -fpic -g -O2 -c spannel.c -o spannel.o g77 -fpic -g -O2 -c twins.f -o twins.o gcc -std=gnu99 -shared -L/usr/local/lib -o cluster.so clara.o daisy.o dysta.o fanny.o init.o meet.o mona.o pam.o sildist.o spannel.o twins.o -L/usr/lib/gcc/i486-linux-gnu/3.4.6 -lg2c -lm make[3]: Leaving directory `/tmp/RtmpWX7ecF/R.INSTALL2aaf76ad/cluster/src' installing to /h/253/radford/R/R-2.12.0-rel2/library/cluster/libs ** R ** data ** moving datasets to lazyload DB ** inst ** preparing package for lazy loading ** help *** installing help indices ** building package indices ... ** testing if installed package can be loaded Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared object '/h/253/radford/R/R-2.12.0-rel2/library/cluster/libs/cluster.so': /h/253/radford/R/R-2.12.0-rel2/library/cluster/libs/cluster.so: undefined symbol: cl_daisy_ ERROR: loading failed * removing '/h/253/radford/R/R-2.12.0-rel2/library/cluster' make[2]: *** [cluster.ts] Error 1 make[2]: Leaving directory `/h/253/radford/R/R-2.12.0-rel2/src/library/Recommended' make[1]: *** [recommended-packages] Error 2 make[1]: Leaving directory `/h/253/radford/R/R-2.12.0-rel2/src/library/Recommended' make: *** [stamp-recommended] Error 2 Here is the output of gcc -v for this system: Using built-in specs. Target: i486-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu Thread model: posix gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4) This R-2.12.0 installation generally works (based on testing a few trivial things). The same error occurs when trying to install 2.12.0 on Intel Solaris. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Surprising behavior of letters[c(NA, NA)]
Duncan Murdoch writes: The relevant quote is in the Language Definition, talking about indices by type of index: "Logical. The indexing i should generally have the same length as x. If it is shorter, then its elements will be recycled as discussed in Section 3.3 [Elementary arithmetic operations], page 14. If it is longer, then x is conceptually extended with NAs. The selected values of x are those for which i is TRUE." But this certainly does not justify the actual behaviour. It says that, for example, (1:3)[NA] should not be a vector of three NAs, but rather a vector of length zero - since NONE of the indexes are TRUE. The actual behaviour of NA in a logical index makes no sense. It makes sense that NA in an integer index produces an NA in the result, since this NA might correctly express the uncertainty in the value at this position that follows from the uncertainty in the index (and hence produce sensible results in subsequent operations). But NA in a logical index should lead to a result that is of uncertain length. However, R has no mechanism for expressing such uncertainty, so it makes more sense that NA in a logical index should produce an error. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Patches to speed up R 2.13.0
I have created a collection of patches to the R 2.13.0 source that speed up many operations substantially. Some of these are related to the patches I posted in September 2010, though often with additional improvements. Other patches are new. The patches, along with detailed documentation, are available from http://www.cs.utoronto.ca/~radford/R-mods-2.13.0.html For some discussion of these patches and their benefits, see my blog post, at http://radfordneal.wordpress.com/2011/06/09/new-patches-to-speed-up-r-2-13-0 I believe that these improvements to the R implementation should be of substantial benefit to the R community. The R core team is welcome to incorporate them into versions of R that they release in the future. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Best practices for writing R functions (really copying)
Gabriel Becker writes: AFAIK R does not automatically copy function arguments. R actually tries very hard to avoid copying while maintaining "pass by value" functionality. ... R only copies data when you modify an object, not when you simply pass it to a function. This is a bit misleading. R tries to avoid copying by maintaining a count of how many references there are to an object, so that x[i] <- 9 can be done without a copy if x is the only reference to the vector. However, it never decrements such counts. As a result, simply passing x to a function that accesses but does not change it will result in x being copied if x[i] is changed after that function returns. An exception is that this usually isn't the case if x is passed to a primitive function. But note that not all standard functions are technically "primitive". The end result is that it's rather difficult to tell when copying will be done. Try the following test, for example: cat("a: "); print(system.time( { A <- matrix(c(1.0,1.1),5,1000); 0 } )) cat("b: "); print(system.time( { A[1,1]<-7; 0 } )) cat("c: "); print(system.time( { B <- sqrt(A); 0 } )) cat("d: "); print(system.time( { A[1,1]<-7; 0 } )) cat("e: "); print(system.time( { B <- t(A); 0 } )) cat("f: "); print(system.time( { A[1,1]<-7; 0 } )) cat("g: "); print(system.time( { A[1,1]<-7; 0 } )) You'll find that the time printed after b:, d:, and g: is near zero, but that there is non-negligible time for f:. This is because sqrt is primitive but t is not, so the modification to A after the call t(A) requires that a copy be made. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Latent flaw in SEXPREC definition
There seems to be a latent flaw in the definition of struct SEXPREC in Rinternals.h, which likely doesn't cause problems now, but could if the relative sizes of data types changes. The SEXPREC structure contains a union that includes a primsxp, symsxp, etc, but not a vecsxp. However, in allocVector in memory.c, zero-length vectors are allocated using allocSExpNonCons, which appears to allocates a SEXPREC structure. This won't work if a vecsxp is larger than the other types that are in the union in the SEXPREC structure. Simply adding a vecsxp to the union would seem to fix this, as in the following patch: Index: src/include/Rinternals.h === --- src/include/Rinternals.h(revision 56640) +++ src/include/Rinternals.h(working copy) @@ -219,6 +219,7 @@ typedef struct SEXPREC { SEXPREC_HEADER; union { + struct vecsxp_struct vecsxp; struct primsxp_struct primsxp; struct symsxp_struct symsxp; struct listsxp_struct listsxp; __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Latent flaw in SEXPREC definition
> But the whole point of separating VECTOR_SEXPREC from the other > SEXPRECs is that they are _shorter_. A vecsxp is only going to be > larger than (say) an envsxp if 2 R_len_t's are more than 3 pointers, > which is quite unlikely since R_len_t variables holds things that > one might add to pointers. Unlikely, yes, but it's not inconceivable that someone might have a configuration where R_len_t is 64 bits (say, for some sort of compatibility reason) while pointers are 32 bits (say, for speed). > NOT having vecsxp in the SEXPREC union prevents programmers from > mistakingly using SEXP* where VECSXP* should have been used. Since > the distinction wasn't always there, I suspect that flagging usage > of x->u.vecsxp.length by the compiler was rather important at some > point in time. If that's an issue, one can just call the field something other than vecsxp. Of course, an alternative solution is to not use allocSExpNonCons to allocate zero-length vectors. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Improved version of Rprofmem
The Rprofmem facility is currently enabled only if the configuration option --enable-memory-profiling is used. However, the overhead of having it enabled is negligible when profiling is not actually being done, and can easily be made even smaller. So I think it ought to be enabled all the time. I've attached a patch doing this, which also makes a number of other improvements to Rprofmem, which are upward-compatible with the current version. First, it allows for the profiling reports to be printed to the terminal (with Rprintf) as well as or instead of going to a file. This is not only a convenience, but also provides more information when these reports are interspersed with other output, allowing the source of the memory allocations to be better determined. Second, it gives the option for the alloction reports to include the type of the vector and its length, not just the number of bytes allocated. Third, it allows for all vector allocations to be reported, not just those that are large enough to be done with malloc (this distinction doesn't seem important for most uses). It also allows for reports to be produced only when the vector allocated has at least some number of elements, rather than only providing a threshold based on number of bytes. This seems more natural if a user knows that they are dealing with some vectors of length 10 and wants to see only those (or bigger ones). Also, if either the option for terminal output or for type and length details is used, allocation reports always end with a newline. For some reason, the present code doesn't write a newline if the call stack happens to be empty. (Though not documented, this is clearly deliberate, not a bug.) Though I can't see why one would want this, it is retained as the default behaviour for backward compatibility. Finally, I changed the printf format for values that are cast with (unsigned long) to %lu, rather than %ld, which I think is not correct. I think incorporating this in the upcoming 2.14.0 release would be useful. For instance, the following gives some useful insights into what R is doing with memory allocation: > Rprofmem("",terminal=TRUE,pages=FALSE,details=TRUE,nelem=1) > f <- function (x) + { cat("Now in f\n"); + s <- sum(x); + cat("sum =",s,"\n"); + x[10] <- 1; + s <- sum(x); + cat("sum =",s,"\n") + x[20] <- 1; + s <- sum(x); + cat("sum =",s,"\n") + y <<- x + NULL + } > f(rep(2,1)) Now in f RPROFMEM: 40040 (integer 1):"f" RPROFMEM: 40040 (integer 1):"f" RPROFMEM: 80040 (double 1):"f" sum = 2 RPROFMEM: 80040 (double 1):"f" sum = 1 sum = 19998 RPROFMEM: 80040 (double 1):"f" NULL > y[1] <- 0 > Rprofmem("") You can see the details of my modifications from the output of help(Rprofmem) after applying the patch I have attached, which I've put below: Rprofmem package:utils R Documentation Enable Profiling of R's Memory Use Description: Enable or disable reporting of memory allocation in R. Usage: Rprofmem(filename = "Rprofmem.out", append = FALSE, threshold = 0, nelem = 0, terminal = FALSE, pages = TRUE, details = FALSE) Arguments: filename: The file to which reports of memory allocations are written, or 'NULL' or '""' if reports should not go to a file. append: logical: should the file be appended to rather than overwritten? threshold: numeric: only allocations of vectors with size larger than this number of bytes will be reported. nelem: numeric: only allocations of vectors with at least this many elements will be reported. terminal: logical: should reports be printed on the terminal (as well as possibly written to 'filename')? pages: logical: should allocation of pages for small vectors be reported, and reporting of individual small vector allocations suppressed? details: logical: should details of allocation be reported, rather than only the total number of bytes? Details: The profiler tracks memory allocations, some of which will be to previously used memory and will not increase the total memory use of R. Calling 'Rprofmem' with either 'terminal=TRUE' or with 'filename' something other than 'NULL' or '""' (or both) will enable profiling, with allocation reports going to one or both places. Reports to the terminal are preceded by "RPROFMEM:". Enabling profiling automatically disables any existing profiling to another or the same file or to the terminal. Calling 'Rprofmem' with 'terminal=FALSE' (the default) and 'filename' either 'NULL' or '""' will disable profiling. If 'pages=TRUE' (the default) allocations of individual vectors will be reported only if they are "l
[Rd] Lack of protection bug in current R release candidate
The current R release candidate has a lack of protect bug (of very long standing) with respect to the R_print.na_string and R_print.na_string_noquote fields of the static R_print structure declared in Print.h. This shows up very occassionally as incorrect output from the following lines in reg-tests-2.R: x <- c("a", NA, "b") factor(x) factor(x, exclude="") The solution (kludgy, but the whole concept is kludgy) is to forward R_print.na_string and R_print.na_string_noquote with the other "roots" in RunGenCollect (after the comment /* forward all roots */). Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Lack of protection bug in current R release candidate
> > The current R release candidate has a lack of protect bug > > (of very long standing) > > [ but not really reported, right ? ] It's "long standing" because it exists in versions of R going back many years. > but the R 3.2.1 release candidate probably really cannot be > touched now, with something non-trivial like this. > > We'd be *very* happy to get such problems during alpha or beta > testing phase (or even before). I'm not sure what you mean to imply here, but for your information, I reported the bug to r-devel within about an hour of finding what caused it. (I'd noticed the symptoms a few days before, but hadn't isolated the cause.) Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Add-on argument in sample()
> Then the question would be if this test could be replaced with a new > argument to sample, e.g. expandSingle, which has TRUE as default for > backward compatibility, but FALSE if you dont want population to be > expanded to 1:population. It could certainly be useful in some cases, > but you still need to know about the expansion to use it. I think most > of these bugs occur because users did not think about the expansion in > the first place or did not realize that their population could be of > length 1 in some situations. These users would therefore not think about > changing the argument either. I think the solution might be to make sample always treat the first argument as the vector to sample from if it has a "dim" attribute that explicitly specifies that it is a one-dimensional array. The effect of this would be that sample(10,1) would sample from 1:10, as at present, but sample(array(10),1) would sample from the length-one vector with element 10 (and hence always return 10). With this change, you can easily ensure that sample(v,1) always samples from v even when it has length one by rewriting it to sample(array(v),1). It's of course possible that some existing code relies on the current behaviour, but probably not much existing code, since one-dimensional arrays are (I think) not very common at present. A bigger gain would come if one also introduced a new sequence operator that creates a sequence that is marked as a one-dimensional array, which would be part of a solution to several other problems as well, as I propose at http://www.cs.utoronto.ca/~radford/ftp/R-lang-ext.pdf Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Improving string concatenation
Gabor Csardi writes: > Btw. for some motivation, here is a (surely incomplete) list of > languages with '+' as the string concatenation operator: > > ALGOL 68, BASIC, C++, C#, Cobra, Pascal, Object Pascal, Eiffel, Go, > JavaScript, Java, Python, Turing, Ruby, Windows Powers hell, > Objective-C, F#, Sc-ala, Ya. The situation for R is rather different from that of a language (as many of the above) in which variables are declared to be of a specific type. In such a statically typed language, when you see the expression "a+b", it is easy to figure out whether the "+" will be numeric addition or string concatenation, by looking at the declarations of "a" and "b". But in a language such as R in which values have types, but variables don't, someone seeing "a+b" in code wouldn't be able to tell easily what it does. This is OK, in fact desirable, in the case of operator dispatch according to class when the different methods implement versions of the operator that have analogous properties. But numeric addition and string concatenation have just about nothing in common, so cases where functions are meant to have "+" be either addition OR concatenation are going to be rare. Furthermore, making "+" concatenate strings would preclude ever making "+" convert strings to numbers (signalling an error if they aren't in some numerical format) and then add them. I'm not sure whether that would be a good idea or not, but it might be unwise to eliminate the possibility. And of course, as someone else mentioned, it may instead be desirable for attempts to add strings to signal an error, as at present, which one also gives up by making "+" do concatenation. > Yes, even Fortran has one, and in C, I can simply write "literal1" > "literal2" and they'll be concatenated. It is only for literals, but > still very useful. Concatenation of literal strings could easily be added to the R parser without changing anything else. (Getting them to deparse as the same two pieces would be tricky, but is maybe not necessary.) Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Add-on argument in sample()
William Dunlap writes: > I don't like the idea of having a length-1 dim attribute trigger some > behavior of sample. (Should a length-2 dim cause it to sample > rows of a matrix, as unique() and duplicated() do?). I wasn't too clear. I would want any dim attribute on its first argument to cause sample to treat it as the vector of values to sample from, rather than as the upper limit for the set of integers to sample from. This is consistent with how higher-dimensional arrays can be treated as vectors in some other contexts. But I'd expect the 1D case to be the common case. The basic problem, here and in other places, is that R (and S) don't distinguish a scalar from a vector of length one. I think this was a mistake, which one can try to unwind now by at least not treating a vector of length one as a scalar if it is explicitly marked as a vector by a dim attribute. Allowing a dim attribute of length zero (presently disallowed) might also be a good idea, because it could be used to explicitly mark a single number as a scalar, not a vector of length one. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Two bugs showing up mostly on SPARC systems
In testing pqR on Solaris SPARC systems, I have found two bugs that are also present in recent R Core versions. You can see the bugs and fixes at the following URLs: https://github.com/radfordneal/pqR/commit/739a4960a4d8f3a3b20cfc311518369576689f37 https://github.com/radfordneal/pqR/commit/339b7286c7b43dcc6b00e51515772f1d7dce7858 The first bug, in nls, is most likely to occur on a 64-bit big-endian system, but will occur with low probability on most platforms. The second bug, in readBin, may occur on systems in which unaligned access to data more than one byte in size is an error, depending on details of the compiler. It showed up with gcc 4.9.2 on a SPARC system. The fix slightly changes the error behaviuor, signaling an error on inappropriate reads before reading any data, rather than after reading one (but not all) items as before. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Two bugs showing up mostly on SPARC systems
On Tue, Jul 14, 2015 at 07:52:56PM -0400, Duncan Murdoch wrote: > On 14/07/2015 6:08 PM, Radford Neal wrote: > > In testing pqR on Solaris SPARC systems, I have found two bugs that > > are also present in recent R Core versions. You can see the bugs and > > fixes at the following URLs: > > > > > > https://github.com/radfordneal/pqR/commit/739a4960a4d8f3a3b20cfc311518369576689f37 > > Thanks for the report. Just one followup on this one: > > There are two sections of code that are really similar. Your patch > applies to the code in port_nlsb(), but there's a very similar test in > port_nlminb(), which is called from nlminb() in R. Do you think it > would be a good idea to apply the same patch there as well? It doesn't > look like it would hurt, but I don't know this code at all, so it might > be unnecessary. Looking at nlminb, it seems that this bug doesn't exist there. The R code sets low <- upp <- NULL, as in nls, but later there is an "else low <- upp <- numeric()" that ensures that low and upp are never actually NULL. This may have been a fix for the bug showing up in nlminb that was not applied to nls as well (and of course, the fix didn't delete the now pointless low <- upp <- NULL). The nlminb code might be a better fix, stylistically, after removing "low <- upp <- NULL", though it seems that both it and my fix for nls should work. Of course, both assume that the call of the C function is done only from this R code, so no other types for low and upp are possible. And really the whole thing ought to be rewritten, since the .Call functions modify variables without any regard for whether or not their values are shared. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 149, Issue 22
> From: Joshua Bradley > > I have been having issues using parallel::mclapply in a memory-efficient > way and would like some guidance. I am using a 40 core machine with 96 GB > of RAM. I've tried to run mclapply with 20, 30, and 40 mc.cores and it has > practically brought the machine to a standstill each time to the point > where I do a hard reset. When mclapply forks to start a new process, the memory is initially shared with the parent process. However, a memory page has to be copied whenever either process writes to it. Unfortunately, R's garbage collector writes to each object to mark and unmark it whenever a full garbage collection is done, so it's quite possible that every R object will be duplicated in each process, even though many of them are not actually changed (from the point of view of the R programs). One thing on my near-term to-do list for pqR is to re-implement R's garbage collector in a way that will avoid this (as well as having various other advantages, including less memory overhead per object). Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] OpenMP problem with 64-bit Rtools
I've been getting pqR to work on windows systems, and in the process have discovered various problems with R core versions of R and with Rtools. One is with the implementation of OpenMP in 64-bit Rtools. This problem is in Rtools215 and Rtools33, and presumably all the ones in between. You can see the problem with the following test program: #include #include static struct { omp_lock_t lock; char pad[20]; } s; void show_pad(void) { int i; for (i = 0; i<20; i++) printf(" %02x",s.pad[i]); printf("\n"); } int main(void) { int i; printf("size: %d\n",(int)sizeof s); for (i = 0; i<20; i++) s.pad[i] = 7; show_pad(); omp_init_lock (&s.lock); show_pad(); omp_set_lock (&s.lock); show_pad(); return 0; } When compiled using the Rtools compiler with "gcc -m32 -fopenmp", it works fine, printing three lines of output with all numbers being 7. But when compiled with "gcc -m64 -fopenmp", the last two lines have four zeros at the beginning. What's happended is that omp_init_lock has written zeros to four bytes after the end of the omp_lock_t structure. The reason for this appears to be that the omp.h file included is appropriate only for the 32-bit version of the openMP library. As far as I can see, there is no 64-bit version of this include file in the stuff installed with Rtools. This problem obviously has the potential to affect many packages that use OpenMP, though simple OpenMP applications may not do anything for which the incorrect omp.h include file makes a difference. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Problem with psignal.c for Windows builds
Continuing with problems that I've uncovered while getting pqR to work on Windows... The file src/ghuwin32/psignal.c (used only in Windows builds) fails to use the sigset_t type in all places where it should, using "int" some places instead. Here is a diff of the needed corrections: @@ -253,7 +253,7 @@ sighandler_t signal(int signal_Number, sighandler_t signal_Handler) int sigaddset(sigset_t* sigset_Info,int signal_Number) { if (IS_SIGNAL(signal_Number)) { - (*sigset_Info) |= (1 << (signal_Number - 1)); + (*sigset_Info) |= ((sigset_t)1 << (signal_Number - 1)); return 0; } else { @@ -267,7 +267,7 @@ int sigaddset(sigset_t* sigset_Info,int signal_Number) int sigdelset(sigset_t* sigset_Info,int signal_Number) { if (IS_SIGNAL(signal_Number)) { - *sigset_Info &= ~(1<< (signal_Number - 1)); + *sigset_Info &= ~((sigset_t)1<< (signal_Number-1)); return 0; } else { @@ -295,7 +295,7 @@ int sigfillset(sigset_t* sigset_Info) int sigismember(sigset_t* sigset_Info,int signal_Number) { if (IS_SIGNAL(signal_Number)) { - if ( *sigset_Info & (1 << (signal_Number-1))) + if ( *sigset_Info & ((sigset_t)1 << (signal_Number-1))) return 1; else return 0; @@ -380,9 +380,9 @@ int sigprocmask(int mask_Function,sigset_t* sigset_Info, } /* Set signal mask = */ -int sigsetmask(int signal_MaskNew) +sigset_t sigsetmask(sigset_t signal_MaskNew) { -int signal_MaskOld = downhill_Sigset_Mask; +sigset_t signal_MaskOld = downhill_Sigset_Mask; if (sigprocmask(SIG_SETMASK, &signal_MaskNew, NULL) == -1) return (int)-1; @@ -391,7 +391,7 @@ int sigsetmask(int signal_MaskNew) } /* Add signals to mask = */ -int sigblock(int signal_MaskNew) +sigset_t sigblock(sigset_t signal_MaskNew) { /* Block a specific group of signals */ return sigsetmask(downhill_Sigset_Mask|signal_MaskNew); Now, you may wonder how it's been working with these errors. The file src/gnuwin32/fixed/h/psignal.h has the following code: /* mingw-w64's sys/types.h also defines this and we want this defn */ #ifndef _SIGSET_T_ #define _SIGSET_T_ typedef int sigset_t; #endif /* Not _SIGSET_T_ */ The comment is unclear as to whether the mingw definition is or is not the one that is desired (what does "this" refer to?). Anyway, the mingw sys/types.h file contains the following code: #ifndef _SIGSET_T_ #define _SIGSET_T_ #ifdef _WIN64 __MINGW_EXTENSION typedef unsigned long long _sigset_t; #else typedef unsigned long _sigset_t; #endif #ifdef _POSIX typedef _sigset_t sigset_t; #endif #endif /* Not _SIGSET_T_ */ So if the R psignal.h file is included before sys/types, sigset_t will be defined as "int", not as "unsigned long" or "unsigned long long". And in fact it seems that R (though not pqR) always includes psignal.h before sys/types.h (if the latter is included as well). There's even this code snippet in src/unix/Rscript.c: #ifdef WIN32 #include /* on some systems needs to be included before */ #endif And indeed, with Rtools, you do have to include psignal.h first, since the mingw sys/types.h defines _SIGSET_T_, preventing psignal.h from defining sigset_t but does not define sigset_t itself (rather only _sigset_t), since _POSIX isn't defined. So current R Core releases (accidentally?) avoid the problem by not actually using mingw's sigset_t type, but instead always using int. But it's possible that there is some reason to want to use the type defined in sys/types.h. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Getting SSE2 instructions to work in 32-bit builds on Windows
When getting pqR to work on Windows, I've wanted for it to be able to use SSE2 instructions with 32-bit builds, for those 32-bit processors that have SS2 instructions (all of them from the Pentium 4 onwards). It seems that R Core 32-bit versions do not attempt this, instead using the 387 FPU for all floating-point arithmetic. This is sometimes slower than using SSE2 instructions, and also produces results that are not compliant with the IEEE floating point standard, and that are not reproducible - possibly changing after trivial, unrelated changes to R or to the C compiler used. Once can get the gcc used in Rtools to use SSE2 instructions by including the following compiler options: -m32 -msse2 -mfpmath=sse Unfortunately, the result is that some things then crash. The problem is that by default gcc assumes that the stack is aligned to a 16-byte boundary on entry to a procedure, which allows it to easily ensure the 16-byte alignment needed for SSE2 instructions. Unfortunately, Windows does not ensure that a 32-bit application's stack is 16-byte aligned, so this doesn't work. (There's no problem for 64-bit builds, however.) A solution is to add one more option: -m32 -msse2 -mfpmath=sse -mstackrealign The -mstackrealign option forces gcc to generate code to align the stack on procedure entry, rather than assuming it is already aligned. It would probably be enough to compile only a few modules with this option (ones that are directly called from outside R), and hence avoid most of the extra procedure call overhead, but I haven't attempted this. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Problems with embedded R, ReplDLL
Along with getting pqR to work on Windows, I've also been testing it in the context of embedded R, and in the process have found some problems with the examples given of embedded R use. One problem can be seen in R_ReplDLLinit, in src/main/main.c: void R_ReplDLLinit(void) { SETJMP(R_Toplevel.cjmpbuf); R_GlobalContext = R_ToplevelContext = R_SessionContext = &R_Toplevel; R_IoBufferWriteReset(&R_ConsoleIob); prompt_type = 1; DLLbuf[0] = DLLbuf[CONSOLE_BUFFER_SIZE] = '\0'; DLLbufp = DLLbuf; } The call of SETJMP makes no sense. Nothing that follows in this function can possibly cause a long jump. The use of R_ReplDLLinit is illustrated in the R Extensions manual (doc/manual/R-exts.texi): R_ReplDLLinit(); while(R_ReplDLLdo1() > 0) { /* add user actions here if desired */ } Note that the R_ReplDLLdo1 function does not call SETJMP. Amazingly, however, the combination sometimes seems to work! When a long jump is taken during evaluation of an expression initiated by R_ReplDLLdo1, the stack is restored to the value suitable for the now-non-existent stack frame for R_ReplDLLinit, and execution of the statements in R_ReplDLLinit then continues after the SETJMP. Likely, none of these statements actually refers to the non-existent stack frame (seeing as this function has no parameters and no local variables), so nothing disastrous occurs, and in the end the procedure return code at the end of R_ReplDLLinit has the effect of doing a return from R_ReplDLLdo1, which is more-or-less the desired result. Obviously, one does not want a program to "work" for such reasons, which may cease to apply with any change to how the C compiler generates code. The current development version of pqR has a different version of R_ReplDLLdo1, which calls SETJMP itself, but this version is tied to other rationalizations of the Repl functions in the development version of pqR. The embedded R examples and tests have various other problems that are fixed in the current development branch of pqR (branch 44 at the moment). R Core might want to take a look there, or at the next version of pqR, which will be released soon. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] OpenMP problem with 64-bit Rtools
> On Fri, Aug 21, 2015 at 8:38 PM, Radford Neal wrote: > > > The reason for this appears to be that the omp.h file included is > > appropriate only for the 32-bit version of the openMP library. As > > far as I can see, there is no 64-bit version of this include file > > in the stuff installed with Rtools. On Sat, Aug 22, 2015 at 01:30:51AM +0200, Jeroen Ooms wrote: > This particular behavior exists in all builds of mingw-w64, not rtools > per se. We might get a better answer in the mingw-w64-public mailing > list. A comment near the front of omp.h says "These two structures get edited by the libgomp build process to reflect the shape of the two types". So presumably it is possible to generate the 64-bit version, and presumably this must have been done in order to compile the 64-bit version of the OpenMP library that is supplied with Rtools. Maybe there's no provision in the build process for having two omp.h files at once, though... Radford __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] OpenMP on Windows reset FPU mode, disabling long double
Another comment resulting from my testing of pqR on Windows... With the Rtools implementation of OpenMP, threads started with an OpenMP "parallel" construct have the state of the FPU reset so that "long double" arithmetic is actually done only to "double" precision. The problem can be seen with the following program: #include #include long double a, b; double d; int main(void) { a = 1.0; b = a+1e-17; d = b-1.0; printf("M: %.1e\n",d); __asm__("fninit"); /* set for long double arithmetic */ a = 1.0; b = a+1e-17; d = b-1.0; printf("M: %.1e\n",d); # pragma omp parallel num_threads(3) { int t = omp_get_thread_num(); if (t==2) __asm__("fninit"); /* set for long double in thread 2 */ a = 1.0; b = a+1e-17; d = b-1.0; printf("%d: %.1e\n",t,d); } a = 1.0; b = a+1e-17; d = b-1.0; printf("M: %.1e\n",d); return 0; } At least on a 32-bit Windows 7 system, the output of "d" is 1e-17 for all threads except thread 1, for which it is 0. That's the new thread for which the __asm__("fninit") wasn't done to reset to long double precision, and for which adding 1e-17 to 1.0 therefore has no effect. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Problem with psignal.c for Windows builds
One thing I forgot in my previous message about problems with psignal.c on Rtools for Windows... One also needs to change src/gnuwin32/fixed/h/psignal.h At a minimum, one needs the following changes: @@ -122,8 +129,8 @@ typedef struct /* Prototype stuff ***/ -int sigsetmask(int signal_Block_MaskNew); -int sigblock(int signal_Block_MaskNew); +sigset_t sigsetmask(sigset_t signal_Block_MaskNew); +sigset_t sigblock(sigset_t signal_Block_MaskNew); int sighold(int signal_Number); int sigrelse(int signal_Number); int sigaction(int signal_Number,struct sigaction* sigaction_Info, @@ -143,7 +150,7 @@ int sigsuspend(sigset_t* sigset_Info); /* Re-mapped functions = */ -#define sigmask(signal_Index) (1<<(signal_Index-1)) +#define sigmask(signal_Index) ((sigset_t)1<<(signal_Index-1)) /* This must be a macro, since we want setjmp working in the But the definition of sigset_t may also need to change, as discussed in my previous message. You can see the pqR versions at the following URLs: https://github.com/radfordneal/pqR/blob/44/src/gnuwin32/psignal.c https://github.com/radfordneal/pqR/blob/44/src/gnuwin32/fixed/h/psignal.h __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] New version of the R parser in pqR
I have rewritten the R parser in the new version of pqR that I recently released (pqR-2015-09-14, at pqR-project.org). The new version of the parser is much cleaner, is faster (sometimes quite substantially faster), has a better interface to the read-eval-print loop, and provides a better basis for future extensions. The deparser has also been substantially revised in pqR, and is better coordinated with the parser. The new parser and deparser also fix a number of bugs, as can be seen in the NEWS file at pqR-project.org. I believe the new parser is almost completely compatible with R-3.2.2, with just a few slight differences in the result of getParseData, and a few deliberate changes, which could easily be undone if one really wanted to. It works with RStudio and with the Windows GUI (I haven't tested with the Mac GUI). Internally, there is some slight interaction with pqR's use of read-only constants for some frequent sub-expressions, but this also is easily removed if desired. The new parser operates top-down, by recursive descent, rather than using a bottom-up parser generated by Bison. This allows for much cleaner operation in several respects. Use of PROTECT can now follow the usual conventions, with no need to use UNPROTECT_PTR. Creation of parse data records is relatively straightforward. The special handling of newlines is also comparatively easy, without the need to embed a rudimentary top-down parser within the lexical analyser, as was done in the Bison parser. The old read-eval-print loop operates by calling the parser repeatedly with one line of input, two lines of input, etc. until it gets a return code other than PARSE_INCOMPLETE. This results in the time taken growing as the square of the number of source lines. The new read-eval-print loop simply provides the parser with a character input routine that reads new lines as required, while the parser avoids looking at a character until really needed, to avoid spurious requests for more input lines. A quadratic time growth relating to parse data is also avoided in the new parser. I suggest that R Core consider adopting this new parser in future R Core releases of R, along with the associated changes to the read-eval-print loop, and the revised version of the deparser. The new parser is better documented than the old parser. I am also willing to provide assistance to anyone trying to understand the code. I have tested the new parser on the 5018 packages in the pqR repository, but of course it's possible that problems might show up in some other CRAN packages. I'm willing to help in resolving any such problems as well. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] deparse with parentheses for SUBSET
> maybe there?s a reason for it, but the discrepancy between the > handling of `[` and `$` in deparsing seems odd to me: > > > substitute(a[1], list(a = quote(x * y))) > x * y[1] > > > substitute(a$b, list(a = quote(x * y))) > (x * y)$b > > The former is still executed in the right order (`*` first, then > `[`), which is not what you?d expect looking at the deparse result. This is one of a large number of bugs in the deparser and parser that are fixed in the latest version of pqR (of 2015-09-14), as a result of a complete rewrite of the parser and extensive revisions to the deparser. As I communicated to this list, it would be reasonably easy to incorporate the rewritten parser and deparser into R Core versions of R. You can get more details from http://www.pqr-project.org/NEWS.txt The portion of the bugs section in that NEWS file relevant to parsing and deparsing is shown below: o The new parser fixes bugs arising from the old parser's kludge to handle semicolons, illustrated by the incorrect output seen below: > p<-parse() ?"abc;xyz" Error in parse() : :1:1: unexpected INCOMPLETE_STRING 1: "abc; ^ > p<-parse() ?8 #abc;xyz Error in parse() : :1:7: unexpected end of input 1: 8 #abc; ^ o Fixed deparsing of complex numbers, which were always deparsed as the sum of a real and an imaginary part, even though the parser can only produce complex numbers that are pure imaginary. For example, the following output was produced before: > deparse(quote(3*5i)) [1] "3 * (0+5i)" This is now deparsed to "3 * 5i". This bug exists in all R Core versions through at least R-3.2.2. o Fixed a number of bugs in the deparser that are illustrated by the following, which produce incorrect output as noted, in R Core versions through at least R-3.2.2: deparse(parse(text="`+`(a,b)[1]")[[1]])# Omits necessary parens deparse(quote(`[<-`(x,1)),control="S_compatible") # unmatched " and ' deparse(parse(text="a = b <- c")[[1]]) # Puts in unnecessary parens deparse(parse(text="a+!b")[[1]]) # Puts in unnecessary parens deparse(parse(text="?lm")[[1]])# Doesn't know about ? operator deparse(parse(text="a:=b")[[1]]) # Doesn't know about := operator deparse(parse(text="a$'x'")[[1]]) # Conflates name and character deparse(parse(text="`*`(2)")[[1]]) # Result is syntactically invalid deparse(parse(text="`$`(a,b+2)")[[1]]) # Result is syntactically invalid e<-quote(if(x) X else Y); e[[3]]<-quote(if(T)3); deparse(e)# all here e <- quote(f(x)); e[[2]] <- quote((a=1))[[2]]; deparse(e) # and below e <- quote(f(Q=x)); e[[2]] <- quote((a=1))[[2]]; deparse(e)# need parens e <- quote(while(x) 1); e[[2]] <- quote((a=1))[[2]]; deparse(e) e <- quote(if(x) 1 else 2); e[[2]] <- quote((a=1))[[2]]; deparse(e) e <- quote(for(x in y) 1); e[[3]] <- quote((a=1))[[2]]; deparse(e) In addition, the bug illustrated below was fixed, which was fixed (differently) in R-3.0.0: a<-quote(f(1,2)); a[[1]]<-function(x,y)x+y; deparse(a) # Omits parens o Fixed the following bug (also in R Core versions to at least R-3.2.2): > parse() ?'\12a\x.' Error: '\x' used without hex digits in character string starting "'\1a\x" Note that the "2" has disappeared from the error message. This bug also affected the results of getParseData. o Fixed a memory leak that can be seen by running the code below: > long <- paste0 (c('"', rep("1234567890",820), '\x."'), collapse="") > for (i in 1:100) try (e <- parse(text=long), silent=TRUE) The leak will not occur if 820 is changed to 810 in the above. This bug also exists in R Core versions to at least R-3.2.2. o Entering a string constant containing Unicode escapes that was or 1 characters long would produce an error message saying "String is too long (max 1 chars)". This has been fixed so that the maximum now really is 1 characters. (Also present in R Core versions, to at least R-3.2.2.) o Fixed a bug that caused the error caret in syntax error reports to be misplaced when more than one line of context was shown. This was supposedly fixed in R-3.0.2, but incorrectly, resulting in the error caret being misplaced when only one line of context is shown (in R Core versions to at least R-3.2.2). __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
Regarding Martin Maechler's proposal: Arithmetic between length-1 arrays and longer non-arrays had silently dropped the array attributes and recycled. This now gives a warning and will signal an error in the future, as it has always for logic and comparison operations For example, matrix(1,1,1) + (1:2) would give a warning/error. I think this might be a mistake. The potential benefits of this would be detection of some programming errors, and increased consistency. The downsides are breaking existing working programs, and decreased consistency. Regarding consistency, the overall R philosophy is that attaching an attribute to a vector doesn't change what you can do with it, or what the result is, except that the result (often) gets the attributes carried forward. By this logic, adding a 'dim' attribute shouldn't stop you from doing arithmetic (or comparisons) that you otherwise could. But maybe 'dim' attributes are special? Well, they are in some circumstances, and have to be when they are intended to change the behaviour, such as when a matrix is used as an index with [. But in many cases at present, 'dim' attributes DON'T stop you from treating the object as a plain vector - for example, one is allowed to do matrix(1:4,2,2)[3], and a<-numeric(10); a[2:5]<-matrix(1,2,2). So it may make more sense to move towards consistency in the permissive direction, rather than the restrictive direction. That would mean allowing matrix(1,1,1)<(1:2), and maybe also things like matrix(1,2,2)+(1:8). Obviously, a change that removes error conditions is much less likely to produce backwards-compatibility problems than a change that gives errors for previously-allowed operations. And I think there would be some significant problems. In addition to the 10-20+ packages that Martin expects to break, there could be quite a bit of user code that would no longer work - scripts for analysing data sets that used to work, but now don't with the latest version. There are reasons to expect such problems. Some operations such as vector dot products using %*% produce results that are always scalar, but are formed as 1x1 matrices. One can anticipate that many people have not been getting rid of the 'dim' attribute in such cases, when doing so hasn't been necessary in the past. Regarding the 0-length vector issue, I agree with other posters that after a<-numeric(0), is has to be allowable to write a<1. To not allow this would be highly destructive of code reliability. And for similar reason, after a<-c(), a<1 needs to be allowed, which means NULL<1 should be allowed (giving logical(0)), since c() is NULL. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
> Radford Nea: > > So it may make more sense to move towards consistency in the > > permissive direction, rather than the restrictive direction. > > > That would mean allowing matrix(1,1,1) < (1:2), and maybe also things > > like matrix(1,2,2)+(1:8). > > Martin Maechler: > That is an interesting idea. Yes, in my view that would > definitely also have to allow the latter, by the above argument > of not treating the dim/dimnames attributes special. For > non-arrays length-1 is not treated much special apart from the > fact that length-1 can always be recycled (without warning). I think one could argue for allowing matrix(1,1,1)+(1:8) but not matrix(1,2,2)+(1:8). Length-1 vectors certainly are special in some circumstances, being R's only way of representing a scalar. For instance, if (c(T,F)) gives a warning. This really goes back to what I think may have been a basic mistake in the design of S, in deciding that everything is a vector, then halfway modifying this with dim attributes, but it's too late to totally undo that (though allowing a 0-length dim attribute to explicitly mark a length-1 vector as a scalar might help). > > And I think there would be some significant problems. In addition to > > the 10-20+ packages that Martin expects to break, there could be quite > > a bit of user code that would no longer work - scripts for analysing > > data sets that used to work, but now don't with the latest version. > > That's not true (at least for the cases above): They would give > a strong warning But isn't the intent to make it an error later? So I assume we're debating making it an error, not just a warning. (Though I'm generally opposed to such warnings anyway, unless they could somehow be restricted to come up only for interactive uses, not from deep in a program the user didn't write, making them totally mysterious...) > *and* the logic and relop versions of this, e.g., >matrix(TRUE,1) | c(TRUE,FALSE) ; matrix(1,1) > 1:2, > have always been an error; so nothing would break there. Yes, that wouldn't change the behaviour of old code, but if we're aiming for consistencey, it might make sense to get rid of that error, allowing code like sum(a%*%b Of course; that *was* the reason the very special treatment for arithmetic > length-1 arrays had been introduced. It is convenient. > > However, *some* of the conveniences in S (and hence R) functions > have been dangerous {and much more used, hence close to > impossible to abolish, e.g., sample(x) when x is numeric of length 1, There's a difference between these two. Giving an error when using a 1x1 matrix as a scalar may detect some programming bugs, but not giving an error doesn't introduce a bug. Whereas sample(2:n) behaving differently when n is 2 than when n is greater than 2 is itself a bug, that the programmer has to consciously avoid by being aware of the quirk. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
> > But isn't the intent to make it an error later? So I assume we're > > debating making it an error, not just a warning. > > Yes, that's correct. > But if we have a longish deprecation period (i.e. where there's > only a warning) all important code should have been adapted > before it turns to an error That might be true for continuously-used code. But for old scripts for analysing some dataset that someone decides to run again five years from now, they will be reduced to trying to guess which version of R they ran under originally, or if they now want to use newer features, to doing a binary search for the most recent version of R for which they still work, or of course going through the script trying to find the problem. This wouldn't be a disaster, but I'm not seeing the magnitude of benefit that would justify imposing this burden on users. A language specification shouldn't really be changing all the time for no particular reason. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] accelerating matrix multiply
> From: "Cohn, Robert S" > > I am using R to multiply some large (30k x 30k double) matrices on a > 64 core machine (xeon phi). I added some timers to > src/main/array.c to see where the time is going. All of the time is > being spent in the matprod function, most of that time is spent in > dgemm. 15 seconds is in matprod in some code that is checking if > there are NaNs. > > The NaN checking code is not being vectorized... This can be a problem with big matrices when lots of cores are used for the actual multiply, but is even more of a problem when at least one of the matrices is small (eg, a vector-matrix multiply), in which case the NaN check can dominate, slowing the operation by up to a factor of about ten. I pointed this problem out over six years ago, and provided a patch that greatly speeds up many matrix multiplies (see http://www.cs.utoronto.ca/~radford/R-mods.html). But this improvement has not been incorporated into R Core versions of R. Since then, a more elaborate solution to the problem of NaN checks has been incorporated into my pqR version of R (see pqR-project.org). The documentation on this approach can be found with help("%*%") if you're running pqR, or you can just look at the source for this help file in the pqR source code repository, at https://github.com/radfordneal/pqR/blob/Release-2016-10-24/src/library/base/man/matmult.Rd Radford __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] length(unclass(x)) without unclass(x)?
> Henrik Bengtsson: > > I'm looking for a way to get the length of an object 'x' as given by > base data type without dispatching on class. The performance improvement you're looking for is implemented in the latest version of pqR (pqR-2016-10-24, see pqR-project.org), along with corresponding improvements in several other circumstances where unclass(x) does not create a copy of x. Here are some examples (starting with yours), using pqR's Rprofmemt function to get convenient traces of memory allocations: > Rprofmemt(nelem=1000) # trace allocations of vectors with >= 1000 elements > > x <- structure(double(1e6), class = c("foo", "numeric")) RPROFMEM: 840 (double 100):"double" "structure" RPROFMEM: 840 (double 100):"structure" > length.foo <- function(x) 1L > length(x) [1] 1 > length(unclass(x)) [1] 100 > > `+.foo` <- function (e1, e2) (unclass(e1) + unclass(e2)) %% 100 > z <- x + x RPROFMEM: 840 (double 100):"+.foo" > > `<.foo` <- function (e1, e2) any(unclass(e1) x > y <- unclass(x) RPROFMEM: 840 (double 100): There is no large allocation with length(unclass(x)), and only the obviously necessarily single allocation in +.foo (not two additional allocations for unclass(e1) and unclass(e1). For <.foo, there is no large allocation at all, because not only are allocations avoided for unclass(e1) and unclass(e2), but 'any' also avoids an allocation for the result of the comparison. Unfortunately, assigning unclass(x) to a variable does result in a copy being made (this might often be avoided in future). These performance improvements are implemented using pqR's "variant result" mechanism, which also allows many other optimizations. See https://radfordneal.wordpress.com/2013/06/30/how-pqr-makes-programs-faster-by-not-doing-things/ for some explanation. There is no particular reason this mechanism couldn't be incorporated into R Core's implementation of R. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] RFC: (in-principle) native unquoting for standard evaluation
Michael Lawrence (as last in long series of posters)... > Yes, it would bind the language object to the environment, like an > R-level promise (but "promise" of course refers specifically to just > _lazy_ evaluation). > > For the uqs() thing, expanding calls like that is somewhat orthogonal > to NSE. It would be nice in general to be able to write something like > mean(x, extra_args...) without resorting to do.call(mean, c(list(x), > extra_args)). If we had that then uqs() would just be the combination > of unquote and expansion, i.e., mean(x, @extra_args...). The "..." > postfix would not work since it's still a valid symbol name, but we > could come up with something. I've been trying to follow this proposal, though without tracking down all the tweets, etc. that are referenced. I suspect I'm not the only reader who isn't clear exactly what is being proposed. I think a detailed, self-contained proposal would be useful. One thing I'm not clear on is whether the proposal would add anything semantically beyond what the present "eval" and "substitute" functions can do fairly easily. If not, is there really any need for a slightly more concise syntax? Is it expected that the new syntax would be used lots by ordinary users, or is it only for the convenience of people who are writing fairly esoteric functions (which might then be used by many)? If the later, it seems undesirable to me. There is an opportunity cost to grabbing the presently-unused unary @ operator for this, in that it might otherwise be used for some other extension. For example, see the last five slides in my talk at http://www.cs.utoronto.ca/~radford/ftp/R-lang-ext.pdf for a different proposal for a new unary @ operator. I'm not necessarily advocating that particular use (my ideas in this respect are still undergoing revisions), but the overall point is that there may well be several good uses of a unary @ operator (and there aren't many other good characters to use for a unary operator besides @). It is unclear to me that the current proposal is the highest-value use of @. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [WISH / PATCH] possibility to split string literals
> : > I don't think it is reasonable to change the parser this way. This is > currently valid R code: > > a <- "foo" > "bar" > > and with the new syntax, it is also valid, but with a different > meaning. Yes. The meaning of that would certainly need to stay the same. However, the following is currently invalid, but would mean what is desired if consecutive string literals were merged by the parser, as in C: a <- ("foo" "bar") In practice, most uses will be of something like cat ("I'd like to teach " "the world to sing" "\n") which unambigously means the right thing without further effort. So just doing like in C seems the way to go to me. Radford __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
> On Wed, 14 Jun 2017, G?bor Cs?rdi wrote: > > > I like the idea of string literals, but the C/C++ way clearly does not > > work. The Python/Julia way might, i.e.: > > > > """this is a > > multi-line > > lineral""" > > luke-tier...@uiowa.edu: > This does look like a promising option; some more careful checking > would be needed to make sure there aren't cases where currently > working code would be broken. I don't see how this proposal solves any problem of interest. String literals can already be as long as you like. The problem is that they will get wrapped around in an editor (or not all be visible), destroying the nice formatting of your program. With the proposed extension, you can write long string literals with short lines only if they were long only because they consisted of multiple lines. Getting a string literal that's 79 characters long with no newlines (a perfectly good error message, for example) to fit in your 80-character-wide editing window would still be impossible. Furthermore, these Python-style literals have to have their second and later lines start at the left edge, destroying the indentation of your program (supposing you actually wanted to use one). In contrast, C-style concatenation (by the parser) of consecutive string literals works just fine for what you'd want to do in a program. The only thing they wouldn't do that the Python-style literals would do is allow you to put big blocks of literal text in your program, without having to put quotes around each line. But shouldn't such text really be stored in a separate file that gets read, rather than in the program source? Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] attributes on symbols
Attributes on symbols seem like a bad idea to me. An additional obscure part of the global state seems undesirable. I can't see how any use of them would be preferrable to storing an environment in some global variable, in which the same information could be recorded. Note that the attributes on symbols are not saved when the workspace is saved with q("yes"). Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] bug: deparse sometimes omits parentheses for unary operators
> From: "Binder, Martin" > To: r-devel@r-project.org > Subject: [Rd] bug: deparse sometimes omits parentheses for unary > operators > > Example (bad): > > (expr = substitute(-a * 10, list(a = quote(if (TRUE) 1 else 0 > -if (TRUE) 1 else 0 * 10 > > The parentheses around the "if"-construct are missing. The expected > output (the true representation of the expression) is, in fact: > > -(if (TRUE) 1 else 0) * 10 It would also be OK for the result to be (-if (TRUE) 1 else 0) * 10 It's possible that producing this form is more natural for the deparser. There are a number of other bugs in the deparser, many of which I fixed in the pqR deparser when doing a major rewrite, though I missed the one you report above. You can see them at http://www.pqr-project.org/NEWS.txt Look for BUG FIXES under VERSION RELEASED 2015-09-14. Some are fixed in R-3.4.0, but most remain. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Please avoid direct use of NAMED and SET_NAMED macros
> To allow for future changes in the way the need for duplication is > detected in R internal C code, package C code should avoid direct > use of NAMED,and SET_NAMED, or assumptions on the maximal value > of NAMED. Use the macros MAYBE_REFERENCED, MAYBE_SHARED, and > MARK_NOT_MUTABLE instead. These currently correspond to > > MAYBE_REFERENCED(x): NAMED(x) > 0 > MAYBE_SHARED(x): NAMED(x) > 1 > MARK_NOT_MUTABLE(x): SET_NAMED(c, NAMEDMAX) > > Best, > > luke Checking https://cran.r-project.org/doc/manuals/r-release/R-exts.html shows that currently there is no mention of these macros in the documentation for package writers. Of course, the explanation of NAMED there also does not adequtely describe what it is supposed to mean, which may explain why it's often not used correctly. Before embarking on a major change to the C API, I'd suggest that you produce clear and complete documention on the new scheme. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Revert to R 3.2.x code of logicalSubscript in subscript.c?
; x <- numeric(N) > i <- rep(TRUE, length(x))# no reycling > system.time(for (r in 1:R) a <- x[i]) user system elapsed 0.156 0.036 0.192 > i <- TRUE# recycling > system.time(for (r in 1:R) a <- x[i]) user system elapsed 0.184 0.012 0.195 > > x <- numeric(N) > system.time(for (r in 1:R) a <- x[-1]) user system elapsed 0.060 0.012 0.075 > system.time(for (r in 1:R) a <- x[2:length(x)]) user system elapsed 0.052 0.024 0.075 > > v <- 2:length(x) > system.time(for (r in 1:R) a <- x[v]) user system elapsed 0.180 0.004 0.182 Summarizing elapsed times: LARGE VECTORS T1 T2 T3 T4 T5 T6 T7 R-3.4.2: 0.297 0.418 1.773 1.613 1.895 1.827 1.689 pqR dev: 0.193 0.434 0.988 1.105 0.435 0.436 0.998 SMALL VECTORS T1 T2 T3 T4 T5 T6 T7 R-3.4.2: 0.089 0.084 0.515 0.420 0.516 0.473 0.428 pqR dev: 0.038 0.087 0.192 0.195 0.075 0.075 0.182 As one can see, pqR is substantially faster for all except T2 (where it's about the same). The very large advantage of pqR on T5 and T6 is partly because pqR has special code for efficiently handling things like x[-1] and x[2:length(x)], so I added the x[v] test to see what performance is like when this special handling isn't invoked. There's no particular reason pqR's code for these operations couldn't be adapted for use in the R Core implementaton, though there are probably a few issues involving large vectors, and the special handling of x[2:length(x)] would require implementing pqR's internal "variant result" mechanism. pqR also has much faster code for some other subset and subset assignment operations. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] uniform sampling without replacement algorithm
r to hold result. */ r = allocVector(INTSXP,k); int *y = INTEGER(r); /* Do the sampling as described above. */ int i; for (i = 0; i < k; i++) { int j = 1 + (int) ((n-i) * unif_rand()); unsigned h; for (h = j & tblmask; ; h = (h+1) & tblmask) { if (tbl[h].pos == 0) { y[i] = j; break; } if (tbl[h].pos == j) { y[i] = tbl[h].val; break; } } unsigned h2; for (h2 = (n-i) & tblmask; ; h2 = (h2+1) & tblmask) { if (tbl[h2].pos == 0) { tbl[h].val = n-i; break; } if (tbl[h2].pos == n-i) { tbl[h].val = tbl[h2].val; break; } } tbl[h].pos = j; /* don't set until after search for entry n-i */ } VMAXSET(vmax); } return r; } This will be in a new version of pqR that will have many other performance improvements as well, which I expect to release in a few weeks. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Illegal Logical Values
> On Fri, 2017-10-20 at 14:01 +, brodie gaslam via R-devel wrote: > > I'm thinking of this passage: > > > > > Logical values are sent as 0 (FALSE), 1 (TRUE) or INT_MIN = > > > -2147483648 (NA, but only if NAOK is true), and the compiled code > > > should return one of these three values. (Non-zero values other > > > than INT_MIN are mapped to TRUE.) > > > > The parenthetical seems to suggest that something like 'LOGICAL(x)[0] > > = 2;' will be treated as TRUE, which it sometimes is, and sometimes > > isn't: > > From: Martyn Plummer > The title of Section 5.2 is "Interface functions .C and .Fortran" and > the text above refers to those interfaces. It explains how logical > vectors are mapped to C integer arrays on entry and back again on exit. > > This does work as advertised. Not always. As I reported on bugzilla three years ago (PR#15878), it only works if the logical argument does not have to be copied. The bug has been fixed in pqR since pqR-2014-09-30. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Extreme bunching of random values from runif with Mersenne-Twister seed
In the code below, you seem to be essentially using the random number generator to implement a hash function. This isn't a good idea. My impression is that pseudo-random number generation methods are generally evaluated by whether the sequence produced from any seed "appears" to be random. Informally, there may be some effort to make long sequences started with seeds 1, 2, 3, etc. appear unrelated, since that is a common use pattern when running a simulation several times to check on variability. But you are relying on the FIRST number from each sequence being apparently unrelated to the seed. I think few or none of the people designing pseudo-random number generators evaluate their methods by that criterion. There is, however, a large literature on hash functions, which is what you should look at. But if you want a quick fix, perhaps looking not at the first number in the sequence, but rather (say) the 10th, might be preferable. Radford Neal > > seeds = c(86548915L, 86551615L, 86566163L, 86577411L, 86584144L, > 86584272L, > + 86620568L, 86724613L, 86756002L, 86768593L, 86772411L, 86781516L, > + 86794389L, 86805854L, 86814600L, 86835092L, 86874179L, 86876466L, > + 86901193L, 86987847L, 86988080L) > > > > random_values = sapply(seeds, function(x) { > + set.seed(x) > + y = runif(1, 17, 26) > + return(y) > + }) > > > > summary(random_values) >Min. 1st Qu. MedianMean 3rd Qu.Max. > 25.13 25.36 25.66 25.58 25.83 25.94 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Faster sorting algorithm...
Those interested in faster sorting may want to look at the merge sort implemented in pqR (see pqR-project.org). It's often used as the default, because it is stable, and does different collations, while being faster than shell sort (except for small vectors). Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2, compiled identically: - pqR-2020-07-23 in C locale: > set.seed(1) > N <- 100 > x <- as.character (sample(N,N,replace=TRUE)) > print(system.time (os <- order(x,method="shell"))) user system elapsed 1.332 0.000 1.334 > print(system.time (or <- order(x,method="radix"))) user system elapsed 0.092 0.004 0.096 > print(system.time (om <- order(x,method="merge"))) user system elapsed 0.363 0.000 0.363 > print(identical(os,or)) [1] TRUE > print(identical(os,om)) [1] TRUE > > x <- c("a","~") > print(order(x,method="shell")) [1] 1 2 > print(order(x,method="radix")) [1] 1 2 > print(order(x,method="merge")) [1] 1 2 - R-4.0.2 in C locale: > set.seed(1) > N <- 100 > x <- as.character (sample(N,N,replace=TRUE)) > print(system.time (os <- order(x,method="shell"))) user system elapsed 2.381 0.004 2.387 > print(system.time (or <- order(x,method="radix"))) user system elapsed 0.138 0.000 0.137 > #print(system.time (om <- order(x,method="merge"))) > print(identical(os,or)) [1] TRUE > #print(identical(os,om)) > > x <- c("a","~") > print(order(x,method="shell")) [1] 1 2 > print(order(x,method="radix")) [1] 1 2 > #print(order(x,method="merge")) pqR-2020-07-23 in fr_CA.utf8 locale: > set.seed(1) > N <- 100 > x <- as.character (sample(N,N,replace=TRUE)) > print(system.time (os <- order(x,method="shell"))) utilisateur système écoulé 2.960 0.000 2.962 > print(system.time (or <- order(x,method="radix"))) utilisateur système écoulé 0.083 0.008 0.092 > print(system.time (om <- order(x,method="merge"))) utilisateur système écoulé 1.143 0.000 1.142 > print(identical(os,or)) [1] TRUE > print(identical(os,om)) [1] TRUE > > x <- c("a","~") > print(order(x,method="shell")) [1] 2 1 > print(order(x,method="radix")) [1] 1 2 > print(order(x,method="merge")) [1] 2 1 R-4.0.2 in fr_CA.utf8 locale: > set.seed(1) > N <- 100 > x <- as.character (sample(N,N,replace=TRUE)) > print(system.time (os <- order(x,method="shell"))) utilisateur système écoulé 4.222 0.016 4.239 > print(system.time (or <- order(x,method="radix"))) utilisateur système écoulé 0.136 0.000 0.137 > #print(system.time (om <- order(x,method="merge"))) > print(identical(os,or)) [1] TRUE > #print(identical(os,om)) > > x <- c("a","~") > print(order(x,method="shell")) [1] 2 1 > print(order(x,method="radix")) [1] 1 2 > #print(order(x,method="merge")) pqR is faster in all the tests, but more relevant to this discussion is that the "merge" method is substantially faster than "shell" for these character vectors with a million elements, while retaining the stability and collation properties of "shell" (whereas "radix" only does C collation). It would probably not be too hard to take the merge sort code from pqR and use it in R core's implementation. The merge sort in pqR doesn't exploit parallelism at the moment, but merge sort is potentially quite parallelizable (though I think the storage allocation strategy I use would have to be modified). Regards, Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] svd for Large Matrix
> Dario Strbenac writes: > > I have a real scenario involving 45 million biological cells > (samples) and 60 proteins (variables) which leads to a segmentation > fault for svd. I thought this might be a good example of why it > might benefit from a long vector upgrade. Rather than the full SVD of a 4500x60 X, my guess is that you may really only be interested in the eigenvalues and eigenvectors of X^T X, in which case eigen(t(X)%*%X) would probably be much faster. (And eigen(crossprod(X)) would be even faster.) Note that if you instead want the eigenvalues and eigenvectors of X X^T (which is an enormous matrix), the eigenvalues of this are the same as those of X^T X, and the eigenvectors are Xv, where v is an eigenvector of X^T X. For example, with R 4.0.2, and the reference BLAS/LAPACK, I get > X<-matrix(rnorm(10),1,10) > system.time(for(i in 1:1000) rs<-svd(X)) user system elapsed 2.393 0.008 2.403 > system.time(for(i in 1:1000) re<-eigen(crossprod(X))) user system elapsed 0.609 0.000 0.609 > rs$d^2 [1] 10568.003 10431.864 10318.959 10219.961 10138.025 10068.566 9931.538 [8] 9813.841 9703.818 9598.532 > re$values [1] 10568.003 10431.864 10318.959 10219.961 10138.025 10068.566 9931.538 [8] 9813.841 9703.818 9598.532 Possibly some other LAPACK might implement svd better, though I suspect that R will allocate more big matrices than really necessary for the svd even aside from whatever LAPACK is doing. Regards, Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] string concatenation operator (revisited)
> The TL;DR version is base R support for a `+.character` method. This > would essentially provide a shortcut to `paste0`... In pqR (see pqR-project.org), I have implemented ! and !! as binary string concatenation operators, equivalent to paste0 and paste, respectively. For instance, > "hello" ! "world" [1] "helloworld" > "hello" !! "world" [1] "hello world" > "hello" !! 1:4 [1] "hello 1" "hello 2" "hello 3" "hello 4" This seems preferable to overloading the + operator, which would lead to people reading code wondering whether a+b is doing an addition or a string concatenation. There are very few circumstances in which one would want to write code where a+b might be either of these. So it's better to make clear what is going on by having a different operator for string concatenation. Plus ! and !! semm natural for representing paste0 and paste, whereas using ++ for paste (with + for paste0) would look rather strange. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] string concatenation operator (revisited)
> > In pqR (see pqR-project.org), I have implemented ! and !! as binary > > string concatenation operators, equivalent to paste0 and paste, > > respectively. > > > > For instance, > > > > > "hello" ! "world" > > [1] "helloworld" > > > "hello" !! "world" > > [1] "hello world" > > > "hello" !! 1:4 > > [1] "hello 1" "hello 2" "hello 3" "hello 4" > > I'm curious about the details: > > Would `1 ! 2` convert both to strings? They're equivalent to paste0 and paste, so 1 ! 2 produces "12", just like paste0(1,2) does. Of course, they wouldn't have to be exactly equivalent to paste0 and paste - one could impose stricter requirements if that seemed better for error detection. Off hand, though, I think automatically converting is more in keeping with the rest of R. Explicitly converting with as.character could be tedious. I suppose disallowing logical arguments might make sense to guard against typos where ! was meant to be the unary-not operator, but ended up being a binary operator, after some sort of typo. I doubt that this would be a common error, though. (Note that there's no ambiguity when there are no typos, except that when negation is involved a space may be needed - so, for example, "x" ! !TRUE is "xFALSE", but "x"!!TRUE is "x TRUE". Existing uses of double negation are still fine - eg, a <- !!TRUE still sets a to TRUE. Parsing of operators is greedy, so "x"!!!TRUE is "x FALSE", not "xTRUE".) > Where does the binary ! fit in the operator priority? E.g. how is > > a ! b > c > > parsed? As (a ! b) > c. Their precedence is between that of + and - and that of < and >. So "x" ! 1+2 evalates to "x3" and "x" ! 1+2 < "x4" is TRUE. (Actually, pqR also has a .. operator that fixes the problems with generating sequences with the : operator, and it has precedence lower than + and - and higher than ! and !!, but that's not relevant if you don't have the .. operator.) Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Announcing pqR - a faster version of R
I have released a new, faster, version of R, which I call pqR (for "pretty quick" R), based on R-2.15.0. Among many other improvements, pqR supports automatic use of multiple cores to perform numerical computations in parallel with other numerical computations, and with the interpretive thread. It also implements a true reference counting scheme to reduce the amount of unnecessary duplication of objects. There are also substantial speed ups in general interpretive overhead, and in particular operations. Readers of r-devel can try out pqR by downloading a source tarball from radfordneal.github.io/pqR (only tested on Linux/Unix so far). The source repository is github.com/radfordneal/pqR - look in the MODS file to see how the changes from R-2.15.0 are organized. The R Core Team may wish to look at the list of bugs fixed in pqR, in the NEWS file, since many of them are present in their current version of R. I will be making a series of posts discussing pqR at my blog, which is at radfordneal.wordpress.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] missing PROTECT() in src/main/arithmetic.c
> From: Herv? Pag?s > at lines 651 & 653 (integer_binary function): > > if (code == DIVOP || code == POWOP) > ans = allocVector(REALSXP, n); > else > ans = allocVector(INTSXP, n); > > There are calls to warningcall() later in the function, which can > trigger garbbage collection. > > Looks like the typical scenario where it seemed pretty safe to not > PROTECT in the original version of the function but became very > unsafe 3 years later when the calls to warningcall() were added to > the function. Note that there is also a problem with a possible warning from myfmod, which in turn is called from R_pow. The call of myfmod from R_pow should probably be replaced by something else, since as it is, the following undesirable behaviour occurs: > (-Inf)^(1e16) [1] Inf Warning message: probable complete loss of accuracy in modulus I think issuing a warning for this is probably not a good idea, but if a warning is issued, it certainly shouldn't be this one. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Compiler warning: function returns address of local variable
> > ..., my previous > > dev-lang/R-2.10.1 ebuild package has been upgraded to R-3.0.1. > > While compiling it, I have got the following compiler warning: > > > > * QA Notice: Package triggers severe warnings which indicate that it > > *may exhibit random runtime failures. > > * main.c:1548:5: warning: function returns address of local variable > > [enabled by default] > > > > * Please do not file a Gentoo bug and instead report the above QA > > * issues directly to the upstream developers of this software. > > * Homepage: http://www.r-project.org/ > That is the whole point of that piece of C code, so please do report the > Gentoo bug (their message is simply bogus) to them. > > And also read posting guide before posting: HTML mail is not allowed here. > > -- > Brian D. Ripley, rip...@stats.ox.ac.uk The message is certainly not "simply bogus". Returning the address of a local variable is almost always a bug. In this case, there is no real bug in R-3.0.1, since the result is used for the low-level, system-dependent operation of determining the direction of stack growth, but this is a very rare use. For the compiler to issue such a warning is fully justified. The warning could be avoided by using the approach taken in pqR, which has the following routine in main.c: /* Detemine whether the stack grows down (+1) or up (-1). Passed the address of a local variable in the caller's stack frame. This is put here, though called only from system.c, so that the compiler will not be able to inline it. */ int R_stack_growth_direction (uintptr_t cvaraddr) { int dummy; return (uintptr_t) &dummy < cvaraddr ? 1 : -1; } __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Dynamic list creation (SEXP in C) returns error "unimplemented type (29) in 'duplicate'" Dynamic list creation (SEXP in C) returns error
"changing its size using SETLENGTH (Rf_lengthgets)" NO! NO! NO! SETLENGTH does not do the same thing as Rf_lengthgets. You should never use SETLENGTH, which is not part of the API, and really should not even exist. For that matter Rf_lengthgets is also not part of the API. I recommend getting the length right to start with. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Dynamic list creation (SEXP in C) returns error "unimplemented type (29) in 'duplicate'" Dynamic list creation (SEXP in C) returns error
On Wed, Nov 06, 2013 at 02:40:59PM -0300, George Vega Yon wrote: > Hi! You are right, what I actually use is SET_LENGTH... Is that ok? > El 06/11/2013 14:00, "Radford Neal" escribi?: Is SET_LENGTH a documented feature of the API? Not that I can see. However, it is indeed different from SETLENGTH. Perhaps your problem is in the following code you posted: REPROTECT(SET_LENGTH(id,length(lambda) + z), ipx0); REPROTECT(SET_LENGTH(lambda,length(lambda) + z), ipx1); It looks like the first line should have length(id), not length(lambda). Your usage of REPROTECT and duplicate seems odd. If you make the stuff you allocate part of your original list as soon as you allocate it, you only need to protect the original list. And it's not clear why you think you need to duplicate vectors. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Varying results of package checks due to random seed
> From: Philippe GROSJEAN > > ... for latest CRAN version, we have successfully installed 4999 > packages among the 5321 CRAN package on our platform. ... It is > strange that a large portion of R CMD check errors on CRAN occur and > disappear *without any version update* of a package or any of its > direct or indirect dependencies! That is, a fraction of errors or > warnings seem to appear and disappear without any code update. Some of these are likely the result of packages running tests using random number generation without setting the random numbers seed, in which case the seed is set based on the current time and process id, with an obvious possibility of results varying from run to run. In the current development version of pqR (in branch 19-mods, found at https://github.com/radfordneal/pqR/tree/19-mods), I have implemented a change so that if the R_SEED environment variable is set, the random seed is initialized to its value, rather than from the time and process id. This was motivated by exactly this problem - I can now just set R_SEED to something before running all the package checks. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 133, Issue 23
> From: Richard Cotton > > The rep function is very versatile, but that versatility comes at a > cost: it takes a bit of effort to learn (and remember) its syntax. > This is a problem, since rep is one of the first functions many > beginners will come across. Of the three main uses of rep, two have > simpler alternatives. > > rep(x, times = ) has rep.int > rep(x, length.out = ) has rep_len > > I think that a rep_each function would be a worthy addition for the > third use case > > rep(x, each = ) > > (It might also be worth having rep_times as a synonym for rep.int.) I think this is exactly the wrong approach. Indeed, the aim should be to get rid of functions like rep.int (or at least discourage their use, even if they have to be kept for compatibility). Why is rep_each(x,n) better than rep(x,each=n)? There is no saving in typing (which would be trivial anyway). There *ought* to be no significant difference in speed (though that seems to have been the motive for rep.int). Are you trying to let students learn R without ever learning about specifying arguments by name? And where would you stop? How about seq_by(a,b,s) rather than having to arduously type seq(a,b,by=s)? Maybe we should have glm_binomial, glm_poisson, etc. so we don't have to remember the "family" argument? This way lies madness... Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Unfixed bugs in latest R-patched
ded. The call > apply(array(1:24,c(2,3,4)),-3,sum) now produces correct results (the same as when MARGIN is 1:2). o Fixed a bug in which Im(matrix(complex(0),3,4)) returned a matrix of zero elements rather than a matrix of NA elements. o Now check for bad arguments for .rowSums, .colSums, .rowMeans, and .rowMeans (would previously segfault if n*p too big). o Fixed a bug where excess warning messages may be produced on conversion to RAW. For instance: > as.raw(1e40) [1] 00 Warning messages: 1: NAs introduced by coercion 2: out-of-range values treated as 0 in coercion to raw Now, only the second warning message is produced. o A bug has been fixed in which rbind would not handle non-vector objects such as function closures, whereas cbind did handle them, and both were documented to do so. Regards, Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Unfixed bugs in latest R-patched
On Tue, Jun 24, 2014 at 09:20:30AM +0200, Duncan Murdoch wrote: > It's hardly fair to call these "unfixed bugs". Most of them are better > described as "unreported bugs", or, after this posting, "bugs reported > with insufficient detail". We can't fix them if we don't know what they > are, and it's not easy to determine that for most of the items below. I think most of them are pretty clear. For the ones referring to lack of protection or argument error checking code, I suggesting doing a "diff" on the source files to find where I made changes. If your complaint is that your time is more valuable than my time, so I should be producing a nice patch for you, then I don't agree. I went to the effort of verifing that the bugs are still present in r66002. That's as much as I'm willing to do. It's up to you what you do. Radford Neal > On 23/06/2014, 5:24 PM, Radford Neal wrote: > > A new version of pqR is now available at pqR-project.org, which fixes > > several bugs that are also present in the latest R Core patch release > > (r66002). A number of bugs found previously during pqR development > > are also unfixed in the latest R Core release. Here is the list of > > these bugs that are unfixed in r66002 (including documentation > > deficiencies), taken from the pqR bug fix and documentation update > > sections of the pqR NEWS file (pqR-project.org/NEWS.txt): > > > > o The documentation for c now says how the names for the result are > > determined, including previously missing information on the > > use.names argument, and on the role of the names of arguments in > > the call of c. This documentation is missing in R-2.15.0 and > > R-3.1.0. > > > > o The documentaton for diag now documents that a diagonal matrix is > > always created with type double or complex, and that the names of > > an extracted diagonal vector are taken from a names attribute (if > > present), if not from the row and column names. This information > > is absent in the documentation in R-2.15.1 and R-3.1.0. > > > > o Incorrect information regarding the pointer protection stack was > > removed from help(Memory). This incorrect information is present > > in R-2.15.0 and R-3.1.0 as well. > > > > o There is now information in help(Arithmetic) regarding what > > happens when the operands of an arithmetic operation are NA or > > NaN, including the arbitrary nature of the result when one > > operand is NA and the other is NaN. There is no discussion of > > this issue in the documentation for R-2.15.0 and R-3.1.0. > > > > o Fixed lack of protection bugs in the equal and greater functions > > in sort.c. These bugs are also present in R-2.15.0 and R-3.1.0. > > > > o Fixed lack of protection bugs in the D function in deriv.c. > > These bugs are also present in R-2.15.0 and R-3.1.0. > > > > o Fixed argument error-checking bugs in getGraphicsEventEnv and > > setGraphicsEventEnv (also present in R-2.15.0 and R-3.1.0). > > > > o Fixed the following bug (also present in R-2.15.0 and R-3.0.2): > > > > x <- t(5) > > print (x %*% c(3,4)) > > print (crossprod(5,c(3,4))) > > > > The call of crossprod produced an error, whereas the > > corresponding use of %*% does not. > > > > In pqR-2013-12-29, this bug also affected the expression t(5) %*% > > c(3,4), since it is converted to the equivalent of > > crossprod(5,c(3,4)). > > > > o Fixed a problem in R_AllocStringBuffer that could result in a > > crash due to an invalid memory access. (This bug is also present > > in R-2.15.0 and R-3.0.2.) > > > > o The documentation for aperm now says that the default method does > > not copy attributes (other than dimensions and dimnames). > > Previously, it incorrecty said it did (as is the case also in > > R-2.15.0 and R-3.0.2). > > > > o Fixed the following bug (also present in R-2.15.0 and R-3.0.2): > > > > v <- c(1,2) > > m <- matrix(c(3,4),1,2) > > print(t(m)%*%v) > > print(crossprod(m,v)) > > > > in which crossprod gave an error rather than produce the answer > > for the corresponding use of %*%. > > > > o Raising -Inf to a large value (eg, (-Inf)^(1e16)) no longer > > produces an incom
Re: [Rd] Unfixed bugs in latest R-patched
> Duncan Murdoch: > > No, I don't think it's reasonable to expect you to write a patch, but > reporting the bugs in the R bug reporting system isn't that hard to do, > and does lead to fixes pretty rapidly in cases where the report contains > sample code to reproduce the problem. Sometimes. Sometimes not. For instance, PR #14985, a significant set of bugs, with an easy fix (patch provided), which took almost two years to make it to an R Core release - perhaps because you were more interested in trying to argue that they weren't really bugs at all... > "Fixed a problem in R_AllocStringBuffer that could result in a crash due > to an invalid memory access" sounds serious, but is just too vague to > follow up. I would expect that doing a diff on the source files is > going to find all sorts of stuff: pqR isn't just R with bugs fixed, it > has a lot of other changes too. You might expect that if it was really that difficult, I would have given more detail. I think if you actually looked at this procedure, which is about 30 lines long, you might, seeing as you've been warned that it has a bug, find the problem in about 30 seconds, even without looking at the pqR source code, which of course isn't difficult to do. > it would be more > helpful to the community if the bugs actually got fixed. Indeed. > I think all of > the bugs that you reported last June got fixed within a couple of weeks Actually, the last six in the list I just posted are from the original pqR release last June. Some of the six don't seem too crucial, but two of them seem reasonably serious (one leads to R crashing). > Why not report them > more frequently than annually, and give the details you already have so > they are easier to fix? I did report at least one bug not long ago (which got fixed), after seeing (as now) that an R Core release was imminent, and therefore thinking it would be best if a fix was put in before it went out. You're of course welcome to look at the NEWS file at pqR-project.org whenever you like. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Unfixed bugs in latest R-patched
> > o Fixed the following bug (also present in R-2.15.0 and R-3.0.2): > > > x <- t(5) > > print (x %*% c(3,4)) > > print (crossprod(5,c(3,4))) > > > The call of crossprod produced an error, whereas the > > corresponding use of %*% does not. > > > o Fixed the following bug (also present in R-2.15.0 and R-3.0.2): > > > v <- c(1,2) > > m <- matrix(c(3,4),1,2) > > print(t(m)%*%v) > > print(crossprod(m,v)) > > > in which crossprod gave an error rather than produce the answer > > for the corresponding use of %*%. > As S and R's terminology has always been mathematically correct and > "knows" that a vector is not a 1-row matrix or 1-column matrix > (which some people call row-vector and column-vector), R has > indeed much more freedom to be lenient in how to promote vectors > to matrices, and it has always done so tolerantly for some cases > of "%*%", but even there is room for discussion, see below. > However, one can argue that the current behavior is quite > consistent {there is more than one kind of consistency here; in > this case it's the consistency of underlying C code IIRC}, and > not strictly a bug. If you look at help(crossprod), you'll see that it says that crossprod(x,y) is equivalent to t(x) %*% y. But it's not, since the latter works in some cases where the former gives an error. So it's a bug. Furthermore, crossprod(x,y) ought to be equivalent to t(x) %*% y, even in cases where a vector needs to get promoted to a matrix, because such cases can arise from inadvertant dropping of dimensions when subscriptiong a matrix (eg, with 1:n, with n happening to be 1). Sometimes, this problem gets fixed by the transpose. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Need help on calling Head from C
>> PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE))); >> SET_TAG(CDDR(dfm), install("stringsAsFactors")) ; >> SEXP res = PROTECT(eval(dfm,R_GlobalEnv)); >> PROTECT(head=lang3(install("head"),res,ScalarInteger(1))); >> head = PROTECT(eval(head,R_GlobalEnv)); >> >> >> I tried the above following , it seemed to be not working . Can you please >> help. >> > >Can you elaborate? The above code works AFAICT ... The code is actually not safe. Both "install" and "SalarLogical/Integer" potentially allocate memory, so at least one needs to be protected before callling lang3. (Passing one such argument would be OK, since lang3 protects its arguments, but it doesn't get a chance to do that while they are still being evaluated.) Now, I'm not sure this is the source of the actual problem, since both "data.frame" and "head" presumably already exist, so the install won't actually allocate memory. But this is not a safe coding method in general. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Need help on calling Head from C
> Thank you for the wonderful suggestion . do you suggest to protect > ScalarInteger(1) > before calling lang3 ? Yes, either the result of Scalar Integer(1) or the result of install should be protected (before calling lang3 - don't try to do it inside the argument list of lang3). __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Need help on calling Head from C
> >The code is actually not safe. Both "install" and "SalarLogical/Integer" > >potentially allocate memory, so at least one needs to be protected before > >callling lang3. (Passing one such argument would be OK, since lang3 > >protects its arguments, but it doesn't get a chance to do that while they > >are still being evaluated.) > > How true but who can blame him? The Writing R Extensions manual > contains the same mistake: > > SEXP mkans(double x) > { > SEXP ans; > ans = PROTECT(allocVector(REALSXP, 1)); > REAL(ans)[0] = x; > UNPROTECT(1); > return ans; > } > > double feval(double x, SEXP f, SEXP rho) > { > defineVar(install("x"), mkans(x), rho); > return REAL(eval(f, rho))[0]; > } A further comment on this... Currently, symbols are never garbage collected, so you might think the above is OK, since the result of install("x") can't be lost wen mkans(x) is called. However, I think it's not a good idea to rely on symbols never being collected. In any case, though, even if you are relying on that, the code isn't safe because C doesn't specify the order of evaluation of argments, so mkans(x) might be called before install("x"). One should also note that the PROTECT within mkans is unnecessary, and must surely be confusing to anyone who thought (correctly) that they had understood what PROTECT is for. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Need help on calling Head from C
> >> SEXP mkans(double x) > >> { > >> SEXP ans; > >> ans = PROTECT(allocVector(REALSXP, 1)); > >> REAL(ans)[0] = x; > >> UNPROTECT(1); > >> return ans; > >> } > >One should also note that the PROTECT within mkans is unnecessary, > >and must surely be confusing to anyone who thought (correctly) > >that they had understood what PROTECT is for. > > I understand what PROTECT is for and I don't find the PROTECT in mkans > confusing. > > Maybe it's not necessary now. But it's so much simpler/safer to just > systematically protect any freshly allocated SEXP. One day > someone might need to modify mkans(), say, add a call to warning() or > Rprintf() after the call to allocVector(), and will most likely forget > to add the PROTECT/UNPROTECT that will then become necessary. There's certainly something to be said in favour of the "better safe than sorry" approach to using PROTECT. But in the context of a tutorial, one shouldn't put one in that's unnecessary without a comment saying, "not needed at the moment, but would be if code that allocates more memory is added later, so let's be safe and do it now too". Otherwise, the reader may infer an incorrect model of when to PROTECT, such as "you should PROTECT every allocated object, then UNPROTECT at the end of the procedure - do that and you'll be OK". Of course, that's not enough to be OK. And to illustrate the correct model, one needs some negative examples as well as positive examples. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] arguments to .Call(), .External should be read-only?
> hi. if i'm reading correctly, "Writing R Extensions" appears to be > inconsistent on the question of whether the arguments passed to a > routine called via .Call() or .External() should considered read-only. There are two situations to consider. 1) You want to modify the arguments as a convenience in your computation, but in the end you will return information to the caller via the value returned from .Call. 2) You want to modify the arguments as a way of returning information to the caller. For instance: x <- numeric(100) # x contains zeros .Call("myfunction",x) # use new values now present in x For (1), if you want to modify a argument x, you just need to first check whether NAMED(x) is non-zero. If it is, you need to execute x=duplicate(x) (and probably PROTECT the result) before changing x. You can return x (the original or the duplicate) as the result of the .Call if you like, or just use it as an intermediate value in the computation. For (2), you need to be sure that the value passed to .Call is unshared. (Checking NAMED(x) and duplicating if it is non-zero won't work, since then the caller doesn't get to see the information put in the duplicate of x.) Unfortunately, there is no documentation on when objects are guaranteed to be unshared. The technique of (2) is used sometimes in standard R packages. For example, the "arima" function in the "stats" package calls ARIMA_Like and modifies its arguments as a way of returning information. It's presumably also used in many contributed packages. Since it's probably too much to eliminate all uses of (2), I think documentation on when an object is guaranteed to be unshared is needed. I propose that the following be guaranteed to produce an unshared object: 1) The vector creation functions: vector, numeric, integer, logical, complex, raw, and character. 2) The array creation functions: array and matrix. 3) The rep and rep.int functions, which are sometimes used in place of a vector creation function - eg, rep(0,10) rather than numeric(10). I think it should be assumed that any other function might return an unshared result (eg, 1-1 might return a shared zero object). Of course, the results from the above functions might become shared later, too (if assigned to more than one variable, for instance), but code like that above for .Call("myfunction",x) would be guaranteed to work. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] diag(x, n) not preserving integer and logical x
Martin Maechler wrote: > diag(x) preserves the storage mode of x for 'complex' and > 'double' precision, but converts integer and logicals to double : Duncan Murdoch wrote: > I think the change to preserve integer makes sense, but preserving > logical does not. A diagonal matrix has zeros off the diagonal, and > they are not logical. Having diag() sometimes return a matrix with > FALSE off the diagonal just looks wrong. Making diag(1:10) return an integer matrix does make sense, but it could still be undesirable. Often, the user will have intended to get a real matrix, and will after another operation, but only after an unnecessary coercion is done. For example: x<-diag(1:10); x[1,2]<-2.0 I personally think keeping diag(1:10) integer is best, but I thought I'd point out the contrary argument. Having diag(rep(TRUE,10)) return a logical matrix with FALSE off the diagonal does not strike me as too odd. It makes sense if you think of (FALSE, TRUE) as forming a field under operations of sum=XOR and product=AND. And of course it will typically be converted to a 0/1 matrix later if that's what was actually intended. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] New flag bit for serialized used by pqR
I will shortly be releasing a new version of pqR (you can get a test version from pqR-project.org now - scroll to the bottom of the page). One new feature in this version requires adding a bit to the flags written out when data is serialized. I thought I'd let you know about this so as to avoid any possible conflicts. The new feature is that a few R objects are defined as constants, which are probably put in read-only memory by the C compiler. These include NULL, TRUE, FALSE, NA, 0L, 1L, ... 10L, 0.0, 1.0, and the one-element pairlists containing these. Apart from NULL, these constants are not guaranteed to be used for all instances of these values, but they often are. When data is serialized and then read back in, I'd like for the occurrences of these constants to be re-created as constants. One might always use these constants when reading in data, but I'm not doing that now because I'm not confident that there is no code relying on certain instances of these objects being unshared. So I write out the constants with a flag bit saying they are constants, and re-create them as constants only if this bit is set (and they have constant versions in the current implementation). Old workspaces will never have this bit set, so nothing will be re-created as constants (except NULL). If a workspace with some of these constant flag bits set is read by an old version of R, the flag bits will just be ignored (by UnpackFlags in serialize.c), so the objects will be restored the same as if they had been written by such an old version. So this should all work fine unless R Core implementations start using this bit for something else. (Or unless some old version of R used it for something else - which isn't the case as far as I can tell, but please let me know if you know of such usage.) There are four more unused bits in the 32-bit word that is written out, plus two more could be scrounged by storing the "type" in six bits rather than eight, so there doesn't seem to be an immediate shortage of bits. The relevant declarations in serialize.c are as follows: /* * Type/Flag Packing and Unpacking * * To reduce space consumption for serializing code (lots of list * structure) the type (at most 8 bits), several single bit flags, * and the sxpinfo gp field (LEVELS, 16 bits) are packed into a single * integer. The integer is signed, so this shouldn't be pushed too * far. It assumes at least 28 bits, but that should be no problem. */ #define IS_OBJECT_BIT_MASK (1 << 8) #define HAS_ATTR_BIT_MASK (1 << 9) #define HAS_TAG_BIT_MASK (1 << 10) #define IS_CONSTANT_MASK (1 << 11) /* <<--- added in pqR */ #define ENCODE_LEVELS(v) ((v) << 12) #define DECODE_LEVELS(v) ((v) >> 12) #define DECODE_TYPE(v) ((v) & 255) Please let me know if you see any problem with this, or if for some reason you'd prefer that I use one of the other four available bits (in the top of the 32-bit word). Regards, Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] ScalarLogical and setAttrib
> From: Jeroen Ooms > > It seems like ScalarLogical returns a singleton object, which is not > the case for ScalarInteger or ScalarReal. I am currently working > around this using duplicate(ScalarLogical(0)), but was quite surprised > by this behavior of ScalarLogical. > From: Hadley Wickham > > I believe this is by design (and changed relatively recently). FALSE and > TRUE are singletons, like NULL. No, TRUE and FALSE aren't like NULL. There's only ever one NULL, but there can be many TRUE and FALSE objects. For example, see below, done with R-3.1.2: > .Internal(inspect(TRUE)) @2d84488 10 LGLSXP g0c1 [NAM(2)] (len=1, tl=0) 1 > .Internal(inspect(TRUE|FALSE)) @2d84308 10 LGLSXP g0c1 [] (len=1, tl=0) 1 The problem is a combination of not documenting exactly what ScalarLogical is supposed to do, and then changing what it does. I think it's pretty reasonable for people to have assumed they could attach an attribute to the result of ScalarLogical, given the lack of any explicit documentation. Of course, paranoid users would have checked NAMED on the result to see it they need to duplicate it, but the documentation on that is less explicit than it might be as well. This is why pqR still returns an unshared object for ScalarLogical. Internally, pqR has a ScalarLogicalMaybeShared function that returns the shared version, which is in read-only memory so that any attempt to change it will cause an error. (Similarly, there are ScalarRealMaybeShared, etc.) Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Error "promise already under evaluation ..."
> The above question still stands, but otherwise, I overlooked the most > obvious solution: > > dim_1 <- function(x) dim(x) > > Unit: nanoseconds > expr minlq mean medianuq max neval cld > dim(x) 0 172.941 1 12696 1000 a > base::dim(x) 11549 13474 15105.950 14245 15399 60824 1000 c > dim_1(x) 1 771 2801.544771 1156 1806225 1000 a > dim_R(x) 5390 6930 8077.753 7315 8085 249069 1000 b > dim_R_memoized(x) 1156 1926 2520.119 2310 2695 73528 1000 a >dim_R_memoized_2(x) 385 771 1089.243771 1156 20019 1000 a > dim_illegal(x) 0 1 161.480 1 3862311 1000 a > sum(x) 10395 15784 16459.454 15785 16169 114333 1000 c > > So, my best shot on the original problem would now be to either use: > > dim2 <- function(x) dim(x) > foo <- function(x, dim=dim2(x)) { dim } > > or simply avoid the name clash via: > > foo <- function(x, dimx=dim(x)) { dimx } I think you'll find that baseenv()$dim(x) and .BaseNamespaceEnv$dim(x) are about 25 times faster than base::dim(x). This doesn't seem like it should be necessary... Radford __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] vapply definition question
> so I guess you're looking at a modified version of the function... The > implementation detail is in the comment -- FUN(X[i], ...) is evaluated in the > frame of lapply. > > Martin Morgan You may find it interesting that in the latest version of pqR, lapply is made faster in the case where no ... is used by then not actuallly passing ... to FUN, which significatly reduces call overhead. See the code at https://github.com/radfordneal/pqR/blob/Release-2014-11-16/src/main/apply.c __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Unexpected behavior of debug() in step-wise mode
41;309;0c> Why does debug() enter Browse[3] here at all, and why does it happen the > first time and not the second? This seems unexpected to me, and has > undesirable effects for ESS users (that I reported here - > https://stat.ethz.ch/pipermail/ess-help/2013-June/009154.html - but just > realized my post to r-devel didn't make it through when I tried to report > it back then). > > > Fun <- function(n) print(n) > > debug(Fun) > > Fun(2) > debugging in: Fun(2) > debug: print(n) > Browse[2]> {n+2} > debug at #1: n + 2 I think this is the result of debugging being implemented by setting a flag in the environment of the function being debugged. That's how the debug status gets passed on to inner statements. Your {n+2} expression looks like an inner statement, and of course has the environment of the call of Fun. (In contrast, (n+2) doesn't do this, because "(" doesn't check for debugging.) You could avoid it with some hack like using (function(){n+2})() rather than {n+2}. (I'm assuming there's some reason you're not just using (n+2), with parentheses rather than curly brackets). I've considered changing how this works in pqR, using pqR's "variant return" mechanism to pass on the information that an inner statement should be debugged. This would also allow one to avoid entering the debugger for "if" when it is used as an expression rather than a statement (for example, a <- if (x>0) 3 else 4). Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Recycling memory with a small free list
> ... with assignments inside of loops like this: > > reweight = function(iter, w, Q) { > for (i in 1:iter) { > wT = w * Q > } > } > ... before the RHS is executed, the LHS allocation would be added > to a small fixed length list of available space which is checked > before future allocations. If the same size is requested before the > next garbage collection, the allocation is short-circuited and the > allocation is reused. This list could be very small, possibly even > only a single entry. Entries would only be put on the list if they > have no other references. Reusing the LHS storage immediately isn't possible in general, because evaluation of the RHS might produce an error, in which case the LHS variable is supposed to be unchanged. Detecting special cases where there is guaranteed to be no error, or at least no error after the first modification to newly allocated memory, might be too complicated. Putting the LHS storage on a small free list for later reuse (only after the old value of the variable will definitely be replaced) seems more promising (then one would need only two copies for examples such as above, with them being used in alternate iterations). However, there's a danger of getting carried away and essentially rewriting malloc. To avoid this, one might try just calling "free" on the no-longer-needed object, letting "malloc" then figure out when it can be re-used. Unfortunately, that seems not to be safe, because it's posslble that there is a reference to the no-longer-needed object on the PROTECT stack, even though no one should actually be looking at it any more. In the current version of pqR (see pqR-project.org), modifications are (often) done in place for statements such as w = w * Q, but not curretly when the LHS variable does not appear on the RHS. Regards, Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Recycling memory with a small free list
Radford Neal: > > there's a danger of getting carried away and essentially rewriting > > malloc. To avoid this, one might try just calling "free" on the > > no-longer-needed object, letting "malloc" then figure out when it can > > be re-used. Nathan Kurz: > Yes, I think that's what I was anticipating: add a free() equivalent... Radford Neal: > > Unfortunately, that seems not to be safe, because it's > > possible that there is a reference to the no-longer-needed object on > > the PROTECT stack, even though no one should actually be looking at > > it any more. Nathan Kurz: > Can you explain this case? I don't think I understand it. My comment about it not being safe was referring to actually calling the "free" function in the standard C library, not a "free equivalent". Except for small objects, R calls "free" when it concludes that it no longer needs an object, having allocated space for it earlier with "malloc". After "free" is called, R had better not do anything like try to mark it in the garbage collection phase... Keeping a free list apart from the one maintained by malloc/free would I think have to be how it is done, hence my comment about ending up rewriting malloc/free. But it may not be too hard to restrain oneself and only do the simplest things. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] iterated lapply
From: Daniel Kaschek > ... When I evaluate this list of functions by > another lapply/sapply, I get an unexpected result: all values coincide. > However, when I uncomment the print(), it works as expected. Is this a > bug or a feature? > > conditions <- 1:4 > test <- lapply(conditions, function(mycondition){ > #print(mycondition) > myfn <- function(i) mycondition*i > return(myfn) > }) > > sapply(test, function(myfn) myfn(2)) From: Jeroen Ooms > I think it is a bug. If we use substitute to inspect the promise, it > appears the index number is always equal to its last value: From: Duncan Temple Lang > Not a bug, but does surprise people. It is lazy evaluation. I think it is indeed a bug. The lapply code saves a bit of time by reusing the same storage for the scalar index number every iteration. This amounts to modifying the R code that was used for the previous function call. There's no justification for doing this in the documentation for lapply. It is certainly not desired behaviour, except in so far as it allows a slight savings in time (which is minor, given the time that the function call itself will take). Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] iterated lapply
I think the discussion of this issue has gotten more complicated than necessary. First, there really is a bug. You can see this also by the fact that delayed warning messages are wrong. For instance, in R-3.1.2: > lapply(c(-1,2,-1),sqrt) [[1]] [1] NaN [[2]] [1] 1.414214 [[3]] [1] NaN Warning messages: 1: In FUN(c(-1, 2, -1)[[3L]], ...) : NaNs produced 2: In FUN(c(-1, 2, -1)[[3L]], ...) : NaNs produced The first warning message should have "1L" rather than "3L". It doesn't because lapply made a destructive change to the R expression that was evaluated for the first element. Throughout the R interpreter, there is a general assumption that expressions that are or were evaluated are immutable, which lapply is not abiding by. The only question is whether the bugs from this are sufficiently obscure that it's worth keeping them for the gain in speed, but the speed cost of fixing it is fairly small (though it's not negligible when the function applied is something simple like sqrt). The fix in the C code for lapply, vapply, and eapply is easy: Rather than create an R expression such as FUN(X[[1L]]) for the first function call, and then modify it in place to FUN(X[[2L]]), and so forth, just create a new expression for each iteration. This requires allocating a few new CONS cells each iteration, which does have a cost, but not a huge one. It's certainly easier and faster than creating a new environment (and also less likely to cause incompatibilities). The R code for apply can be changed to use the same approach, rather than using expressions such as FUN(X[i,]), where i is an index variable, it can create expressions like FUN(X[1L,]), then FUN(X[2L,]), etc. The method for this is simple, like so: > a <- quote(FUN(X[i,])) # "i" could be anything > b <- a; b[[c(2,3)]] <- 1L # change "i" to 1L (creates new expr) This has the added advantage of making error messages refer to the actual index, not to "i", which has no meaning if you haven't looked at the source code for apply (and which doesn't tell you which element led to the error even if you know what "i" does). I've implemented this in the development version of pqR, on the development branch 31-apply-fix, at https://github.com/radfordneal/pqR/tree/31-apply-fix The changes are in src/main/apply.R, src/main/envir.R, and src/library/base/R/apply.R, plus a new test in tests/apply.R. You can compare to branch 31 to see what's changed. (Note rapply seems to not have had a problem, and that other apply functions just use these, so should be fixed as well.) There are also other optimizations in pqR for these functions but the code is still quite similar to R-3.1.2. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] iterated lapply
> There are other instances, such as Reduce where there is a bug > report pending that amounts to the same issue. Performing surgery on > expressions and calling eval is not good practice at the R level and > probably not a good idea at the C level either. It is worth thinking > this through carefully before a adopting a solution, which is what we > will be doing. Surgery on expressions is what lapply does at the moment. My change makes it no longer do that. There is a general problem that lazy evaluation can have the effect of making the internal details of how an R function like "apply" is implemented leak into its semantics. That's what's going on with the Reduce bug (16093) too. I think one can avoid this by defining the following function for calling a function with evaluation of arguments forced (ie, lazy evaluation disabled): call_forced <- function (f, ...) { list (...); f (...) } (Of course, for speed one could make this a primitive function, which wouldn't actually build a list.) Then the critical code in Reduce could be changed from for (i in rev(ind)) init <- f(x[[i]], init) to for (i in rev(ind)) init <- call_forced (f, x[[i]], init) If one had a primitive (ie, fast) call_forced, a similar technique might be better than the one I presented for fixing "apply" (cleaner, and perhaps slightly faster). I don't see how it helps for functions like lapply that are written in C, however (where help isn't needed, since there's nothing wrong with the mod in my previous message). Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Why is the diag function so slow (for extraction)?
> From: Martin Maechler > diag() should not work only for pure matrices, but for all > "matrix-like" objects for which ``the usual methods'' work, such > as >as.vector(.), c(.) > > That's why there has been the c(.) in there. > > You can always make code faster if you write the code so it only > has to work in one special case .. and work there very fast. help(diag) gives no hint whatever that diag(x) will work for objects that are "matrix-like", but aren't actual matrices. help(c) explicitly says that methods for it are NOT required to convert matrices to vectors. So you're advocating slowing down all ordinary uses of diag to accommodate a usage that nobody thought was important enough to actually document. Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Performance issues with R2.15.1 built from source on Solaris
Is this a SPARC system? On at least some SPARC systems, the "long double" type in C is implemented very slowly in software, and it seems that it is used for the sums done when calculating standard deviations with "sd". Radford Neal > Date: Wed, 8 Aug 2012 18:55:37 -0500 > From: "Eberle, Anthony" > To: > Subject: [Rd] Performance issues with R2.15.1 built from source on > Solaris? > I have a question about building R (2.15.1) from source on a Solaris 10 > 64bit virtual server with 2 cores and 16GB memory that is running on an > Oracle T4 server. Based on several tests I have done this configuration > has been several orders of magnitude slower than any other configuration > I've tested. > > A simple test of some code to calculate the standard deviation 1 > times (simple code to consume resources) takes on average 121.498 > seconds on the Solaris server where the next worst system (Redhat Linux) > takes 1.567 seconds: __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Bugs due to naive copying of list elements
Several bugs are present in R-2.15.3 and R-alpha due to naive copying of list elements. The bug below is due to naive copying in subset.c (with similar bugs for matrices and arrays): a<-list(c(1,2),c(3,4),c(5,6)) b<-a[2:3] a[[2]][2]<-9 print(b[[1]][2]) Naive copying in mapply.c leads to the following bug: X<-1+1 f<-function(a,b) X A<-mapply(f,c(1,2,3),c(4,5,6),SIMPLIFY=FALSE) print(A) X[1]<-99 print(A) Similar bugs exist in eapply, rapply, and vapply. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] call R function from C code
> From: "Matwey V. Kornilov" > > the following seems to work just great for me: > > PROTECT(sx = eval(lang3(install("solve"),sA,sb),R_BaseEnv)) You need to PROTECT the result of lang3 before calling eval. And on the other hand, you don't necessarily have to protect the result of eval (only if you will be doing further allocations while still using sx). Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel