[Rd] small performance enhancement suggestion (PR#10409)
Full_Name: Simon de Bernard
Version:
OS:
Submission from: (NULL) (140.77.34.213)

In src/main/dotcode.c, in the function RObjToCPtr, the test for naok should be moved outside of the loops...

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] "by" speed improvement (PR#11064)
Full_Name: Simon de Bernard
Version: 2.7.0 (44733)
OS: MacOS
Submission from: (NULL) (140.77.34.213)

"by" usually takes forever, even on a "not so large" data structure. If a matrix can be used instead of a data.frame, defining by.matrix as a copy of by.data.frame, modified so that the data is converted back to a data.frame only in the "eval" call, improves speed by orders of magnitude (whatever that means ;-) )

I'd suggest defining the by.matrix function directly in R...
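A rough sketch of where the time plausibly goes, assuming the cost is dominated by row subsetting (that is an assumption, not something measured in the report). It applies colSums to the same row groups, taken once from a data.frame and once from a matrix. The names m, df, groups and sum_by_rows are made up for this illustration.

```r
set.seed(1)

# Same numbers stored as a matrix and as a data.frame.
m  <- cbind(a = rnorm(2000), b = rnorm(2000))
df <- as.data.frame(m)

# 200 groups of 10 row indices each (sizes chosen arbitrarily).
groups <- split(seq_len(nrow(m)), rep(1:200, each = 10))

# Apply colSums to each group of rows, as by() does through tapply().
sum_by_rows <- function(x) {
  lapply(groups, function(i) colSums(x[i, , drop = FALSE]))
}

t_df  <- system.time(res_df  <- sum_by_rows(df))
t_mat <- system.time(res_mat <- sum_by_rows(m))

# Elapsed seconds for each storage type; the values depend on the machine.
print(c(data.frame = t_df[["elapsed"]], matrix = t_mat[["elapsed"]]))

# Both variants should compute the same per-group sums.
same_result <- isTRUE(all.equal(res_df, res_mat))
```

Both calls do the same arithmetic, so any gap in the timings would come from the subsetting step rather than from colSums itself.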
[Rd] (PR#11064) how to reproduce...
You can try this:

    data <- cbind("a"=sample(1:10), "b"=sample(1:10))
    fact <- sample(rep(1:1, each=10))

    system.time(std <- by(data, fact, colSums))

    by.matrix <- function (data, INDICES, FUN, ...)
    {
        if (!is.list(INDICES)) {
            IND <- vector("list", 1)
            IND[[1]] <- INDICES
            names(IND) <- deparse(substitute(INDICES))[1]
        }
        else IND <- INDICES
        FUNx <- function(x) FUN(data[x, , drop = FALSE], ...)
        nd <- nrow(data)
        ans <- eval(substitute(tapply(1:nd, IND, FUNx)), as.data.frame(data))
        attr(ans, "call") <- match.call()
        class(ans) <- "by"
        ans
    }

    system.time(mod <- by(data, fact, colSums))

    all.equal(std, mod)

I get a 30x speed-up (I'm not sure why the attributes differ, but I'm sure this can be fixed...)
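One way to track down which attributes differ is to compare them one by one. The sketch below is only an illustration: diff_attrs is a hypothetical helper, not part of R, and the two "by" objects are built from a small data.frame with differently named indices instead of from the std and mod objects above.

```r
# Hypothetical helper: names of the attributes that are not identical in x and y.
diff_attrs <- function(x, y) {
  ax <- attributes(x)
  ay <- attributes(y)
  nms <- union(names(ax), names(ay))
  nms[!vapply(nms, function(n) identical(ax[[n]], ay[[n]]), logical(1))]
}

df   <- data.frame(a = 1:6, b = 6:1)
fact <- rep(1:2, each = 3)

r1 <- by(df, fact, colSums)            # index named after the expression "fact"
r2 <- by(df, list(g = fact), colSums)  # index explicitly named "g"

changed <- diff_attrs(r1, r2)
print(changed)

# Comparing without attributes checks the per-group values only.
same_values <- isTRUE(all.equal(unclass(r1), unclass(r2),
                                check.attributes = FALSE))
```

Calling diff_attrs(std, mod) on the objects from the example above should point at the attributes that all.equal complains about.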