> Hi Tom,
>
> > Now, try sorting and using a loop:
> >
> >> idx <- order(i)
> >> xs <- x[idx]
> >> is <- i[idx]
> >> res <- array(NA, 1e6)
> >> idx <- which(diff(is) > 0)
> >> startidx <- c(1, idx+1)
> >> endidx <- c(idx, length(xs))
> >> f1 <- function(x, startidx, endidx, FUN = sum) {
> > +   for (j in 1:length(res)) {
> > +     res[j] <- FUN(x[startidx[j]:endidx[j]])
> > +   }
> > +   res
> > + }
> >> unix.time(res1 <- f1(xs, startidx, endidx))
> > [1] 6.86 0.00 7.04 NA NA
>
> I wonder how much time the sorting, reordering and creation of
> startidx and endidx would add to this time?

Done interactively, sorting and indexing seemed fast. Here are some
timings:

> unix.time({idx <- order(i)
+ xs <- x[idx]
+ is <- i[idx]
+ res <- array(NA, 1e6)
+ idx <- which(diff(is) > 0)
+ startidx <- c(1, idx+1)
+ endidx <- c(idx, length(xs))
+ })
[1] 1.06 0.00 1.09 NA NA

> That looks interesting. Does it only work for specific operating
> systems and processors? I will give it a try.

No, as far as I know, it works on all operating systems. Also, it gets
a little faster if you directly put the sum in the function:

> f4 <- function(x, startidx, endidx) {
+   for (j in 1:length(res)) {
+     res[j] <- sum(x[startidx[j]:endidx[j]])
+   }
+   res
+ }
> f5 <- cmpfun(f4)
> unix.time(res5 <- f5(xs, startidx, endidx))
[1] 2.67 0.03 2.95 NA NA

- Tom

--
View this message in context: http://www.nabble.com/Any-interest-in-%22merge%22-and-%22by%22-implementations-specifically-for-sorted-data--tf2009595.html#a5578580
Sent from the R devel forum at Nabble.com.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
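
For anyone who wants to try the sort-then-loop approach from this message without the original data, here is a minimal self-contained sketch. The data (`n`, `i`, `x`) and the function names `group_sum` and `group_sum_c` are made up for illustration and are not from the thread. Two things differ from the transcript above: the result vector is allocated inside the function rather than read from the global `res`, and `system.time()` is used in place of `unix.time()`, which later R releases deprecated. The `tapply()` call is only there as a cross-check on the result.

```r
library(compiler)  # provides cmpfun(); ships with base R

set.seed(1)
n <- 1e5                                  # assumed size, smaller than in the thread
i <- sample.int(1e4, n, replace = TRUE)   # hypothetical grouping index
x <- runif(n)                             # hypothetical values

# Sort once, then locate the start and end of each run of equal keys.
idx      <- order(i)
xs       <- x[idx]
is       <- i[idx]
brk      <- which(diff(is) > 0)
startidx <- c(1, brk + 1)
endidx   <- c(brk, length(xs))

# Loop over the runs; the result vector is allocated inside the function
# instead of being picked up from the global environment.
group_sum <- function(x, startidx, endidx) {
  res <- numeric(length(startidx))
  for (j in seq_along(startidx)) {
    res[j] <- sum(x[startidx[j]:endidx[j]])
  }
  res
}
group_sum_c <- cmpfun(group_sum)          # byte-compiled version

res_loop <- group_sum_c(xs, startidx, endidx)
res_ref  <- as.vector(tapply(x, i, sum))  # cross-check using base R

print(all.equal(res_loop, res_ref))
print(system.time(group_sum_c(xs, startidx, endidx)))
```

Timings will of course vary by machine. Since R 3.4.0 byte-compiles functions by default, the difference between `group_sum` and `group_sum_c` is likely to be small on a current installation.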