On 19-09-2012, at 20:02, Bert Gunter wrote: > Well, following up on this observation, which can be put under the > heading of "Sometimes vectorization can be much slower than explicit > loops" , I offer the following: > > firsti <-function(x,k) > { > i <- 1 > while(x[i]<=k){i <- i+1} > i > } > >> system.time(for(i in 1:100)which(x>.99)[1]) > user system elapsed > 19.1 2.4 22.2 > >> system.time(for(i in 1:100)which.max(x>.99)) > user system elapsed > 30.45 6.75 37.46 > >> system.time(for(i in 1:100)firsti(x,.99)) > user system elapsed > 0.03 0.00 0.03 > > ## About a 500 - 1000 fold speedup ! > >> firsti(x,.99) > [1] 122 > > It doesn't seem to scale too badly, either (whatever THAT means!): > (Of course, the which() versions are essentially identical in timing, > and so are omitted) > >> system.time(for(i in 1:100)firsti(x,.9999)) > user system elapsed > 2.70 0.00 2.72 > >> firsti(x,.9999) > [1] 18200 > > Of course, at some point, the explicit looping is awful -- with k = > .999999, the index was about 360000, and the timing test took 54 > seconds. > > So I guess the point is -- as always -- that the optimal approach > depends on the nature of the data. Prudence and robustness clearly > demands the vectorized which() approaches if you have no information. > But if you do know something about the data, then you can often write > much faster tailored solutions. Which is hardly revelatory, of course.
And compiling the firsti function can also be quite lucrative! firsti <- function(x,k) { i <- 1 while(x[i]<=k){i <- i+1} i } library(compiler) firsti.c <- cmpfun(firsti) > firsti(x,.99) [1] 93 > firsti.c(x,.99) [1] 93 > system.time(for(i in 1:100)firsti(x,.99)) user system elapsed 0.014 0.000 0.013 > system.time(for(i in 1:100)firsti.c(x,.99)) user system elapsed 0.002 0.000 0.002 > system.time(for(i in 1:100)firsti(x,.9999)) user system elapsed 2.148 0.013 2.164 > system.time(for(i in 1:100)firsti.c(x,.9999)) user system elapsed 0.393 0.002 0.396 And in a new run (without the above tests) with k=.999999 the index was 1089653 and the timing for the uncompiled function was 152 seconds and the timing for the compiled function was 28.8 seconds! Berend ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.