Excellent point! Thanks. -- Bert
On Wed, Sep 19, 2012 at 12:00 PM, Berend Hasselman <b...@xs4all.nl> wrote:
>
> On 19-09-2012, at 20:02, Bert Gunter wrote:
>
>> Well, following up on this observation, which can be put under the
>> heading of "Sometimes vectorization can be much slower than explicit
>> loops", I offer the following:
>>
>> firsti <- function(x,k)
>> {
>>   i <- 1
>>   while(x[i]<=k){i <- i+1}
>>   i
>> }
>>
>> > system.time(for(i in 1:100)which(x>.99)[1])
>>    user  system elapsed
>>    19.1     2.4    22.2
>>
>> > system.time(for(i in 1:100)which.max(x>.99))
>>    user  system elapsed
>>   30.45    6.75   37.46
>>
>> > system.time(for(i in 1:100)firsti(x,.99))
>>    user  system elapsed
>>    0.03    0.00    0.03
>>
>> ## About a 500- to 1000-fold speedup!
>>
>> > firsti(x,.99)
>> [1] 122
>>
>> It doesn't seem to scale too badly, either (whatever THAT means!):
>> (Of course, the which() versions are essentially identical in timing,
>> and so are omitted)
>>
>> > system.time(for(i in 1:100)firsti(x,.9999))
>>    user  system elapsed
>>    2.70    0.00    2.72
>>
>> > firsti(x,.9999)
>> [1] 18200
>>
>> Of course, at some point, the explicit looping is awful -- with k =
>> .999999, the index was about 360000, and the timing test took 54
>> seconds.
>>
>> So I guess the point is -- as always -- that the optimal approach
>> depends on the nature of the data. Prudence and robustness clearly
>> demand the vectorized which() approaches if you have no information.
>> But if you do know something about the data, then you can often write
>> much faster tailored solutions. Which is hardly revelatory, of course.
>
> And compiling the firsti function can also be quite lucrative!
>
> firsti <- function(x,k)
> {
>   i <- 1
>   while(x[i]<=k){i <- i+1}
>   i
> }
>
> library(compiler)
> firsti.c <- cmpfun(firsti)
>
> > firsti(x,.99)
> [1] 93
> > firsti.c(x,.99)
> [1] 93
>
> > system.time(for(i in 1:100)firsti(x,.99))
>    user  system elapsed
>   0.014   0.000   0.013
> > system.time(for(i in 1:100)firsti.c(x,.99))
>    user  system elapsed
>   0.002   0.000   0.002
>
> > system.time(for(i in 1:100)firsti(x,.9999))
>    user  system elapsed
>   2.148   0.013   2.164
> > system.time(for(i in 1:100)firsti.c(x,.9999))
>    user  system elapsed
>   0.393   0.002   0.396
>
> And in a new run (without the above tests) with k=.999999 the index was
> 1089653 and the timing for the uncompiled function was 152 seconds and the
> timing for the compiled function was 28.8 seconds!
>
> Berend
>

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
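
For anyone wanting to rerun the comparison quoted above, here is a minimal sketch. The thread never shows how x was built, so the runif() vector, its length, and the seed are assumptions. The name firsti_safe and its bounds check are also additions for illustration: the quoted firsti() stops with an error when no element exceeds k, because x[i] becomes NA once i runs past the end of x. Timings depend on the machine and the R version, and since R 3.4.0 byte-compiles functions by default, the gain from cmpfun() may be much smaller than the 2012 figures suggest.

```r
library(compiler)

set.seed(1)                 # assumed seed, for repeatability only
x <- runif(1e6)             # assumed data: the thread does not define x

# Explicit loop with a bounds check; returns NA if nothing exceeds k.
firsti_safe <- function(x, k) {
  i <- 1L
  n <- length(x)
  while (i <= n && x[i] <= k) i <- i + 1L
  if (i > n) NA_integer_ else i
}

firsti_safe_c <- cmpfun(firsti_safe)   # explicitly byte-compiled copy

k <- 0.99
# Both loop versions should agree with the vectorized answer.
stopifnot(identical(firsti_safe(x, k), which(x > k)[1]),
          identical(firsti_safe_c(x, k), which(x > k)[1]))

# Compare the vectorized search with the two loop versions.
system.time(for (j in 1:100) which(x > k)[1])
system.time(for (j in 1:100) firsti_safe(x, k))
system.time(for (j in 1:100) firsti_safe_c(x, k))
```

The loop wins only when the first hit comes early; as k approaches 1 the index grows and the vectorized which() becomes the safer choice, as noted in the quoted discussion.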