Stavros Macrakis wrote:
> On Thu, Feb 12, 2009 at 4:28 AM, Gavin Simpson <gavin.simp...@ucl.ac.uk> wrote:
>
>> When I'm testing the speed of things like this (that are in and of
>> themselves very quick) for situations where it may matter, I wrap the
>> function call in a call to replicate():
>>
>>     system.time(replicate(1000, svd(Mean_svd_data)))
>>
>> to run it 1000 times, and that allows me to judge how quickly the
>> function executes.
>
> I do the same, but with a small twist:
>
>     system.time(replicate(1000, {svd(Mean_svd_data); 0} ))
>
> This allows the values of svd(...) to be garbage collected.
>
> If you don't do this and the output of the timed code is large, you
> may allocate large amounts of memory (which may influence your timing
> results) or run out of memory (which will also influence your timing
> results :-) ),
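
a minimal sketch of what the `; 0` twist changes. Mean_svd_data is not shown in the thread, so a small random matrix `m` stands in for it here, and 20 replications are used only to keep the run short; both are assumptions made for illustration.

```r
# illustration only: a hypothetical 50 x 50 matrix stands in for Mean_svd_data
set.seed(1)
m <- matrix(rnorm(2500), 50, 50)

# replicate() collects every result; each element here holds d, u and v
kept <- replicate(20, svd(m), simplify = FALSE)

# ending the block with a constant means only that constant is collected,
# so each decomposition can be garbage collected once its iteration is done
dropped <- replicate(20, { svd(m); 0 })

# compare how much memory the two collected results occupy
print(object.size(kept), units = "Kb")
print(object.size(dropped), units = "Kb")
```

the first result grows with both the number of replications and the size of the svd output, while the second grows only with the number of replications.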

to contribute my few cents, here's a simple benchmarking routine,
inspired by the perl module Benchmark. it allows one to benchmark an
arbitrary number of expressions with an arbitrary number of
replications, and provides a summary matrix with selected timings.

the code below is also available from google code [1], if anyone is
interested in updates (should there be any) or contributions.

    benchmark = function(
            ...,
            columns=c('test', 'replications', 'user.self', 'sys.self',
                      'elapsed', 'user.child', 'sys.child'),
            replicate=100,
            environment=parent.frame()) {
        arguments = match.call()[-1]
        parameters = names(arguments)
        if (is.null(parameters))
            parameters = as.character(arguments)
        else {
            indices = ! parameters %in% c('columns', 'replicate', 'environment')
            arguments = arguments[indices]
            parameters = parameters[indices]
        }
        result = cbind(
            test=rep(ifelse(parameters=='', as.character(arguments), parameters),
                     each=length(replicate)),
            as.data.frame(
                do.call(rbind,
                    lapply(arguments,
                        function(argument)
                            do.call(rbind,
                                lapply(replicate,
                                    function(count)
                                        c(replications=count,
                                          system.time(replicate(count, { eval(argument, environment); NULL })))))))))
        result[, columns, drop=FALSE]
    }

it's rudimentary and not fool-proof, but might be helpful if used with
care. (the nested do.call-rbind-lapply sequence can surely be
simplified, but i could not resist the pattern. someone once wrote that
if you need more than three (five?) levels of indentation in your code,
there must be something wrong with it; presumably, he was a fortran
programmer.)
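
for anyone wondering how the test labels come about, here is a small sketch of the argument-capturing step in isolation. `capture` is a hypothetical helper written only for this illustration and is not part of benchmark(); it assumes, as benchmark() does, that match.call() hands back the expressions unevaluated.

```r
# hypothetical helper, not part of benchmark(): returns the label that
# would be used for each expression passed in
capture = function(...) {
    # drop the function name, keep the unevaluated argument expressions
    arguments = match.call()[-1]
    parameters = names(arguments)
    # no names at all: treat every argument as unnamed
    if (is.null(parameters))
        parameters = rep('', length(arguments))
    # unnamed arguments are labelled by their deparsed expression,
    # named ones by the name given in the call
    ifelse(parameters == '', as.character(arguments), parameters)
}

# neither expression is evaluated here, so no large vectors are allocated
labels = capture(1:10^7, allocation=1:10^8)
print(labels)
```

since the expressions stay unevaluated until eval() is called on them, the timing in benchmark() covers only the evaluation inside system.time() and not the argument matching.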

examples:

    benchmark(1:10^7)
    #     test replications user.self sys.self elapsed user.child sys.child
    # 1 1:10^7          100     2.168        0   2.166          0         0

    benchmark(allocation=1:10^8, replicate=10)
    #         test replications user.self sys.self elapsed user.child sys.child
    # 1 allocation           10      0.98    3.073    4.05          0         0

    means.rep = function(n, m) replicate(n, mean(rnorm(m)))
    means.pat = function(n, m) colMeans(array(rnorm(n*m), c(m, n)))
    (result = benchmark(
        replicate=c(10, 100, 1000),
        rep=means.rep(100, 100),
        pat=means.pat(100, 100),
        columns=c('test', 'replications', 'elapsed')))
    #   test replications elapsed
    # 1  rep           10   0.037
    # 2  rep          100   0.387
    # 3  rep         1000   3.840
    # 4  pat           10   0.017
    # 5  pat          100   0.170
    # 6  pat         1000   1.731

    result$elapsed/result$replications
    # [1] 0.003700 0.003870 0.003840 0.001700 0.001700 0.001731

    with(result, t.test(elapsed/replications ~ test, paired=TRUE)) # silly, i know...

manual on demand.

vQ

[1] http://code.google.com/p/rbenchmark/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.