Hi folks, I'm attempting to use the EMD package to analyze some neuroimaging data (time series with 64 channels sampled at 1 million time points for each of 20 people). I found that processing a single channel of data using EMD::emd() took about 8 hours. Profiling with Rprof() suggested that most of the compute time was spent in EMD::extrema(). Looking at the code for EMD::extrema(), I found one obvious speedup (replacing calls to rbind() with c()). I suspect things could be sped up further by pre-allocating all the objects that are currently being grown with c(), but I don't understand the code well enough to know where to try this or what sizes to use for the pre-allocated objects. Below I include code that demonstrates the speedup I achieved by eliminating calls to rbind(), and that also shows that only a few calls to c() seem to be responsible for most of the compute time. The files "extrema_c.R" and "extrema_c2.R" are available at: https://gist.github.com/822691
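To make concrete what I mean by pre-allocating, here is a minimal sketch on a toy problem. The two functions are hypothetical and are not taken from EMD::extrema(); they only collect the indices of interior points where the slope changes sign. The first grows its result with c() on every hit, which creates a new vector each time. The second allocates a vector of the largest length the result could possibly have, fills it in place, and trims it at the end. I'm assuming a safe upper bound on the size is known, which is exactly the part I can't work out for the real extrema() code.

```r
# Hypothetical helpers for illustration only (not part of the EMD package).

# Version 1: grow the result with c(); every append allocates a new vector.
find_extrema_grow <- function(x) {
  n <- length(x)
  if (n < 3) return(integer(0))
  idx <- integer(0)
  for (i in 2:(n - 1)) {
    # slope changes sign at i => local maximum or minimum
    if ((x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0) {
      idx <- c(idx, i)
    }
  }
  idx
}

# Version 2: pre-allocate to an upper bound, fill in place, trim at the end.
find_extrema_prealloc <- function(x) {
  n <- length(x)
  if (n < 3) return(integer(0))
  idx <- integer(n - 2)  # upper bound: every interior point is an extremum
  k <- 0L                # number of slots filled so far
  for (i in 2:(n - 1)) {
    if ((x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0) {
      k <- k + 1L
      idx[k] <- i
    }
  }
  idx[seq_len(k)]        # drop the unused tail
}

set.seed(1)
values <- rnorm(1e4)

# Both versions should return the same indices.
stopifnot(identical(find_extrema_grow(values), find_extrema_prealloc(values)))

# Compare timings; the numbers will depend on the machine and R version.
system.time(find_extrema_grow(values))
system.time(find_extrema_prealloc(values))
```

The trade-off is some wasted memory when the upper bound is loose, in exchange for avoiding repeated copying inside the loop.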
Any suggestions/help would be greatly appreciated.

#load the EMD library for the default version of extrema
library(EMD)

#some data to process
values = rnorm(1e4)

#profile the default version of extrema
Rprof(tmp <- tempfile())
temp = extrema(values)
Rprof()
summaryRprof(tmp)
#1.2s total, with most time spent doing rbind
unlink(tmp)

#load an rbind-free version of extrema
source('extrema_c.R')
Rprof(tmp <- tempfile())
temp = extrema_c(values)
Rprof()
summaryRprof(tmp)
#much faster! .5s total
unlink(tmp)

#still, it encounters slowdowns with lots of data
values = rnorm(1e5)
Rprof(tmp <- tempfile())
temp = extrema_c(values)
Rprof()
summaryRprof(tmp)
#44s total, hard to see what's taking up so much time
unlink(tmp)

#load an rbind-free version of extrema that labels each call to c()
source('extrema_c2.R')
Rprof(tmp <- tempfile())
temp = extrema_c2(values)
Rprof()
summaryRprof(tmp)
#same time as above, but now we see that it spends more time in certain calls to c() than others
unlink(tmp)