Hi Francisco,

Thanks for your solution. It runs much faster than my for loop. Here is a comparison of system.time():

system.time(splitVals <- by(serv, dates, aggregateDf))
   user  system elapsed
  1.129   0.218   1.348

system.time(... my long for loop ...)
   user  system elapsed
276.987   1.544 278.698

I also tried David's solution with aggregate(), but I can't get it to work: because the data is very big, I have to add as.numeric() inside the sum().

I will now try to understand how the by() function works and what it does.

Thanks again for helping me!

Regards,
Benjamin

On Thu, Mar 10, 2011 at 04:26:57PM +0000, Francisco Gochez wrote:
> Benjamin,
>
> A more elegant "R-style" solution would be to use one of R's "apply"/
> aggregation routines, of which there are many. For example, the "by"
> function can split a data.frame by some factor/categorical variable(s),
> and then apply a function to each "slice". The result can then be pieced
> back together. See below for an example in which this factor is simply a
> parallel vector of pure dates:
>
> # extract pure date component of time and date
> dates <- format(serv$datum, "%Y-%m-%d")
>
> # write auxiliary function to aggregate a "slice" of the data.frame
> # x will be a "slice" of data from a single day
> aggregateDf <- function(x)
> {
>   # return a one-row data.frame
>   data.frame(datum = format(x$datum[1], "%Y-%m-%d"),
>              write = sum(x$write),
>              read = sum(x$read))
> }
>
> # now process each "slice" of the serv data.frame using "by"
> splitVals <- by(serv, dates, aggregateDf)
>
> # bind back into a single data.frame
> values <- do.call(rbind, splitVals)
>
>
> The difference in execution speed is pretty negligible on my machine, so
> it's a more concise solution but I don't know if it is much faster.
>
> HTH,
>
> Francisco
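
For reference, here is a minimal sketch of how as.numeric() might be combined with aggregate(), assuming the problem mentioned above is integer overflow in sum(). The serv data below is a made-up stand-in (only the column names datum, write and read come from this thread), and sumNumeric is a hypothetical helper name, not part of David's solution.

```r
# Hypothetical stand-in for the serv data.frame; the values are invented
# so that the integer sum of "write" on the first day would overflow.
serv <- data.frame(
  datum = as.POSIXct(c("2011-03-09 10:00:00", "2011-03-09 18:00:00",
                       "2011-03-10 09:00:00"), tz = "UTC"),
  write = c(2000000000L, 2000000000L, 5L),
  read  = c(1L, 2L, 3L)
)

# Pure date component, as in Francisco's example
dates <- format(serv$datum, "%Y-%m-%d")

# Hypothetical helper: convert to double before summing, so that large
# integer totals do not exceed the integer range
sumNumeric <- function(v) sum(as.numeric(v))

# One row per day, with the write and read columns summed
values <- aggregate(serv[c("write", "read")],
                    by = list(datum = dates),
                    FUN = sumNumeric)
print(values)
```

Passing the helper as FUN keeps the aggregate() call itself unchanged; the same helper could also replace sum() inside aggregateDf if by() is used instead.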