On Thu, Jul 15, 2010 at 10:45 PM, Murat Tasan <mmu...@gmail.com> wrote:
> hi all - i'm just wondering what sort of code people write to
> essentially perform an aggregate call, but with different functions
> being applied to the various columns.
>
> for example, if i have a data frame x and would like to marginalize by
> a factor f for the rows, but apply mean() to col1 and median() to
> col2.
>
> if i wanted to apply mean() to both columns, i would call:
>
> aggregate(x, list(f), mean)
>
> but to get the mean of col1 and the median of col2, i have to write
> separate tapply calls, then wrap back into a data frame:
>
> data.frame(tapply(x$col1, f, mean), tapply(x$col2, f, median))
>
> this is a somewhat inelegant solution for data frames with potentially
> many columns.
>
> what i would like is for aggregate to take a list of functions for
> columns, something like:
>
> aggregate(x, list(f), list(mean, median))
>
> i'm just curious how others get around this limitation in aggregate().
> do most simply make the individual tapply() calls separately, then
> possibly wrap them back up (as done in the example above), or is there
> a more elegant solution using some function of R that i might be
> unaware of?
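
For reference, the per-column tapply idea from the question can be wrapped
in a small base R helper. This is only a sketch: agg_by_column is a
hypothetical name, not a function in base R or any package, and it assumes
every summary function returns a single number per group. It uses the
built-in CO2 data set.

```r
# Hypothetical helper (not part of base R): apply a different summary
# function to each named column, within groups defined by `group`.
# Assumption: each function returns one numeric value per group.
agg_by_column <- function(data, group, funs) {
  # funs is a named list: names are column names, values are functions
  cols <- lapply(names(funs), function(nm) {
    vapply(split(data[[nm]], group), funs[[nm]], numeric(1))
  })
  names(cols) <- names(funs)
  # Group labels come from the names of the first summarised column
  data.frame(group = names(cols[[1]]), cols, row.names = NULL)
}

# Mean of conc and median of uptake, by Treatment
res <- agg_by_column(CO2, CO2$Treatment, list(conc = mean, uptake = median))
print(res)
```

The named list keeps each column next to its function, so adding another
column is one more list entry rather than another tapply() call.
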

Using sqldf we can write:

> library(sqldf)
> sqldf("select Treatment, avg(conc), median(uptake) from CO2 group by Treatment")
   Treatment avg(conc) median(uptake)
1    chilled       435           19.7
2 nonchilled       435           31.3

See http://sqldf.googlecode.com for more info.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.