Will this do?

library(plyr)
ddply(my_df, .(a), summarize, mm = mean(dat), number = length(dat))

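Note that the call above groups by a only. If one row per categorical column is what's wanted, a minimal sketch in base R might look like the following. It assumes the columns a, b and c are 0/1 indicators and that only the rows where a column equals 1 are of interest (my reading of the desired table, not something stated outright); the names group_cols and res are just for illustration.

```r
# Example data from the original post
my_df <- data.frame(a = c(1, 1, 1, 0, 0, 0),
                    b = c(0, 0, 0, 1, 1, 1),
                    c = c(1, 0, 1, 0, 1, 0),
                    dat = c(10, 11, 12, 13, 14, 15))

# Categorical columns to summarize one at a time (illustrative name)
group_cols <- c("a", "b", "c")

# For each column, keep the rows where it equals 1 (an assumption about
# which level is wanted) and summarize dat, ignoring the other columns.
res <- do.call(rbind, lapply(group_cols, function(col) {
  x <- my_df$dat[my_df[[col]] == 1]
  data.frame(column = col, mean = mean(x), n = length(x))
}))
res
```

On the example data this gives means of 11, 14 and 12 for a, b and c, each with n = 3, which matches the table asked for. Looping over column names keeps it to one command and extends to more columns by adding to group_cols.
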
John Kane
Kingston ON Canada

> -----Original Message-----
> From: ashen...@ufl.edu
> Sent: Wed, 20 Mar 2013 14:57:36 -0500
> To: r-help@r-project.org
> Subject: [R] summarize dataframe based on multiple cols, not their
> combinations
>
> Hi folks,
>
> I'm trying to figure out how to get summarized data based on multiple
> columns. However, instead of giving summaries for every combination of
> categorical columns, I want it for each value of each categorical column
> regardless of the other columns. I could do this with three different
> commands, but I'm wondering if there's a more elegant way that I'm
> missing. Thanks!
>
> allie
>
> > my_df = data.frame(a = c(1,1,1,0,0,0), b=c(0,0,0,1,1,1),
>                      c=c(1,0,1,0,1,0), dat=c(10,11,12,13,14,15))
>
> > my_df
>   a b c dat
> 1 1 0 1  10
> 2 1 0 0  11
> 3 1 0 1  12
> 4 0 1 0  13
> 5 0 1 1  14
> 6 0 1 0  15
>
> > # not what I want
> > ddply(my_df, .(a,b,c), function(x) c("mean"=mean(x$dat), "n"=nrow(x)))
>   a b c mean n
> 1 0 1 0   14 2
> 2 0 1 1   14 1
> 3 1 0 0   11 1
> 4 1 0 1   11 2
>
> What I want:
>   a b c mean n
> 1 1 * *   11 3
> 2 * 1 *   14 3
> 3 * * 1   12 3
>
> where "*" refers to any value of the other columns.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.