Hi,

One-liners in data.table are:

> x.dt[,lapply(.SD,mean),by=sample]
     sample replicate   height    weight      age
[1,]      A       2.0 12.20000 0.5033333 6.000000
[2,]      B       1.5 12.75000 0.7150000 4.500000
[3,]      C       2.5 11.35250 0.5125000 3.750000
[4,]      D       2.0 14.99333 0.6733333 5.333333

without the replicate column:

> x.dt[,lapply(list(height,weight,age),mean),by=sample]
     sample       V1        V2       V3
[1,]      A 12.20000 0.5033333 6.000000
[2,]      B 12.75000 0.7150000 4.500000
[3,]      C 11.35250 0.5125000 3.750000
[4,]      D 14.99333 0.6733333 5.333333

one (long) way to retain the column names:

> x.dt[,lapply(list(height=height,weight=weight,age=age),mean),by=sample]
     sample   height    weight      age
[1,]      A 12.20000 0.5033333 6.000000
[2,]      B 12.75000 0.7150000 4.500000
[3,]      C 11.35250 0.5125000 3.750000
[4,]      D 14.99333 0.6733333 5.333333

or this is shorter:

> ans = x.dt[,lapply(.SD,mean),by=sample]
> ans$replicate = NULL
> ans
     sample   height    weight      age
[1,]      A 12.20000 0.5033333 6.000000
[2,]      B 12.75000 0.7150000 4.500000
[3,]      C 11.35250 0.5125000 3.750000
[4,]      D 14.99333 0.6733333 5.333333

or another way:

> mycols = c("height","weight","age")
> x.dt[,lapply(.SD[,mycols,with=FALSE],mean),by=sample]
     sample   height    weight      age
[1,]      A 12.20000 0.5033333 6.000000
[2,]      B 12.75000 0.7150000 4.500000
[3,]      C 11.35250 0.5125000 3.750000
[4,]      D 14.99333 0.6733333 5.333333

or another way:

> x.dt[,lapply(.SD[,list(height,weight,age)],mean),by=sample]
     sample   height    weight      age
[1,]      A 12.20000 0.5033333 6.000000
[2,]      B 12.75000 0.7150000 4.500000
[3,]      C 11.35250 0.5125000 3.750000
[4,]      D 14.99333 0.6733333 5.333333

The way Jim showed:

> x.dt[, list(height = mean(height)
+           , weight = mean(weight)
+           , age = mean(age)
+           ), by = sample]

is the more flexible syntax for when you want different functions on
different columns, easily, and as a bonus is fast.

Matthew


"Dennis Murphy" <djmu...@gmail.com> wrote in message
news:AANLkTimxXL8BqTaYKUb=saee2cra9fosfuap4qzkx...@mail.gmail.com...
> Hi:
>
> Here are a few one-liners.
> Calling your data frame dd,
>
> aggregate(cbind(height, weight, age) ~ sample, data = dd, FUN = mean)
>   sample   height    weight      age
> 1      A 12.20000 0.5033333 6.000000
> 2      B 12.75000 0.7150000 4.500000
> 3      C 11.35250 0.5125000 3.750000
> 4      D 14.99333 0.6733333 5.333333
>
> With package doBy:
>
> library(doBy)
> summaryBy(height + weight + age ~ sample, data = dd, FUN = mean)
>   sample height.mean weight.mean age.mean
> 1      A    12.20000   0.5033333 6.000000
> 2      B    12.75000   0.7150000 4.500000
> 3      C    11.35250   0.5125000 3.750000
> 4      D    14.99333   0.6733333 5.333333
>
> With package plyr:
>
> library(plyr)
> ddply(dd, .(sample), colwise(mean, .(height, weight, age)))
>   sample   height    weight      age
> 1      A 12.20000 0.5033333 6.000000
> 2      B 12.75000 0.7150000 4.500000
> 3      C 11.35250 0.5125000 3.750000
> 4      D 14.99333 0.6733333 5.333333
>
> Dennis
>
> On Fri, Mar 11, 2011 at 1:32 AM, Aline Santos <aline...@gmail.com> wrote:
>
>> Hello R-helpers:
>>
>> I have data like this:
>>
>> sample  replicate  height  weight  age
>> A       1.00       12.0    0.64    6.00
>> A       2.00       12.2    0.38    6.00
>> A       3.00       12.4    0.49    6.00
>> B       1.00       12.7    0.65    4.00
>> B       2.00       12.8    0.78    5.00
>> C       1.00       11.9    0.45    6.00
>> C       2.00       11.84   0.44    2.00
>> C       3.00       11.43   0.32    3.00
>> C       4.00       10.24   0.84    4.00
>> D       1.00       14.2    0.54    2.00
>> D       2.00       15.67   0.67    7.00
>> D       3.00       15.11   0.81    7.00
>>
>> Now, how can I calculate the mean for each condition (height, weight,
>> age) in each sample, considering the samples have different numbers of
>> replicates?
>>
>> The final matrix should look like:
>>
>> sample  height  weight  age
>> A       12.20   0.50    6.00
>> B       12.75   0.72    4.50
>> C       11.35   0.51    3.75
>> D       14.99   0.67    5.33
>>
>> This is a simplified version of my dataset, which consists of 100 samples
>> (unequally distributed in 530 replicates) for 600 different conditions.
>>
>> I appreciate all the help.
>>
>> A.S.
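
For anyone who wants to run the data.table one-liners in this thread: the sketch below rebuilds the table from the data in the original question and selects the columns with the `.SDcols` argument. It assumes a reasonably recent data.table release (`.SDcols` may not have been available when this thread was written). The construction of `x.dt` is also an assumption, since the thread never shows how it was created.

```r
library(data.table)

# Data copied from the original question; how x.dt was actually
# built is not shown in the thread, so this construction is assumed.
x.dt <- data.table(
  sample    = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "D", "D", "D"),
  replicate = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3),
  height    = c(12.0, 12.2, 12.4, 12.7, 12.8, 11.9, 11.84, 11.43, 10.24,
                14.2, 15.67, 15.11),
  weight    = c(0.64, 0.38, 0.49, 0.65, 0.78, 0.45, 0.44, 0.32, 0.84,
                0.54, 0.67, 0.81),
  age       = c(6, 6, 6, 4, 5, 6, 2, 3, 4, 2, 7, 7)
)

# Columns to average; .SDcols restricts .SD to these columns,
# so replicate is left out and the column names are retained.
mycols <- c("height", "weight", "age")
ans <- x.dt[, lapply(.SD, mean), by = sample, .SDcols = mycols]
print(ans)
```

With 600 condition columns, as in the full dataset described in the question, `mycols` could be built programmatically, for example with `setdiff(names(x.dt), c("sample", "replicate"))`.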

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.