Actually the second aggregate and second rowsum don't need the na.rm = TRUE so we only need:

aggregate(!is.na(m[, -(1:2)]), m[1], sum)
rowsum(0+!is.na(m[, -(1:2)]), m[,1])

You might also want to look at summaryBy in the doBy package.

On Sun, Dec 7, 2008 at 7:43 AM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Try
>
> aggregate(m[, -(1:2)], m[1], sum, na.rm = TRUE)
> aggregate(!is.na(m[, -(1:2)]), m[1], sum, na.rm = TRUE)
>
> # or (this uses row names rather than a column for the group):
>
> rowsum(m[, -(1:2)], m[,1], na.rm = TRUE)
> rowsum(0+!is.na(m[, -(1:2)]), m[,1], na.rm = TRUE)
>
>
> On Sun, Dec 7, 2008 at 7:06 AM, Daren Tan <[EMAIL PROTECTED]> wrote:
>>
>> The aggregate function does "almost" all that I need to summarize a
>> datasets, except that I can't specify exclusion of NAs without a little bit
>> of hassle.
>>
>>> set.seed(143)
>>> m <- data.frame(A=sample(LETTERS[1:5], 20, T), B=sample(LETTERS[1:10], 20,
>>> T), C=sample(c(NA, 1:4), 20, T), D=sample(c(NA,1:4), 20, T))
>>> m
>>    A B  C  D
>> 1  E I  1 NA
>> 2  A C NA NA
>> 3  D I NA  3
>> 4  C I  2  4
>> 5  A C  3  2
>> 6  E J  1  2
>> 7  D J  2  2
>> 8  C G  4  1
>> 9  C D NA  3
>> 10 B G  3 NA
>> 11 C B  4  2
>> 12 A B NA NA
>> 13 E A NA  4
>> 14 B B  3  3
>> 15 E I  4  1
>> 16 E J  3  1
>> 17 B J  4  4
>> 18 B J  1  3
>> 19 D D  4  2
>> 20 B B  4  3
>>
>>> aggregate(m[,-c(1:2)], by=list(m[,1]), sum)
>>   Group.1  C  D
>> 1       A NA NA
>> 2       B 15 NA
>> 3       C NA 10
>> 4       D NA  7
>> 5       E NA NA
>>
>>> aggregate(m[,-c(1:2)], by=list(m[,1]), length)
>>   Group.1 C D
>> 1       A 3 3
>> 2       B 5 5
>> 3       C 4 4
>> 4       D 3 3
>> 5       E 5 5
>>
>> My own defined version of length and sum to exclude NA
>>
>>> mylength <- function(x) { sum(as.logical(x), na.rm=T) }
>>> mysum <- function(x) {sum(x, na.rm=T)}
>>
>>> aggregate(m[,-c(1:2)], by=list(m[,1]), mysum) <----------------- this
>>> computes correctly.
>>   Group.1  C  D
>> 1       A  3  2
>> 2       B 15 13
>> 3       C 10 10
>> 4       D  6  7
>> 5       E  9  8
>>
>>> aggregate(m[,-c(1:2)], by=list(m[,1]), mylength) <----------------- this
>>> computes correctly.
>>   Group.1 C D
>> 1       A 1 1
>> 2       B 5 4
>> 3       C 3 4
>> 4       D 2 3
>> 5       E 4 4
>>
>> There are other statistics I need to compute e.g. var, sd, and it is a
>> hassle to create customized versions to exclude NA. Any alternative
>> approaches ?
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
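
For the other statistics mentioned in the question (var, sd), the same pattern should carry over, because aggregate passes any arguments after FUN on to FUN itself. Here is a minimal sketch of that idea. The data frame is typed in by hand from the printed output in the original post rather than regenerated with set.seed, on the assumption that sample() may not reproduce the same draws in every R version; the result names sums, sds, vars and counts are just illustrative.

```r
# Data copied by hand from the printed data frame in the original post.
m <- data.frame(
  A = c("E","A","D","C","A","E","D","C","C","B",
        "C","A","E","B","E","E","B","B","D","B"),
  B = c("I","C","I","I","C","J","J","G","D","G",
        "B","B","A","B","I","J","J","J","D","B"),
  C = c(1, NA, NA, 2, 3, 1, 2, 4, NA, 3, 4, NA, NA, 3, 4, 3, 4, 1, 4, 4),
  D = c(NA, NA, 3, 4, 2, 2, 2, 1, 3, NA, 2, NA, 4, 3, 1, 1, 4, 3, 2, 3)
)

# Arguments after FUN (here na.rm = TRUE) are handed to FUN,
# so no customized wrapper functions are needed.
sums <- aggregate(m[, -(1:2)], m[1], sum, na.rm = TRUE)
sds  <- aggregate(m[, -(1:2)], m[1], sd,  na.rm = TRUE)
vars <- aggregate(m[, -(1:2)], m[1], var, na.rm = TRUE)

# Number of non-missing values per group: sum the logical matrix !is.na(...).
counts <- aggregate(!is.na(m[, -(1:2)]), m[1], sum)

print(sums)
print(counts)
print(sds)
```

Groups with a single non-missing value (group A here) give NA for sd and var, which is what sd and var return for one observation. Counting with !is.na also sidesteps a subtlety in mylength from the question: as.logical(0) is FALSE, so that version would not count zeros if the data contained any.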