This is a nice demonstration of the formula interface to aggregate. A less elegant alternative is to pass lists as arguments.

with(dd, aggregate(Correct,
                   by = list(Subject = Subject, Group = Group),
                   FUN = function(x) sum(x == 'C')))

Using a list is advantageous if you want to summarise more than one
variable (which does not seem to be the case here). The formula
interface can take cbind() on its left-hand side for that, but cbind()
converts factors such as Correct to integer codes, so the comparison
with 'C' would no longer work. With lists, it would be set up like this:

with(dd, aggregate(x = list(Correct = Correct),  # add other target variables to this list
                   by = list(Subject = Subject, Group = Group),
                   FUN = function(x) sum(x == 'C')))

Cheers

Andrew

On Sat, Apr 30, 2011 at 10:03:24PM -0700, Dennis Murphy wrote:
> Hi:
>
> If you have R 2.11.x or later, one can use the formula version of aggregate():
>
> aggregate(Correct ~ Subject + Group, data = ALLDATA,
>           FUN = function(x) sum(x == 'C'))
>
> A variety of contributed packages (plyr, data.table, doBy, sqldf and
> remix, among others) have similar capabilities.
>
> If you want some additional summaries (e.g., percent correct), here is
> an example function for a single subject/group that aggregate() can
> use to propagate to all subgroups and subjects (I encourage you to
> play with it):
>
> f <- function(x) {
>     Correct <- sum(x == 'C')
>     Percent <- round(100 * Correct/length(x), 3)
>     c(Number = Correct, Percent = Percent)
> }
> aggregate(Correct ~ Subject + Group, data = ALLDATA, FUN = f)
>
> The particular function isn't as important as knowing you can do this
> sort of thing. Several of the contributed packages indicated above
> have similar, if not superior, capabilities, depending on the
> situation.
>
> Toy example to test the above:
>
> dd <- data.frame(Subject = rep(1:5, each = 100),
>                  Group = rep(rep(c('C', 'T'), each = 50), 5),
>                  Correct = factor(rbinom(500, 1, 0.8), labels = c('I', 'C')))
> aggregate(Correct ~ Subject + Group, data = dd,
>           FUN = function(x) sum(x == 'C'))
>    Subject Group Correct
> 1        1     C      40
> 2        2     C      36
> 3        3     C      39
> 4        4     C      37
> 5        5     C      41
> 6        1     T      43
> 7        2     T      45
> 8        3     T      37
> 9        4     T      45
> 10       5     T      36
> aggregate(Correct ~ Subject + Group, data = dd, FUN = f)
>    Subject Group Correct.Number Correct.Percent
> 1        1     C             40              80
> 2        2     C             36              72
> 3        3     C             39              78
> 4        4     C             37              74
> 5        5     C             41              82
> 6        1     T             43              86
> 7        2     T             45              90
> 8        3     T             37              74
> 9        4     T             45              90
> 10       5     T             36              72
>
> HTH,
> Dennis
>
> On Sat, Apr 30, 2011 at 12:28 PM, Kevin Burnham <kburn...@gmail.com> wrote:
> > HI All,
> >
> > I have a long data file generated from a minimal pair test that I gave to
> > learners of Arabic before and after a phonetic training regime. For each of
> > thirty some subjects there are 800 rows of data, from each of 400 items at
> > pre and posttest. For each item the subject got correct, there is a 'C' in
> > the column 'Correct'. The line:
> >
> > tapply(ALLDATA$Correct, ALLDATA$Subject, function(x)sum(x=="C"))
> >
> > gives me the sum of correct answers for each subject.
> >
> > However, I would like to have that sum separated by Time (pre or post). Is
> > there a simple way to do that?
> >
> > What if I further wish to separate by Group (T or C)?
> >
> > Thanks,
> > Kevin
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson
Program Manager, ACERA
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/
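
P.S. For anyone who wants to try the list interface with more than one
target variable, here is a minimal, self-contained sketch. It uses a tiny
hand-made data frame rather than the original ALLDATA, and the column
'Second' is a hypothetical second response invented purely for
illustration; the helper name count_c is also mine, not from the thread.

```r
# Small hand-made data set so the counts can be checked by eye.
# 'Second' is a hypothetical extra response column (illustration only).
dd <- data.frame(
  Subject = rep(1:2, each = 4),
  Group   = rep(c("C", "T"), times = 4),
  Correct = factor(c("C", "I", "C", "C", "I", "I", "C", "C"),
                   levels = c("I", "C")),
  Second  = factor(c("C", "C", "I", "C", "I", "C", "I", "I"),
                   levels = c("I", "C"))
)

# Count the 'C' entries in a vector.
count_c <- function(x) sum(x == "C")

# List interface: each element of 'x' is summarised by FUN within every
# Subject-by-Group cell, giving one result column per element.
res <- with(dd, aggregate(x   = list(Correct = Correct, Second = Second),
                          by  = list(Subject = Subject, Group = Group),
                          FUN = count_c))
print(res)
```

The result should have one row per Subject/Group combination and one
count column per target variable. If you would rather use the formula
method with cbind() on the left-hand side, note that cbind() turns
factors into their integer codes, so the comparison with 'C' would
probably have to be done before aggregating.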