try 'reshape':

> require(reshape)
> # add a column to accumulate on
> tmp$inc <- 1
> recast(tmp, f1 + f2 + f3 ~ ., sum)
Using f1, f2, f3 as id variables
       f1       f2    f3 (all)
1    Male    White  0-20     3
2    Male    White 21-40     4
3    Male    White 41-60     2
4    Male    White 61-80     3
5    Male    Black  0-20     3
6    Male    Black 21-40     4
7    Male    Black 41-60     2
8    Male    Black 61-80     3
9    Male Hispanic  0-20     4
10   Male Hispanic 21-40     4
11   Male Hispanic 41-60     4
12   Male Hispanic 61-80     3
13   Male    Other  0-20     3
14   Male    Other 21-40     2
15   Male    Other 41-60     2
16   Male    Other 61-80     4
17 Female    White  0-20     2
18 Female    White 21-40     4
19 Female    White 41-60     4
20 Female    White 61-80     3
21 Female    Black  0-20     5
22 Female    Black 21-40     3
23 Female    Black 41-60     4
24 Female    Black 61-80     1
25 Female Hispanic  0-20     1
26 Female Hispanic 21-40     2
27 Female Hispanic 41-60     4
28 Female Hispanic 61-80     3
29 Female    Other  0-20     4
30 Female    Other 21-40     2
31 Female    Other 41-60     3
32 Female    Other 61-80     5
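
For anyone who would rather stay in base R, here is a minimal sketch of one possible approach. It is not part of the reply above. as.data.frame() applied to a table() gives one row per combination of levels, with the level names as values, so keeping only the rows where Freq > 0 drops the zeros and flattens any number of dimensions. The function name ft is borrowed from the question quoted below; the toy vectors are invented purely for illustration.

```r
# Count every observed combination of an arbitrary number of factors.
# Returns a data frame with one row per non-empty combination, holding
# the level names (not the integer codes) plus a Freq column.
ft <- function(...) {
  counts <- as.data.frame(table(...), stringsAsFactors = FALSE)
  counts <- counts[counts$Freq > 0, , drop = FALSE]  # skip the zeros
  rownames(counts) <- NULL
  counts
}

# Toy data, made up for this example (not the poster's data)
horse   <- factor(c("Bedevil", "Bedevil", "Prairie Wolf",
                    "Prairie Wolf", "Prairie Wolf"))
date    <- factor(c("20080514", "20080514", "20080404",
                    "20080514", "20080514"))
surface <- factor(c("TURF", "TURF", "TURF", "DIRT", "DIRT"))

two   <- ft(horse, date)           # two factors
three <- ft(horse, date, surface)  # three factors, same function
print(two)
print(three)
```

The result is a data frame rather than a character matrix, so the counts stay numeric and can be sorted or summed directly.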

On Fri, Oct 2, 2009 at 2:15 PM, Andrew Spence <aspe...@rvc.ac.uk> wrote:
> Dear R-help,
>
> First of all, thank you VERY much for any help you have time to offer. I
> greatly appreciate it.
>
> I would like to write a function that, given an arbitrary number of factors
> from a data frame, tabulates the number of occurrences of each unique
> combination of the factors. Clearly, this works:
>
> > table(horse,date,surface)
>
> <SNIP>
>
> , , surface = TURF
>
>                    date
> horse              20080404 20080514 20081015 20081025 20081120 20081203 20090319
>   Bedevil                 0        0        0        0        0        0        0
>   Cut To The Point      227        0        0        0        0        0        0
>
> <SNIP>
>
> But I would prefer output that skips all the zeros, flattens any dimensions
> greater than 2, and gives the level names rather than codes. I can write
> code specifically for n factors like this (here, 2 factors):
>
> ft <- function(x,y) {cbind(
>   levels(x)[unique(cbind(x,y))[,1]], levels(y)[unique(cbind(x,y))[,2]],
>   table(x,y)[unique(cbind(x,y))])}
>
> which gives the lovely output I'm looking for:
>
> #      [,1]               [,2]       [,3]
> # [1,] "Cut To The Point" "20080404" "227"
> # [2,] "Prairie Wolf"     "20080404" "364"
> # [3,] "Bedevil"          "20080514" "319"
> # [4,] "Prairie Wolf"     "20080514" "330"
>
> But my attempts to make this into a function that handles arbitrary numbers
> of factors as separate input arguments have failed. The closest I can get is:
>
> ft2 <- function (...)
> { cbind( unique(cbind(...)),
>   table(...)[unique(cbind(...))] ) }
>
> giving:
>
> > ft2(horse,date)
>       horse date
>  [1,]     2    1 227
>  [2,]     9    1 364
>  [3,]     1    2 319
>  [4,]     9    2 330
>  [5,]     9    3 291
>  [6,]    12    3 249
>  [7,]    10    3 286
>  [8,]     5    4 217
>  [9,]     3    4 426
> [10,]     8    4 468
> [11,]     9    5 319
> [12,]    13    5 328
> [13,]    12    5 138
> [14,]     7    6 375
> [15,]    11    6 366
> [16,]     4    7 255
> [17,]     6    7 517
>
> I would be greatly in debt to anyone willing to show me how to make the
> above function take arbitrary inputs and still produce output displaying
> factor level names instead of the underlying coded numbers.
>
> Cheers and thanks for your time!
>
> Andrew Spence
> RCUK Academic Research Fellow
> Structure and Motion Laboratory
> Royal Veterinary College
> Hawkshead Lane
> North Mymms, Hatfield
> Hertfordshire AL9 7TA
> +44 (0) 1707 666988
>
> mailto:aspe...@rvc.ac.uk
>
> http://www.rvc.ac.uk/sml/People/andrewspence.cfm
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?