Dear List,

I'm an [R] novice starting analysis of an ecological dataset containing the basal areas of different tree species in a number of research plots. Example data follow:

> Trees <- data.frame(SppID=as.factor(c(rep('QUEELL',2), rep('QUEALB',3), 'CORAME', 'ACENEG', 'TILAME')), BA=c(907.9, 1104.4, 113.0, 143.1, 452.3, 638.7, 791.7, 804.3), PlotID=as.factor(c('BU3F10', rep('BU3F11',2), rep('BU3F12',5))))
> Trees
   SppID     BA PlotID
1 QUEELL  907.9 BU3F10
2 QUEELL 1104.4 BU3F11
3 QUEALB  113.0 BU3F11
4 QUEALB  143.1 BU3F12
5 QUEALB  452.3 BU3F12
6 CORAME  638.7 BU3F12
7 ACENEG  791.7 BU3F12
8 TILAME  804.3 BU3F12

Fields are (in order): Tree Species Code, Basal Area, and Plot Code.

I've been successful in computing summary statistics by species or plot groups using tapply():

> with(Trees, tapply(BA, PlotID, sum))
BU3F10 BU3F11 BU3F12
 907.9 1217.4 2830.1

*My Question*

I'd like to perform a similar operation that tells me how many species are in each plot. I thought this would be possible using something like:

> with(Trees, tapply(SppID, PlotID, nlevels))
BU3F10 BU3F11 BU3F12
     5      5      5

However, this outputs the total number of levels of the factor SppID rather than the number of species in each plot, which would look like:

BU3F10 BU3F11 BU3F12
     1      2      4

I understand, from reading the archive, that this occurs because R does not drop unused factor levels when subsetting, but I'm wondering if there's a simple way around this.

Thanks for your help,

Ian Chidister
Environment and Resources
The Nelson Institute for Environmental Studies
University of Wisconsin-Madison, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
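
One possible way around the unused-levels behaviour described in the question is sketched below. It is only a sketch, not necessarily the recommended approach: it assumes that "number of species in a plot" means the number of distinct SppID values recorded for that plot. The data are the example data from the post; the names spp_per_plot and spp_per_plot2 are made up for illustration.

```r
# Example data, as given in the post.
Trees <- data.frame(
  SppID  = as.factor(c(rep('QUEELL', 2), rep('QUEALB', 3),
                       'CORAME', 'ACENEG', 'TILAME')),
  BA     = c(907.9, 1104.4, 113.0, 143.1, 452.3, 638.7, 791.7, 804.3),
  PlotID = as.factor(c('BU3F10', rep('BU3F11', 2), rep('BU3F12', 5)))
)

# Option 1: count the distinct values actually present in each plot.
# unique() looks at the values, not at the factor's full set of levels.
spp_per_plot <- with(Trees,
                     tapply(SppID, PlotID, function(x) length(unique(x))))

# Option 2: re-create the factor within each plot so that unused levels
# are dropped, then count the remaining levels.
spp_per_plot2 <- with(Trees,
                      tapply(SppID, PlotID, function(x) nlevels(factor(x))))

print(spp_per_plot)
# BU3F10 BU3F11 BU3F12
#      1      2      4
```

Both options rely only on base R. The first makes the intent ("how many different species?") a little more explicit, while the second stays closer to the nlevels() attempt in the question.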