Hi:

On Fri, Feb 25, 2011 at 12:09 PM, Christopher R. Dolanc <crdol...@ucdavis.edu> wrote:
> I'm trying to use tapply to output means and SD or SE for my data but
> seem to be limited by how many times I can subset it. Here's a snippet
> of my data
>
> > stems353[1:10,]
>    Time   DataSource Plot   Elevation Aspect Slope Type     Species SizeClass Stems
> 1  Modern Cameron    70F221 1730      ESE    20    Conifer  ABCO    Class1    3
> 2  Modern Cameron    70F221 1730      ESE    20    Conifer  ABMA    Class1    0
> 3  Modern Cameron    70F221 1730      ESE    20    Hardwood ACMA    Class1    0
> 4  Modern Cameron    70F221 1730      ESE    20    Hardwood AECA    Class1    0
> 5  Modern Cameron    70F221 1730      ESE    20    Hardwood ARME    Class1    0
> 6  Modern Cameron    70F221 1730      ESE    20    Conifer  CADE    Class1    15
> 7  Modern Cameron    70F221 1730      ESE    20    Hardwood CELE    Class1    0
> 8  Modern Cameron    70F221 1730      ESE    20    Hardwood CONU    Class1    0
> 9  Modern Cameron    70F221 1730      ESE    20    Conifer  JUCA    Class1    0
> 10 Modern Cameron    70F221 1730      ESE    20    Conifer  JUOC    Class1    0
>
> I'd like to see means/SD of "Stems" stratified by "Species", "Time" and
> "SizeClass". I can get R to give me this for means by species:
>
> > tapply(stems353$Stems, stems353$Species, mean)
>         ABCO         ABMA         ACMA         AECA         ARME         CADE         CELE
> 0.7305240793 0.8569405099 0.0003541076 0.0010623229 0.0017705382 0.4684844193 0.0063739377
>         CONU         JUCA         JUOC         LIDE         PIAL         PICO         PIJE
> 0.0017705382 0.0003541076 0.0959631728 0.0138101983 0.3905807365 1.5651558074 0.2315864023
>         PILA         PIMO        PIMO2         PIPO         PISA         POTR         PSME
> 0.1774079320 0.1880311615 0.0311614731 0.6735127479 0.0237252125 0.0506373938 0.2000708215
>         QUCH         QUDO         QUDU         QUKE         QULO         QUWI        Salix
> 0.0474504249 0.1203966006 0.0000000000 0.2071529745 0.0003541076 0.0548866856 0.0003541076
>         SEGI         TSME
> 0.0021246459 0.5017705382
>

There are several approaches here, including the aggregate() function in
base R, the doBy package or the plyr package, among others:

# Requires R 2.11.0 or above:
aggregate(Stems ~ Species + Time + SizeClass, data = stems353, FUN = mean)

# To get more than one output per group, one can use either of the above packages:
library(plyr)
ddply(stems353, .(Species, Time, SizeClass), summarise,
      avgStems = mean(Stems), sdStems = sd(Stems))

library(doBy)
# Helper that returns both summaries for one group
f <- function(x) c(mean = mean(x), sd = sd(x))
summaryBy(Stems ~ Species + Time + SizeClass, data = stems353, FUN = f)

# Another possibility is package data.table:
library(data.table)
# Column names are given as a character vector to avoid stray spaces in the names
dt <- data.table(stems353, key = c("Species", "Time", "SizeClass"))
dt[, list(avgStems = mean(Stems), sdStems = sd(Stems)),
   by = c("Species", "Time", "SizeClass")]

All of this is untested, so caveat emptor. Other possibilities include
package sqldf, if you are comfortable with SQL syntax, package remix or
package Hmisc. In other words, R has a number of efficient ways to
summarize data.

HTH,
Dennis

>
> but I really need to see each species by SizeClass and Time so that each
> value would be labeled something like "ABCOSizeClass1TimeModern".
> Adding 2 variables to the function doesn't seem to work
>
> > tapply(stems353$Stems, stems353$Species, stems353$SizeClass,
>   stems353$Time, mean)
> Error in match.fun(FUN) :
>   'stems353$SizeClass' is not a function, character or symbol
>
> I've already created proper subsets for each of these groups, e.g. one
> subset is called "stems353ABCO1" and I can run analyses on this. But,
> trying to extract means straight from those subsets doesn't seem to work
>
> > mean(stems353ABCO1)
> [1] NA
> Warning message:
> In mean.default(stems353ABCO1) :
>   argument is not numeric or logical: returning NA
>
>
>
> Thanks,
> Chris Dolanc
>
> --
> Christopher R. Dolanc
> PhD Candidate
> Ecology Graduate Group
> University of California, Davis
> Lab Phone: (530) 752-2644 (Barbour lab)
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
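On the two errors quoted above: tapply() takes its grouping factors as a single list in the second argument, so passing them as separate arguments makes R treat the third one as FUN, which would explain the match.fun() error. The mean() warning suggests the object passed is not a numeric vector; if the subset holds whole rows, taking the mean of its Stems column should work. A minimal sketch on a toy data frame follows. The object names (stems, abco1) and the values are made up for illustration and are not the original stems353 data.

```r
# Toy data standing in for stems353 (hypothetical values, for illustration only)
stems <- data.frame(
  Species   = c("ABCO", "ABCO", "ABCO", "ABCO", "CADE", "CADE", "CADE", "CADE"),
  Time      = c("Modern", "Modern", "Historic", "Historic",
                "Modern", "Modern", "Historic", "Historic"),
  SizeClass = "Class1",
  Stems     = c(3, 5, 1, 3, 15, 9, 2, 4)
)

# INDEX is ONE list holding all grouping factors; FUN comes after it
groups <- list(stems$Species, stems$Time, stems$SizeClass)
means  <- tapply(stems$Stems, groups, mean)
sds    <- tapply(stems$Stems, groups, sd)

# The result is a 3-d array indexed by Species, Time and SizeClass
means["ABCO", "Modern", "Class1"]

# mean() wants a numeric vector, so pick the column out of the subset
abco1 <- subset(stems, Species == "ABCO" & SizeClass == "Class1")
mean(abco1$Stems)
```

The array form is handy for looking up single cells; the aggregate() or ddply() calls shown earlier return a long data frame instead, which is usually easier to export or plot.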