Re: [R] means, SD's and tapply

Dennis Murphy Fri, 25 Feb 2011 13:06:18 -0800

Hi:

On Fri, Feb 25, 2011 at 12:09 PM, Christopher R. Dolanc <crdol...@ucdavis.edu> wrote:


> I'm trying to use tapply to output means and SD or SE for my data but
> seem to be limited by how many times I can subset it.  Here's a snippet
> of my data
>
>  > stems353[1:10,]
>      Time DataSource   Plot Elevation Aspect Slope     Type Species SizeClass Stems
> 1  Modern    Cameron 70F221      1730    ESE    20  Conifer    ABCO    Class1     3
> 2  Modern    Cameron 70F221      1730    ESE    20  Conifer    ABMA    Class1     0
> 3  Modern    Cameron 70F221      1730    ESE    20 Hardwood    ACMA    Class1     0
> 4  Modern    Cameron 70F221      1730    ESE    20 Hardwood    AECA    Class1     0
> 5  Modern    Cameron 70F221      1730    ESE    20 Hardwood    ARME    Class1     0
> 6  Modern    Cameron 70F221      1730    ESE    20  Conifer    CADE    Class1    15
> 7  Modern    Cameron 70F221      1730    ESE    20 Hardwood    CELE    Class1     0
> 8  Modern    Cameron 70F221      1730    ESE    20 Hardwood    CONU    Class1     0
> 9  Modern    Cameron 70F221      1730    ESE    20  Conifer    JUCA    Class1     0
> 10 Modern    Cameron 70F221      1730    ESE    20  Conifer    JUOC    Class1     0
>
> I'd like to see means/SD of "Stems" stratified by "Species", "Time" and
> "SizeClass".  I can get R to give me this for means by species:
>
>  > tapply(stems353$Stems, stems353$Species, mean)
>         ABCO         ABMA         ACMA         AECA         ARME         CADE         CELE
> 0.7305240793 0.8569405099 0.0003541076 0.0010623229 0.0017705382 0.4684844193 0.0063739377
>         CONU         JUCA         JUOC         LIDE         PIAL         PICO         PIJE
> 0.0017705382 0.0003541076 0.0959631728 0.0138101983 0.3905807365 1.5651558074 0.2315864023
>         PILA         PIMO        PIMO2         PIPO         PISA         POTR         PSME
> 0.1774079320 0.1880311615 0.0311614731 0.6735127479 0.0237252125 0.0506373938 0.2000708215
>         QUCH         QUDO         QUDU         QUKE         QULO         QUWI        Salix
> 0.0474504249 0.1203966006 0.0000000000 0.2071529745 0.0003541076 0.0548866856 0.0003541076
>         SEGI         TSME
> 0.0021246459 0.5017705382
>  >
>

There are several approaches here, including the aggregate() function in
base R, the doBy package or the plyr package, among others:

# Requires R 2.11.0 or above:
aggregate(Stems ~ Species + Time + SizeClass, data = stems353, FUN = mean)
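
To see what the formula interface returns, here is a minimal sketch on made-up data. The data frame toy and its values are hypothetical; only the column names follow stems353. The result should have one row per Species/Time/SizeClass combination that occurs in the data.

```r
# Hypothetical toy data; only the column names follow stems353
toy <- data.frame(
  Species   = rep(c("ABCO", "CADE"), each = 4),
  Time      = rep(c("Modern", "Modern", "Historic", "Historic"), times = 2),
  SizeClass = "Class1",
  Stems     = c(3, 5, 1, 3, 15, 5, 0, 2)
)

# One mean per combination of the three grouping variables
res <- aggregate(Stems ~ Species + Time + SizeClass, data = toy, FUN = mean)
print(res)

# Example lookup: ABCO / Modern averages the values 3 and 5
res$Stems[res$Species == "ABCO" & res$Time == "Modern"]
```

Swapping FUN = mean for FUN = sd gives the standard deviations in the same layout.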

# To get more than one output per group, one can use either of the above
# packages:

library(plyr)
ddply(stems353, .(Species, Time, SizeClass), summarise,
      avgStems = mean(Stems), sdStems = sd(Stems))

library(doBy)
f <- function(x) c(mean = mean(x), sd = sd(x))
summaryBy(Stems ~ Species + Time + SizeClass, data = stems353, FUN = f)
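
If neither package is installed, base aggregate() can also take a function that returns several values. This is only a sketch on hypothetical data (toy and its values are made up), and it uses fewer grouping columns to keep the output short. The statistics come back as a matrix column, which do.call(data.frame, ...) spreads into ordinary columns.

```r
# Hypothetical toy data; only the column names follow stems353
toy <- data.frame(
  Species = rep(c("ABCO", "CADE"), each = 4),
  Time    = rep(c("Modern", "Modern", "Historic", "Historic"), times = 2),
  Stems   = c(3, 5, 1, 3, 15, 5, 0, 2)
)

# Same helper as in the doBy example: two statistics per group
f <- function(x) c(mean = mean(x), sd = sd(x))

# res2$Stems is a two-column matrix with columns "mean" and "sd"
res2 <- aggregate(Stems ~ Species + Time, data = toy, FUN = f)

# Flatten the matrix column into Stems.mean and Stems.sd
flat <- do.call(data.frame, res2)
print(flat)
```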

# Another possibility is package data.table:
library(data.table)
# Key and grouping columns are given as character vectors
dt <- data.table(stems353, key = c('Species', 'Time', 'SizeClass'))
dt[, list(avgStems = mean(Stems), sdStems = sd(Stems)),
   by = c('Species', 'Time', 'SizeClass')]

All of this is untested, so caveat emptor. Other possibilities include
package sqldf, if you are comfortable with SQL syntax, package remix or
package Hmisc. In other words, R has a number of efficient ways to summarize
data.
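
On the two errors quoted below: tapply() is defined as tapply(X, INDEX, FUN, ...), so in the call with several factors the SizeClass vector lands in the FUN slot. The grouping factors need to go into a single list. The mean() warning suggests stems353ABCO1 is a whole data frame rather than a numeric column, though that is an assumption on my part. A minimal sketch on hypothetical data (toy, abco1 and all values are made up):

```r
# Hypothetical toy data; only the column names follow stems353
toy <- data.frame(
  Species   = rep(c("ABCO", "CADE"), each = 4),
  Time      = rep(c("Modern", "Modern", "Historic", "Historic"), times = 2),
  SizeClass = "Class1",
  Stems     = c(3, 5, 1, 3, 15, 5, 0, 2)
)

# Several grouping factors go into ONE list passed as INDEX;
# the result is a 3-d array indexed by Species, SizeClass and Time
m <- with(toy, tapply(Stems, list(Species, SizeClass, Time), mean))
s <- with(toy, tapply(Stems, list(Species, SizeClass, Time), sd))
m["ABCO", "Class1", "Modern"]

# One label per group: paste the factors into a single index
labelled <- with(toy, tapply(Stems,
                             paste(Species, SizeClass, Time, sep = "."),
                             mean))
print(labelled)

# mean() wants a numeric vector, so pull the column out of the subset
abco1 <- subset(toy, Species == "ABCO" & SizeClass == "Class1")
mean(abco1$Stems)
```

The pasted index gives names such as ABCO.Class1.Modern, which is close to the labelling asked for in the question.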

HTH,
Dennis

>
> but I really need to see each species by SizeClass and Time so that each
> value would be labeled something like "ABCOSizeClass1TimeModern".
> Adding 2 variables to the function doesn't seem to work
>
>  > tapply(stems353$Stems, stems353$Species, stems353$SizeClass,
> stems353$Time, mean)
> Error in match.fun(FUN) :
>   'stems353$SizeClass' is not a function, character or symbol
>
> I've already created proper subsets for each of these groups, e.g. one
> subset is called "stems353ABCO1" and I can run analyses on this.  But,
> trying to extract means straight from those subsets doesn't seem to work
>
>  > mean(stems353ABCO1)
> [1] NA
> Warning message:
> In mean.default(stems353ABCO1) :
>   argument is not numeric or logical: returning NA
>  >
>
> Thanks,
> Chris Dolanc
>
> --
> Christopher R. Dolanc
> PhD Candidate
> Ecology Graduate Group
> University of California, Davis
> Lab Phone: (530) 752-2644 (Barbour lab)
>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

