Hi:

Henrique's solution is elegant, but if you want to summarize certain
features of each test (e.g., the value of the test statistic and its
p-value), here's a different approach using the reshape and plyr packages.

# Since site C in your data had only two observations per status group,
# I rebuilt the data frame with more data.

m <- matrix(rnorm(288), nrow = 36)
colnames(m) <- paste('V', 1:8, sep = '')
x <- data.frame(site = factor(rep(c('A', 'B', 'C'), each = 12)),
                status = factor(rep(rep(c('D','L'), each = 6), 3)),
                as.data.frame(m))

# This little trick stacks V1-V8 into a vector called value,
# with an accompanying factor called variable.

library(reshape)      # melt is a function in the reshape package
xm <- melt(x, id = c('site', 'status'))

# xm has four variables: site, status, variable and value.
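
To make the stacking step concrete, here is a rough base-R sketch of what melt does, run on a tiny made-up data frame (the values are invented purely for illustration). It is not how melt is implemented, only an equivalent result for this simple case.

```r
# Tiny illustrative data frame (hypothetical values)
d <- data.frame(site = c('A', 'B'), status = c('D', 'L'),
                V1 = c(1, 2), V2 = c(3, 4))

# stack() piles the measurement columns into one vector ('values')
# with a factor ('ind') recording the source column
s <- stack(d[, c('V1', 'V2')])

# Repeat the id columns once per stacked measurement column
long <- data.frame(site     = rep(d$site, times = 2),
                   status   = rep(d$status, times = 2),
                   variable = s$ind,
                   value    = s$values)
long
```

Each original row now appears once per measurement column, which is the shape the per-group t-tests below rely on.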

# We now write a function that does the t-test and outputs
# the value of the test statistic and the (two-sided) p-value.
# To modify the arguments of the t.test call, modify the function
# f accordingly. Ditto if you want to change the outputs.

library(plyr)   # ddply below is a function from this package

f <- function(df) {
    u <- t.test(value ~ status, data = df)
    list(tstat = u$statistic, pval = u$p.value)
}
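
As a sketch of the kind of modification mentioned above, here is a variant of f (called f2 here, a name made up for illustration) that requests the pooled-variance test with var.equal = TRUE instead of the default Welch test, and also returns the degrees of freedom and the confidence limits for the difference in means. The toy data frame is invented only to show the call.

```r
# Hypothetical variant of f: pooled-variance t-test with extra outputs
f2 <- function(df) {
    u <- t.test(value ~ status, data = df, var.equal = TRUE)
    list(tstat = unname(u$statistic),   # test statistic
         df    = unname(u$parameter),   # degrees of freedom
         pval  = u$p.value,             # two-sided p-value
         lower = u$conf.int[1],         # confidence limits for the
         upper = u$conf.int[2])         # difference in means
}

# Toy data with the same column names that melt produces
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
toy <- data.frame(status = factor(rep(c('D', 'L'), each = 6)),
                  value  = rnorm(12))
str(f2(toy))
```

f2 should be usable in the ddply call in the same way as f.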

# The function is applied to every site/variable combination.
# as.data.frame.function (from plyr) wraps f so that its output is
# returned as columns of a data frame, with names prefixed by 'value.'.

u <- ddply(xm, .(site, variable), as.data.frame.function(f))
u
   site variable  value.tstat value.pval
1     A       V1 -2.36244305 0.04019757
2     A       V2  0.35853212 0.73105571
3     A       V3 -0.29033960 0.77796762
4     A       V4 -0.39977559 0.69789482
5     A       V5  0.73992896 0.47737988
6     A       V6  2.41243447 0.03823083
7     A       V7  0.37406273 0.71792150
8     A       V8 -0.58363656 0.57388079
9     B       V1  2.03180350 0.06968520
10    B       V2 -0.63778310 0.53794510
11    B       V3  1.66999237 0.12881606
12    B       V4  0.89302839 0.39492211
13    B       V5 -1.42946866 0.18349366
14    B       V6 -0.52158791 0.61836960
15    B       V7  1.44180092 0.18123210
16    B       V8  0.50992197 0.62359868
17    C       V1  1.12246634 0.29033521
18    C       V2  1.06388885 0.31587500
19    C       V3  0.32000364 0.75599890
20    C       V4  0.95363381 0.36327043
21    C       V5 -1.19511893 0.26058768
22    C       V6  1.10885666 0.29526230
23    C       V7 -0.08869988 0.93128143
24    C       V8  2.85254620 0.01892610
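
If you would rather not load any packages, the same table can be built with base R alone. The following is only a sketch under the same simulated data layout as above; the helper name one_test and the seed are my own choices, and the numbers will differ from the output shown because the data are random.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
m <- matrix(rnorm(288), nrow = 36)
colnames(m) <- paste('V', 1:8, sep = '')
x <- data.frame(site = factor(rep(c('A', 'B', 'C'), each = 12)),
                status = factor(rep(rep(c('D', 'L'), each = 6), 3)),
                as.data.frame(m))

# One row per site/variable combination (variable varies fastest)
res <- expand.grid(variable = colnames(m), site = levels(x$site),
                   stringsAsFactors = FALSE)

# Hypothetical helper: D-versus-L t-test for one site and one column
one_test <- function(s, v) {
    d <- x[x$site == s, ]
    u <- t.test(d[[v]][d$status == 'D'], d[[v]][d$status == 'L'])
    c(tstat = unname(u$statistic), pval = u$p.value)
}

# mapply returns a 2 x 24 matrix with rows 'tstat' and 'pval'
tt <- mapply(one_test, res$site, res$variable, USE.NAMES = FALSE)
res$tstat <- tt['tstat', ]
res$pval  <- tt['pval', ]
res <- res[, c('site', 'variable', 'tstat', 'pval')]
res
```

This avoids an explicit for loop; mapply walks over the site/variable pairs directly.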

HTH,
Dennis


On Sat, Aug 21, 2010 at 7:15 AM, Alison Macalady <a...@kmhome.org> wrote:

> I have a data.frame with ~250 observations (rows) in each of ~50 categories
> (columns).  I would like to perform t.tests on subsets of observations
> within each column, with the subsets according to index vectors contained in
> other columns of the data.frame.
>
> My data.frame looks something like this:
>
> x<-data.frame(matrix(rnorm(200,mean=5,sd=.5),nrow=20))
> colnames(x)<-c("site", "status", "X1", "X2", "X3", "X4", "X5", "X6", "X7",
> "X8")
> x$site<-as.factor(rep(c("A", "A", "B", "B", "C"), 4))
> x$status<-as.factor(rep(c("D", "L"), 10))
>
> I want to do t.tests on the numeric observations within the data.frame by
> "site" and by "status":
>
> t.test(x[x$site == "A" & x$status =="D",]$X1, x[x$site == "A" & x$status
> =="L",]$X1)
> t.test(x[x$site == "B" & x$status =="D",]$X1, x[x$site == "B" & x$status
> =="L",]$X1)
> t.test(x[x$site == "C" & x$status =="D",]$X1, x[x$site == "C" & x$status
> =="L",]$X1)
>
> t.test(x[x$site == "A" & x$status =="D",]$X2, x[x$site == "A" & x$status
> =="L",]$X2)
> t.test(x[x$site == "B" & x$status =="D",]$X2, x[x$site == "B" & x$status
> =="L",]$X2)
> t.test(x[x$site == "C" & x$status =="D",]$X2, x[x$site == "C" & x$status
> =="L",]$X2)
>
> etc...
>
> I know I must be able to do this more efficiently using a loop and one of
> the apply functions, e.g. something like this:
>
> k=length(levels(x$site))
> for (i in 1:k)
> {
> site<-levels(x$site)[i]
> x1<-x[x$site == site, ]
> results[i]<-apply(x1, 2, function(x1) {t.test(x1[x1$status == "D",],
> x1[x1$status == "L",])})
> results
> }
>
> But I can't figure out how to do the apply function correctly...
>
> I also wonder whether there's a way to use an apply-type function and avoid
> the loop altogether.
>
> Thanks in advance!
>
> Ali
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
