There are many ways to do this; the main decision is how you want the results organized. Here are two ways that use only core R packages. Many people like the plyr package for this split-data/analyze-parts/combine-results sort of task.

> df <- data.frame(x=1:27, response=log2(1:27),
+                  g1=rep(letters[1:2], length.out=27),
+                  g2=rep(LETTERS[24:26], c(10,10,7)))
> s <- split(seq_len(nrow(df)), df[c("g1","g2")])
> mean(subset(df, df$g1=="a" & df$g2=="Z")$response)
[1] 4.578656
> vapply(s, function(si) mean(df$response[si]), FUN.VALUE=0)  # a.Z part is previous result
     a.X      b.X      a.Y      b.Y      a.Z      b.Z
1.976834 2.381378 3.880430 3.976834 4.578656 4.581611
> coef(lm(response~x, data=subset(df, df$g1=="a" & df$g2=="Z")))  # regression example
(Intercept)           x
 3.12905040  0.06040022
> vapply(s, function(si) coef(lm(response ~ x, data=df[si,])), FUN.VALUE=rep(0,2))
                  a.X       b.X        a.Y        b.Y        a.Z        b.Z
(Intercept) 0.0862735 0.6882213 2.40741927 2.50763309 3.12905040 3.13556268
x           0.3781121 0.2821928 0.09820075 0.09182506 0.06040022 0.06025202

For the particular case of fitting a separate regression to each part of a
partition of the data you can use lm() once, which gives the same numbers
organized in a different way:

> coef(lm(response ~ x * (g1:g2) - x - 1, data=df))
   g1a:g2X    g1b:g2X    g1a:g2Y    g1b:g2Y    g1a:g2Z    g1b:g2Z
0.08627350 0.68822126 2.40741927 2.50763309 3.12905040 3.13556268
 x:g1a:g2X  x:g1b:g2X  x:g1a:g2Y  x:g1b:g2Y  x:g1a:g2Z  x:g1b:g2Z
0.37811212 0.28219281 0.09820075 0.09182506 0.06040022 0.06025202

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Jan 15, 2015 at 11:42 AM, Reid Bryant <reidbry...@gmail.com> wrote:

> Hi R experts!
>
> I would like to have a scripted solution that will iteratively subset data
> across many variables per factor level of each variable.
>
> To illustrate, if I create a dataframe (df) by:
>
> variation <- c("A","B","C","D")
> element1 <- as.factor(c(0,1,0,1))
> element2 <- as.factor(c(0,0,1,1))
> response <- c(4,2,6,2)
> df <- data.frame(variation,element1,element2,response)
>
> I would like a function that would allow me to subset the data into four
> groups and perform analysis across the groups. One group for each of the
> two factor levels across two variables. In this example it's fairly easy
> because I only have two variables with two levels each, but I would
> like this to be extendable across situations where I am dealing with more
> than 2 variables and/or more than two factor levels per variable. I am
> looking for a result that will mimic the output of the following:
>
> element1_level0 <- subset(df,df$element1=="0")
> element1_level1 <- subset(df,df$element1=="1")
> element2_level0 <- subset(df,df$element2=="0")
> element2_level1 <- subset(df,df$element2=="1")
>
> The purpose would be to perform analysis on the df across each subset.
> Simplistically this could be represented as follows:
>
> mean(element1_level0$response)
> mean(element1_level1$response)
> mean(element2_level0$response)
> mean(element2_level1$response)
>
> Thanks,
> Reid
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
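
The quoted question asks for one subset per level of each variable taken separately (element1 at 0 and 1, then element2 at 0 and 1), rather than one subset per combination of levels. A minimal sketch of that variant, using the same split()/vapply() idea with core R only, might look like the following. The names vars, idx and marginal_means are illustrative choices, not anything from the original thread, and mean() stands in for whatever analysis is wanted.

```r
# Data frame from the quoted question.
df <- data.frame(variation = c("A", "B", "C", "D"),
                 element1  = as.factor(c(0, 1, 0, 1)),
                 element2  = as.factor(c(0, 0, 1, 1)),
                 response  = c(4, 2, 6, 2))

# Illustrative name: the grouping columns to loop over, one at a time.
vars <- c("element1", "element2")

# For each variable, split the row indices by that variable's levels.
# Result: a named list (one entry per variable) of lists of index vectors.
idx <- lapply(setNames(vars, vars),
              function(v) split(seq_len(nrow(df)), df[[v]]))

# Apply the analysis (here, the mean of response) to every subset.
marginal_means <- lapply(idx, function(s)
  vapply(s, function(si) mean(df$response[si]), FUN.VALUE = 0))

unlist(marginal_means)
# element1.0 element1.1 element2.0 element2.1
#          5          2          3          4
```

Adding more column names to vars, or more levels to any factor, should need no other change, since split() creates one group per level it finds.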