I have a collection of datasets in separate data frames, each with 3 independent test parameters (w, x, y) and one dependent variable (z), together with some additional static test data on each row. What I want is a single data frame which contains the test data, the parameters (w, x, y) and, in the Z column, the mean of z across all runs for each (w, x, y) combination.
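To make the target concrete, here is a minimal sketch of the desired result on invented toy data. The two toy runs, their Z values, and the assumption of exactly one row per (W, X, Y) combination per run are illustrative only and not taken from the real test files. It stacks the runs with rbind() and averages Z per combination with aggregate(); whether this is fast enough on the real data has not been measured here.

```r
# Toy stand-ins for the real runs (invented values): every run shares the
# same (W, X, Y) grid and differs only in Z.
grid <- expand.grid(W = 1:2, X = 1:3, Y = 1:2)
run1 <- cbind(grid, Z = seq_len(nrow(grid)))
run2 <- cbind(grid, Z = seq_len(nrow(grid)) + 2)
testRuns <- list(run1, run2)

# Stack all runs into one data frame, then take the mean of Z for each
# (W, X, Y) combination. Assumes one row per combination per run, so the
# mean over stacked rows equals the mean of the per-run values.
allRuns <- do.call(rbind, testRuns)
testMeans <- aggregate(Z ~ W + X + Y, data = allRuns, FUN = mean)

print(head(testMeans))
```

The static columns are left out of this sketch; one possible way to carry them along would be to merge() testMeans back onto the first run by W, X and Y.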
Each dataset has around 6000 rows and around 7 columns, which doesn't seem outrageously large, so this shouldn't be too time-consuming, but the way I've been approaching it takes far too long (20 seconds for datasets over 4 runs, longer for my datasets over 10 runs). My imperative-coding brain led me to use for loops, which seem to be particularly problematic for R performance.

My first attempt at this looked like the following, which takes roughly 60 seconds to complete. I rewrote it a little, but the rewrite effectively just replaces one of the for loops with an lapply(). I could paste that version too, but it's much longer and less clear about its intent.

#######################
# Start code snippet
#######################

### inputFiles is just a list of paths to the test runs
testRuns <- lapply(inputFiles, function(f) read.table(f, header = TRUE))

### W, X, Y have (small) natural values
w <- unique(testRuns[[1]]$W)
x <- unique(testRuns[[1]]$X)
y <- unique(testRuns[[1]]$Y)

### All runs have the same values for all columns
### with the exception of the Z values, so just
### copy the first test run data
testMeans <- data.frame(testRuns[[1]])

for (w0 in w) {
  for (y0 in y) {
    for (x0 in x) {
      ### Rows of the result that belong to this (W, Y, X) combination
      row <- which(testMeans$W == w0 & testMeans$Y == y0 & testMeans$X == x0)
      ### Mean Z for this combination within each run
      meanValues <- sapply(testRuns, function(r) {
        mean(subset(r, r$W == w0 & r$Y == y0 & r$X == x0)$Z)
      })
      ### Mean across runs goes into the Z column
      testMeans$Z[row] <- mean(meanValues)
    }
  }
}

### I will then want to plot certain values over (X, Z),
### so ultimately, I'm going to subset the data further.
### Code which gives me a list of W tables with mean Z values
### works, too.

#######################
# End code snippet
#######################

Thanks,
mike

-- 
Michael R. Head <[EMAIL PROTECTED]>
http://www.cs.binghamton.edu/~mike/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.