Hi: I would do something like the following:
(1) Create a vector of the file names.
(2) Use lapply() to read the files into a list.
(3) Use the reshape or reshape2 package to melt the individual data frames into 'long' form.
(4) rbind together the resulting data frames.
(5) Use a summarization function to generate the means and standard deviations.

I created three data frames that have the structure you provided below and wrote them out to csv files. The following code creates a vector of file names, then uses lapply() to read the data files consecutively and assign them to components of a list. Next, I create a small utility function that uses the reshape2 package to melt the data into 'long' form. The ldply() function from package plyr is then called to apply the function to each data frame and bind the results together into a single data frame. Finally, the ddply() function in plyr is used to get the mean and standard deviation for each time/substance combination.

#### Code to create test files for the example
# File creation for test files: writes <name>.csv to the working directory
ds_create <- function(name) {
    times <- paste('Time', 1:10, sep = '')
    cnames <- paste('Substance', 1:5, sep = '')
    m <- matrix(rpois(50, 7), nrow = 10)
    colnames(m) <- cnames
    m <- as.data.frame(m)
    m$Time <- times
    write.csv(m, file = paste(name, '.csv', sep = ''), quote = FALSE, row.names = FALSE)
}
nms <- paste('m', 1:3, sep = '')
sapply(nms, ds_create)
####

# Vector of file names
files <- paste('m', 1:3, '.csv', sep = '')

# Read the data frames into a list, where each data frame is a separate component
filelst <- lapply(files, read.csv, header = TRUE)

library(plyr)
library(reshape2)

# Function to melt a generic data frame.
# This was first written with the underscore spellings (variable_name, value_name);
# current reshape2 documents the arguments as variable.name and value.name.
f <- function(df) {
    melt(df, id.vars = 'Time', variable.name = 'Substance', value.name = 'y')
}

# Apply the function to each component of the list and rbind the results together
bigdf <- ldply(filelst, f)

# Obtain the mean and sd for each Time/Substance combination
bigsumm <- ddply(bigdf, .(Time, Substance), summarise, mean = mean(y), sd = sd(y))

# ----
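As a cross-check on the melt/ddply route, the same per-cell mean and SD can probably be had in base R by stacking the tables into a three-dimensional array and collapsing over the third dimension. This is only a sketch under some assumptions: the data are simulated in memory rather than read from disk, every table has identical dimensions and ordering, and all object names (make_table, tables, arr, cell_mean, cell_sd) are made up for the illustration. With real files whose first column holds the time labels, something like lapply(files, read.csv, row.names = 1) would presumably take the place of the simulated list.

```r
# Base-R sketch: per-cell mean and SD across several equally shaped tables.
# All names below are illustrative and not taken from the thread.

set.seed(1)  # make the simulated counts reproducible

# Build one simulated 10 x 5 table of counts with Time rows and Substance columns
make_table <- function() {
  m <- matrix(rpois(50, 7), nrow = 10,
              dimnames = list(paste('Time', 1:10, sep = ''),
                              paste('Substance', 1:5, sep = '')))
  as.data.frame(m)
}

# Stand-in for the list of data frames read from the csv files
tables <- replicate(3, make_table(), simplify = FALSE)

# Stack the tables into a rows x columns x files array
first <- as.matrix(tables[[1]])
arr <- array(unlist(lapply(tables, as.matrix)),
             dim = c(dim(first), length(tables)),
             dimnames = c(dimnames(first), list(NULL)))

# Collapse over the file dimension: one value per (Time, Substance) cell
cell_mean <- apply(arr, c(1, 2), mean)
cell_sd   <- apply(arr, c(1, 2), sd)

print(round(cell_mean, 2))
print(round(cell_sd, 2))
```

The array version keeps the results in the same wide layout as the input tables, which may be easier to eyeball; the long-format result from ddply() is more convenient if you want to plot or model afterwards.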
Caveat: If you have the reshape package loaded, then at present the argument in f() that names the value column will not go through, and the name of the last variable will be 'value'. In that event, you can either rename 'value' to 'y' with

    names(bigdf)[3] <- 'y'

or change 'y' to 'value' in the ddply() call. Check head(bigdf) to verify that the names are 'Time', 'Substance' and 'y' before running the last command.
# ----

The result I get is

> dim(bigsumm)
[1] 50  4
> head(bigsumm)
    Time  Substance      mean       sd
1  Time1 Substance1 10.333333 2.516611
2  Time1 Substance2 10.666667 1.154701
3  Time1 Substance3  6.000000 2.645751
4  Time1 Substance4  6.333333 1.154701
5  Time1 Substance5  5.333333 1.527525
6 Time10 Substance1  4.666667 3.055050

The structure is what matters. You should be able to extend this template to your 100 data frames.

HTH,
Dennis

On Sun, May 1, 2011 at 8:48 AM, Nemergut, Edward *HS <e...@hscmail.mcc.virginia.edu> wrote:
> I have 100+ .csv files which have the basic format:
>
>> test
>         X Substance1 Substance2 Substance3 Substance4 Substance5
> 1   Time1         10          0          0          0          0
> 2   Time2          9          5          0          0          0
> 3   Time3          8         10          1          0          0
> 4   Time4          7         20          2          1          0
> 5   Time5          6         25          3          2          1
> 6   Time6          5         30          4          2          2
> 7   Time7          4         25          5          3          3
> 8   Time8          3         20          6          3          4
> 9   Time9          2         15          5          3          5
> 10 Time10          1         10          4          4          6
>
> Each table is of exactly the same dimensions. After reading each of the
> 100+ .csv files into R, I want to determine the mean and SD of each and every
> cell. That is, I want to calculate the mean and SD for (Time1,Substance1)
> and every other cell from each of the 100+ .csv files.
>
> I imagine this is a fairly basic question, but my search has been
> unsuccessful.
>
> Thanks in advance,
> ECN
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.