Hi:

Try this:

## Function that takes a data frame as input and outputs a data frame:
chrSumm <- function(d) {
    # d is a data frame
    colnames(d) <- c("chr","start","end","base1","base2",
                     "totalreads","methylation","strand")
    TR <- nrow(d)
    RG1 <- sum(d['totalreads'] >= 1)
    percent <- TR/RG1
    methylSumm <- summary(d$methylation)
    names(methylSumm) <- c('Min', 'Q1', 'Median', 'Mean', 'Q3', 'Max')
    data.frame(TR, RG1, percent, as.data.frame(as.list(methylSumm)))
}

# Read the data files into a list and apply the function to each file recursively,
# resulting in a data frame

# vector of file names
files <- c('chr1.out.txt', 'chr2.out.txt')
# use lapply() to read files into a list
filelist <- lapply(files, read.table, header = FALSE)

# Use the ldply() function from the plyr package to
# process the list and return a data frame
library('plyr')
ldply(filelist, chrSumm)

# Result from your example:
> ldply(filelist, chrSumm)
  TR RG1 percent  Min     Q1 Median    Mean     Q3  Max
1  4   4     1.0 0.04 0.0475   0.07 0.07500 0.0975 0.12
2  3   2     1.5 0.00 0.0150   0.03 0.03667 0.0550 0.08

HTH,
Dennis

On Tue, Aug 9, 2011 at 9:31 PM, a217 <aj...@case.edu> wrote:
> Hello,
>
> I have an R script that I use as a template to perform a task for multiple
> files (in this case, multiple chromosomes).
>
> What I would like to do is to utilize a simple loop to parse through each
> chromosome number so that I don't have to type the same code over and over
> again in the R console.
>
> I've tried using:
>
> for(i in 1:22){
> etc..
> }
>
> and replacing each chromosome number with [[i]], but that did not seem to
> work.
>
> Below is the script I have. Basically everywhere you see a '2' I would like
> there to be an 'i' so that the script can be applied in a general sense.
> ################################Code###############################
>
> chr2.data<-read.table(file="chr2.out.txt", header=F)
> colnames(chr2.data)<-c("chr","start","end","base1","base2","totalreads","methylation","strand")
> splc2<-split(chr2.data, paste(chr2.data$chr))
> chr2.df<-as.data.frame(t(sapply(splc2, function(x)
> list(TR=NROW(x[['totalreads']]), RG1=sum(x[['totalreads']]>=1),
> percent=(NROW(x[['totalreads']]>=1)/sum(x[['totalreads']]))))))
> chr2.df.summ<-as.data.frame(t(sapply(splc2, function(x)
> summary(x$methylation))))
> chr2.summ<-cbind(chr2.df,chr2.df.summ)
>
> ##################################################################
>
>
> Here are some sample input files in case you'd like to test the code:
> ##########
> # chr1.out.txt
> ##########
> chr1 100 159 104 104 1 0.05 +
> chr1 100 159 145 145 1 0.04 +
> chr1 200 260 205 205 1 0.12 +
> chr1 500 750 600 600 1 0.09 +
>
> ##########
> # chr2.out.txt
> ##########
> chr2 100 200 105 105 1 0.03 +
> chr2 100 200 110 110 1 0.08 +
> chr2 300 400 350 350 0 0 +
>
>
> The code works perfectly fine just typing everything out by hand, but that
> is very inefficient given that there are 24 chromosomes for each dataset. I
> am just looking for any suggestions as to how I can write a general version
> of this code.
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Loops-for-repetitive-task-tp3732022p3732022.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
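
A possible follow-up to the reply above, for the "replace every '2' with an 'i'" part of the question: the file names can be generated from the chromosome numbers instead of being typed out. This is only a sketch in base R (no plyr), under a few assumptions that are not in the thread: the files follow the chrN.out.txt pattern, the sample data are written to a temporary directory so the block runs on its own, and chrSumm() is restated here with quantile() and mean(), which should give the same six numbers as summary().

```r
# Sketch only. The temporary directory, the 'chrs' vector and the restated
# chrSumm() are illustrative assumptions, not part of the original thread.

# Write the two sample files from the original post so the example is self-contained.
tmp <- tempfile("chr")
dir.create(tmp)
writeLines(c("chr1 100 159 104 104 1 0.05 +",
             "chr1 100 159 145 145 1 0.04 +",
             "chr1 200 260 205 205 1 0.12 +",
             "chr1 500 750 600 600 1 0.09 +"),
           file.path(tmp, "chr1.out.txt"))
writeLines(c("chr2 100 200 105 105 1 0.03 +",
             "chr2 100 200 110 110 1 0.08 +",
             "chr2 300 400 350 350 0 0 +"),
           file.path(tmp, "chr2.out.txt"))

# One-row summary per file: same columns as in the reply above.
chrSumm <- function(d) {
    colnames(d) <- c("chr", "start", "end", "base1", "base2",
                     "totalreads", "methylation", "strand")
    TR  <- nrow(d)                    # total number of rows
    RG1 <- sum(d$totalreads >= 1)     # rows with at least one read
    q   <- unname(quantile(d$methylation, c(0, 0.25, 0.5, 0.75, 1)))
    data.frame(TR = TR, RG1 = RG1, percent = TR / RG1,
               Min = q[1], Q1 = q[2], Median = q[3],
               Mean = mean(d$methylation), Q3 = q[4], Max = q[5])
}

# Chromosome labels to process; only 1 and 2 exist in this toy example.
# On a full data set this could be something like c(1:22, "X", "Y").
chrs  <- 1:2
files <- file.path(tmp, sprintf("chr%s.out.txt", chrs))

# Keep only files that are actually present, so a missing chromosome
# does not stop the whole run.
files <- files[file.exists(files)]

# Read each file, summarise it, and stack the one-row results.
res <- do.call(rbind, lapply(files, function(f) {
    data.frame(file = basename(f),
               chrSumm(read.table(f, header = FALSE)),
               stringsAsFactors = FALSE)
}))
print(res)
```

The extra 'file' column is there only so each row can be traced back to its input; drop it if the row order is enough.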