Hello, I have an input file, testOut.txt (http://r.789695.n4.nabble.com/file/n3700031/testOut.txt)
where column 1 is the chromosome, column 2 is the start of the region, column 3 is the end of the region, columns 4 and 5 are the base position, column 6 is the total reads, column 7 is the methylation data, and column 8 is the strand.

I would like a summary output file such as out.summary.txt (http://r.789695.n4.nabble.com/file/n3700031/out.summary.txt), where column 1 is the chromosome, column 2 is the start of the region, column 3 is the end of the region, column 4 is total reads in general, column 5 is total reads >= 1, and column 6 is (col4/col5), i.e. the percentage. At the end I'd like six more columns holding the results of R's summary() function, applied to all of the methylation data (column 7 of the input) within each region (bounded by columns 2 and 3).

For example, for chr1 100 159, summary() gives:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0400  0.0425  0.0450  0.0450  0.0475  0.0500

which is simply the result of passing the methylation data for the region chr1 100 159 to summary().

I know how to perform each of the required steps line by line; the hard part for me is taking input data with multiple positions per region and writing all of the summary results to a single line identified by the region. If any of you have suggestions, I would appreciate it.
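
One possible way to get one line per region is to split the rows by (chromosome, start, end) and build a one-row data frame for each group. The sketch below is only an illustration under several assumptions: the toy data frame stands in for testOut.txt (its values are made up, apart from the two methylation values that reproduce the chr1 100 159 example), the column names are hypothetical, and columns 4 to 6 are interpreted as the number of positions in the region, the number of positions with reads >= 1, and their ratio (covered divided by total, which may differ from the col4/col5 written above).

```r
# Toy input standing in for testOut.txt; values are illustrative only.
# Column names are hypothetical; the real file has no header described.
dat <- data.frame(
  chr    = c("chr1", "chr1", "chr1", "chr1", "chr1"),
  start  = c(100, 100, 200, 200, 200),
  end    = c(159, 159, 260, 260, 260),
  pos1   = c(110, 120, 210, 220, 230),
  pos2   = c(110, 120, 210, 220, 230),
  reads  = c(25, 20, 0, 10, 5),
  meth   = c(0.04, 0.05, 0.10, 0.20, 0.60),
  strand = c("+", "+", "-", "-", "-"),
  stringsAsFactors = FALSE
)

# One region = one unique (chr, start, end) combination.
# Using unique() for the factor levels keeps regions in input order.
region_key <- paste(dat$chr, dat$start, dat$end, sep = ":")
region_key <- factor(region_key, levels = unique(region_key))
by_region  <- split(dat, region_key)

# Collapse all rows of one region into a single summary row.
summarise_region <- function(d) {
  # summary() on a numeric vector without NAs returns six values:
  # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
  s <- as.numeric(summary(d$meth))
  n_total   <- nrow(d)            # assumed meaning of output column 4
  n_covered <- sum(d$reads >= 1)  # assumed meaning of output column 5
  data.frame(
    chr = d$chr[1], start = d$start[1], end = d$end[1],
    n_total = n_total,
    n_covered = n_covered,
    frac_covered = n_covered / n_total,  # assumed meaning of column 6
    Min = s[1], Q1 = s[2], Median = s[3],
    Mean = s[4], Q3 = s[5], Max = s[6]
  )
}

# Stack the one-row data frames and write a tab-delimited summary file.
out <- do.call(rbind, lapply(by_region, summarise_region))
rownames(out) <- NULL
write.table(out, "out.summary.txt", sep = "\t",
            quote = FALSE, row.names = FALSE)
```

With the real file, the toy data frame would be replaced by something like read.table("testOut.txt") followed by assigning column names. If the methylation column can contain missing values, summary() returns a seventh element (the NA count), so the indexing by position would need adjusting.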