On 21.10.2011 23:32, Debs Majumdar wrote:
Hi,

I have been given a set of around 300 files, with 5 files corresponding to each chunk. E.g., chunk 1 for chr1 consists of these 5 files:

    chr1.one.phased.impute2.chunk1
    chr1.one.phased.impute2.chunk1_info
    chr1.one.phased.impute2.chunk1_info_by_sample
    chr1.one.phased.impute2.chunk1_summary
    chr1.one.phased.impute2.chunk1_warnings

chr1 has 47 chunks, chr2 has 42 chunks, and so on, ending at chr22 with 23 chunks.

I am using the DatABEL package to convert them to databel format with the following command, which uses two files per chunk:

    impute2databel(genofile="chr1.one.phased.impute2.chunk1",
                   samplefile="chr1.one.phased.impute2.chunk1_info",
                   outfile="chr1.chunk1", makeprob=TRUE, old=FALSE)

Is there a way to automate this so that the code goes through each chunk of each chromosome and does the conversion to databel format?
Yes, probably (all untested):

    owd <- setwd(pth)
    fls <- list.files(pattern="^chr")
    ufls <- unique(sapply(strsplit(fls, "_"), "[", 1))
    for(i in ufls){
        of <- strsplit(i, "\\.")[[1]]
        of <- paste(of[1], tail(of, 1), sep=".")
        impute2databel(genofile = i,
                       samplefile = paste(i, "info", sep="_"),
                       outfile = of, makeprob=TRUE, old=FALSE)
    }
    setwd(owd)

Uwe Ligges
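Since the loop above is marked untested, it may help to do a dry run of the name handling first. The sketch below applies the same string operations to a handful of made-up file names that follow the pattern in the question (the chr2 names are invented for illustration) and tabulates the arguments that would be passed, without calling impute2databel() at all. The data frame name plan is hypothetical.

```r
# Dry run: build the argument table without calling impute2databel().
# File names are illustrative only; in practice use
# fls <- list.files(pattern = "^chr") in the data directory.
fls <- c("chr1.one.phased.impute2.chunk1",
         "chr1.one.phased.impute2.chunk1_info",
         "chr1.one.phased.impute2.chunk1_info_by_sample",
         "chr1.one.phased.impute2.chunk1_summary",
         "chr1.one.phased.impute2.chunk1_warnings",
         "chr2.one.phased.impute2.chunk10",
         "chr2.one.phased.impute2.chunk10_info")

# Everything before the first underscore is the genotype file name,
# so the five files of a chunk collapse to one unique entry.
ufls <- unique(sapply(strsplit(fls, "_"), "[", 1))

# Output name: first dot-separated field plus the last one,
# e.g. "chr1" and "chunk1" give "chr1.chunk1".
outnames <- sapply(strsplit(ufls, "\\."),
                   function(p) paste(p[1], tail(p, 1), sep = "."))

plan <- data.frame(genofile   = ufls,
                   samplefile = paste(ufls, "info", sep = "_"),
                   outfile    = outnames,
                   stringsAsFactors = FALSE)
print(plan)
```

If the table looks right for the real file listing, the same three columns can be fed to impute2databel() row by row, as in the loop above.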
Thanks,
-Debs

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.