Hi Utpal,

You can use the same script from my previous email.

library(stringr)

# Read the input file, one group per line
Lines1 <- readLines("groups.txt")

# Keep the group label (everything up to the colon) and append
# every "or10|<number>" entry found on that line
res <- paste(gsub("(.*\\:).*", "\\1", Lines1),
             unlist(lapply(str_match_all(Lines1, "or10\\|\\d+"), paste, collapse = " ")),
             sep = " ")
write.table(res, "res1.txt", row.names = FALSE, col.names = FALSE, quote = FALSE)

# Read the result back to check it
Lines2 <- readLines("res1.txt")
length(Lines2)
#[1] 4633
head(Lines2)
#[1] "OG1: or10|1345 or10|387 or10|474"
#[2] "OG2: or10|1562 or10|1584 or10|1977"
#[3] "OG3: or10|1636 or10|1990 or10|2257 or10|2258 or10|2499"
#[4] "OG4: or10|600"
#[5] "OG5: or10|1053 or10|2869"
#[6] "OG6: or10|2798 or10|568"

A.K.

________________________________
From: Utpal Bakshi <utpalb4...@gmail.com>
To: smartpink111 <smartpink...@yahoo.com>
Sent: Friday, April 5, 2013 1:36 PM
Subject:

Of course. The groups.txt file is my input file. I also attached the core genome program. The scripts in the myscript file use the function written by Andreas Sjodin, which is in the coregenome.R file.