Hello, I have a tab-delimited text document with multiple lines, as shown below:

#FILE FORMAT
#Book  bookname  author  publisher  pages
#CD  name  content
####################################################################################################
----------------------------------------------------------------------
Book  bioR  xxx  abc publishers  230
CD  biorexamples  chapter5
----------------------------------------------------------------------
Book  bioc++  mmm  tata publishers  400
CD  samples  workexamples
CD  data  experiments
----------------------------------------------------------------------
Book  management tools  aaa  some publishers  200
----------------------------------------------------------------------

Here the labels "Book" and "CD" mark the rows in each block. I would like to create a data frame with two columns, named "bookname" and "content". Using grep it is possible to pick specific rows (for example, grep("^Book", readLines(filename))), but my programming expertise is too limited to build the data frame described above.

Note: the "Book" row is present in every block, but the number of "CD" rows varies (i.e., some blocks have two and some have none, as shown above).

Please help me create something like this:

    bookname          content
[1] bioR              chapter5
[2] bioc++            workexamples, experiments
[3] management tools  NA

Thanks in advance,
karthick
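
One possible approach to the request above, offered only as a minimal sketch in base R. It assumes that the fields really are separated by single tab characters, that every record row starts with either "Book" or "CD", and that each "CD" row belongs to the most recent "Book" row above it. The function name parse_books and the sample vector txt are hypothetical names introduced for illustration; the separator lines are shortened in the sample because they are discarded anyway.

```r
# Sample input mirroring the post; fields are ASSUMED to be tab-separated.
txt <- c(
  "#FILE FORMAT",
  "#Book\tbookname\tauthor\tpublisher\tpages",
  "#CD\tname\tcontent",
  "####################",
  "----------",
  "Book\tbioR\txxx\tabc publishers\t230",
  "CD\tbiorexamples\tchapter5",
  "----------",
  "Book\tbioc++\tmmm\ttata publishers\t400",
  "CD\tsamples\tworkexamples",
  "CD\tdata\texperiments",
  "----------",
  "Book\tmanagement tools\taaa\tsome publishers\t200",
  "----------"
)

# parse_books is a hypothetical helper, not part of any package.
parse_books <- function(lines) {
  # Keep only record rows; this drops "#" comments and "---" separators.
  lines <- lines[grepl("^(Book|CD)\t", lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  is_book <- vapply(fields, function(f) f[1] == "Book", logical(1))

  # Each "Book" row opens a new block, so the running count is a block id.
  block <- cumsum(is_book)

  # Second field of every Book row is the book name.
  bookname <- vapply(fields[is_book], function(f) f[2], character(1))

  # Third field of every CD row, together with the block it belongs to.
  cd_content <- vapply(fields[!is_book], function(f) f[3], character(1))
  cd_block <- block[!is_book]

  # Collapse the CD contents per block; blocks without a CD row get NA.
  content <- vapply(seq_along(bookname), function(i) {
    x <- cd_content[cd_block == i]
    if (length(x) == 0) NA_character_ else paste(x, collapse = ", ")
  }, character(1))

  data.frame(bookname = bookname, content = content,
             stringsAsFactors = FALSE)
}

result <- parse_books(txt)
print(result)
```

For a real file, the same function could be called as parse_books(readLines("books.txt")), where "books.txt" is a placeholder file name. Using cumsum() on the "Book" rows avoids an explicit loop over blocks, and the third row of the result has NA in the content column because that block has no "CD" row.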