In the Windows cmd shell, ^ escapes the next character, so try this (assuming the data you posted is in genetest.dat in the current directory):
> readLines(pipe("findstr/b ^> genetest.dat"))
[1] ">gene A;....." ">gene B;...."

On UNIX, replace the findstr command with the corresponding grep command, making sure you escape the > appropriately for the shell you use.

On Tue, Sep 15, 2009 at 4:59 PM, J Chen <jiaxuan.c...@mdc-berlin.de> wrote:
>
> Dear all,
>
> I have DNA sequence data which are fasta-formatted as
>
>>gene A;.....
> AAAAACCCC
> TTTTTGGGG
> CCCTTTTTT
>>gene B;....
> CCCCCAAAA
> GGGGGTTTT
>
> I want to load only the lines that start with ">" where the annotation
> information for the gene is contained. In principle, I can remove the
> sequences before loading or after loading all the lines. I just wonder if
> there's a way to load only lines with a particular pattern. The skip
> argument in read.table() doesn't work for my purpose.
>
> Thanks in advance,
> Jimmy
>
> --
> View this message in context:
> http://www.nabble.com/how-to-load-only-lines-that-start-with-a-particular-symbol-tp25461693p25461693.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
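For reference, here is a rough sketch of the UNIX variant mentioned in the reply, alongside a shell-independent fallback that filters inside R. It is only an illustration under some assumptions: the sample data is written to a temporary file rather than genetest.dat, the variable names (fasta_file, headers, headers_grep) are made up for this example, and the grep branch assumes a POSIX-style shell where single quotes stop > from being treated as a redirection.

```r
# Write the sample data from the question to a temporary file
# (an assumption for this sketch; the thread uses genetest.dat).
fasta_file <- tempfile(fileext = ".dat")
writeLines(c(">gene A;.....", "AAAAACCCC", "TTTTTGGGG", "CCCTTTTTT",
             ">gene B;....", "CCCCCAAAA", "GGGGGTTTT"), fasta_file)

# UNIX variant: let grep pre-filter the file so R only sees header lines.
# The single quotes around ^> keep the shell from interpreting > itself.
if (.Platform$OS.type == "unix") {
  con <- pipe(paste("grep '^>'", shQuote(fasta_file)))
  headers_grep <- readLines(con)
  close(con)
}

# Shell-independent fallback: read every line, then keep those starting
# with ">". This loads the sequences too, so it uses more memory.
all_lines <- readLines(fasta_file)
headers <- all_lines[grepl("^>", all_lines)]
print(headers)
```

The pipe version avoids pulling the sequence lines into R at all, which is the point of the original suggestion; the fallback is only there for cases where no suitable external command is available.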