Using grep in a pipeline is pretty fast, so if that is workable it is probably the way to go; however, one other possibility is to use read.csv.sql from the sqldf package. read.csv.sql allows you to specify an SQL statement that it uses to filter the data. It reads the data into a temporary SQLite database (which it sets up for you automatically, using sqlite rather than R to do the reading), applies the SQL statement, reads the presumably much smaller result into R, and finally destroys the temporary database automatically. Whether that is faster or slower than the alternatives could easily be tested, since read.csv.sql takes only one line of code. See the examples at http://sqldf.googlecode.com
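
As a rough sketch (the file name, column name, and pattern here are hypothetical, so substitute your own):

library(sqldf)

## read.csv.sql loads the file into a temporary SQLite database,
## runs the SQL filter there, and returns only the matching rows;
## the table is referred to as "file" in the SQL statement.
DF <- read.csv.sql("mydata.csv",
    sql = "select * from file where col1 like '%pattern%'")

## the grep/pipe alternative, for comparison (note that grep drops
## the header line unless the header itself matches the pattern)
DF2 <- read.csv(pipe("grep pattern mydata.csv"), header = FALSE)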
On Thu, Dec 3, 2009 at 9:34 PM, Peng Yu <pengyu...@gmail.com> wrote:
> I'm thinking of using external program 'grep' and pipe() to do so. But
> I'm wondering if there is a more efficient way to do so purely in R