On Tue, Nov 23, 2010 at 6:05 AM, fbielejec <fbiele...@gmail.com> wrote:
> Dear,
>
> I'm doing an analysis where I need to work on relatively large (50-60 MB)
> text files, though I'm really interested only in the parts with binary
> variables (named indicators1, indicators2, ... etc.)
>
> Every text file contains other numeric columns, but not always the same
> ones and not always in the same order. I would therefore need a method
> that connects to the file and reads only the columns matching a name
> pattern (i.e. indicators + number). That should speed things up (now I
> have to clean the data by hand) and also leave a smaller memory
> footprint. Could you point me towards something?
>
This is easy using read.csv.sql:

library(sqldf)

# create test file
write.table(anscombe, "anscombe.csv", sep = ",", quote = FALSE, row.names = FALSE)

# read it back but only indicated columns
read.csv.sql("anscombe.csv", sql = "select x1, x2, y1, y2 from file")

See ?read.csv.sql and also the sqldf home page at http://sqldf.googlecode.com

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
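
The example above lists the columns by hand, while the question asks for columns selected by a name pattern. One possible way to close that gap, offered only as a rough sketch: read the header line first, pick the matching names with grep, and build the select statement from them. The test file, its column names (a, b, indicators1, indicators2) and the variable names below are made up for illustration, and the pattern assumes the columns are called "indicators" followed by digits. A base R alternative using colClasses is included so the sketch runs without sqldf installed.

```r
# Hypothetical test file: two indicator columns mixed with other numeric columns
test_data <- data.frame(
  a = c(1.5, 2.5, 3.5),
  indicators1 = c(0, 1, 0),
  b = c(10, 20, 30),
  indicators2 = c(1, 1, 0)
)
csv_file <- tempfile(fileext = ".csv")
write.table(test_data, csv_file, sep = ",", quote = FALSE, row.names = FALSE)

# Read a single row just to learn the column names
header <- names(read.csv(csv_file, nrows = 1))

# Keep the columns named "indicators" followed by a number (assumed pattern)
wanted <- grep("^indicators[0-9]+$", header, value = TRUE)

# Option 1: build the select statement for read.csv.sql from those names
sql <- paste("select", paste(wanted, collapse = ", "), "from file")
# library(sqldf)
# indicators <- read.csv.sql(csv_file, sql = sql)

# Option 2 (base R only): a colClasses entry of "NULL" makes read.csv
# skip that column, so only the wanted columns end up in the result
col_classes <- ifelse(header %in% wanted, "numeric", "NULL")
indicators <- read.csv(csv_file, colClasses = col_classes)
```

With either option the other columns never need to be cleaned out by hand, and it does not matter in which order they appear in each file. Whether this is fast enough on 50-60 MB files would have to be checked on the real data.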