Rob Steele wrote:
> I'm finding that readLines() and read.fwf() take nearly two hours to
> work through a 3.5 GB file, even when reading in large (100 MB) chunks.
> The Unix command wc, by contrast, processes the same file in three
> minutes. Is there a faster way to read files in R?

I use statist to convert the fixed-width data file into a CSV file,
because read.table() is considerably faster than read.fwf(). For
example:

system("statist --na-string NA --xcols collist big.txt big.csv")
bigdf <- read.table(file = "big.csv", header=T, as.is=T)
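If the column types are known in advance, read.table() can usually be
pushed further by declaring them up front; a minimal sketch, assuming
(hypothetically) that big.csv holds one integer and two numeric
columns:

# Assumed column types -- adjust to the real data. Supplying
# colClasses lets read.table() skip per-column type guessing,
# which is noticeable on multi-gigabyte files.
bigdf <- read.table(file = "big.csv", header = TRUE, as.is = TRUE,
                    colClasses = c("integer", "numeric", "numeric"))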

The file collist is a plain-text file with one line per column of
big.txt. Each line has the form:

variable begin end

where "variable" is the column name, and "begin" and "end" are
integers giving the character positions in big.txt at which that
column begins and ends.
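For instance, a collist describing a file with three fixed-width
fields might look like this (the names and positions are purely
illustrative, not from Rob's data):

id      1  8
age     9 11
income 12 20

The same file can also be written from within R; a minimal sketch
under the same assumed layout:

# Hypothetical column layout -- adjust names and positions to the
# real fixed-width file before use.
layout <- data.frame(variable = c("id", "age", "income"),
                     begin    = c(1, 9, 12),
                     end      = c(8, 11, 20))
# Write one "variable begin end" line per column, no header.
write.table(layout, file = "collist", quote = FALSE,
            row.names = FALSE, col.names = FALSE)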

Statist can be downloaded from: http://statist.wald.intevation.org/

-- 
Jakson Aquino
Social Sciences Department
Federal University of Ceará, Brazil
