Re: [R] reading large csv data sets efficiently

Whit Armstrong Wed, 22 May 2013 13:50:54 -0700

http://cran.r-project.org/web/packages/data.table/index.html
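
A minimal sketch of what that might look like with fread() from data.table, assuming a reasonably recent version of the package. The toy file, its column names (id, name, junk1, junk2, x1..x4) and its values are hypothetical stand-ins for your t.csv, not taken from your data:

```r
library(data.table)

# Hypothetical toy file standing in for the real 4GB t.csv.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id,name,junk1,junk2,x1,x2,x3,x4",
  "1,a,foo,bar,1.5,C,3,4",
  "2,b,foo,bar,B,2.5,6,7"
), tmp)

# drop skips columns 3 and 4 so they are never materialised;
# na.strings treats the stray "C" and "B" codes as missing values;
# colClasses forces the four measurement columns to numeric.
d <- fread(tmp,
           drop = c(3, 4),
           na.strings = c("C", "B"),
           colClasses = list(numeric = c("x1", "x2", "x3", "x4")))

str(d)
```

Note that na.strings applies to every column, so a genuine "C" or "B" in a character column would also be read as NA. If that matters for your first two columns, check them after the read.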



On Wed, May 22, 2013 at 12:31 PM, ivo welch <ivo.we...@anderson.ucla.edu> wrote:

> I have a couple of large data sets, on the order of 4GB.  they come in .csv
> files, with about 50 columns and lots of rows.  a couple have weird NA
> values, such as "C" and "B", in numeric columns.
>
> I am wondering how good read.csv() is at dealing with this stuff on the
> first pass.
>
> d <- read.csv("t.csv", colClasses=c(NA, NA, "NULL", "NULL",
> "numeric", "numeric", "numeric", "numeric"), na.strings=c("C", "B"))
>
> does R first read the entire file and then worry about colClasses and
> na.strings, or does it handle this line by line as it goes?
>
> (if it does the former, I can write a perl pre-filter)
>
> /iaw
>
> ----
> Ivo Welch (ivo.we...@gmail.com)
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

