Dear R users,

I've just started using the ff package. There is a csv file (~4 GB) with 7 columns and 6e+7 rows. I want to read only one column from the file, skipping the first 100 rows. Below I've provided the different outcomes, which should clarify my problem.

> sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
...

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] ff_2.2-7  bit_1.1-8

##---------------------------------------------------------------------------------------
## I want to read the second column only:
x.class <- c('NULL', 'numeric', 'NULL', 'NULL', 'NULL', 'NULL', 'NULL')

## The following command works fine:
> read.csv.ffdf(file=csvfile, header=FALSE, skip=100,
+               colClasses=x.class, nrows=1e3)
ffdf (all open) dim=c(1000,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
   PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix
V2           V2       double        double FALSE           FALSE
   PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
V2            FALSE                 1                1               1
   PhysicalIsOpen
V2           TRUE
ffdf data
          V2
1    -0.5412
2    -0.5842
3    -0.5920
4    -0.5451
5    -0.5099
6    -0.5021
7    -0.4943
8    -0.5490
:          :
993  -0.4865
994  -0.6584
995  -0.7482
996  -0.8732
997  -0.8303
998  -0.7248
999  -0.5490
1000 -0.4240

When I increase nrows by 1, I get a warning about the number of columns:

> read.csv.ffdf(file=csvfile, header=FALSE, skip=100,
+               colClasses=x.class, nrows=1001)
ffdf (all open) dim=c(1001,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
   PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix
V2           V2       double        double FALSE           FALSE
   PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
V2            FALSE                 1                1               1
   PhysicalIsOpen
V2           TRUE
ffdf data
          V2
1    -0.5412
2    -0.5842
3    -0.5920
4    -0.5451
5    -0.5099
6    -0.5021
7    -0.4943
8    -0.5490
:          :
994  -0.6584
995  -0.7482
996  -0.8732
997  -0.8303
998  -0.7248
999  -0.5490
1000 -0.4240
1001 -0.3849
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote,  :
  cols = 1 != length(data) = 7

Going much beyond 1000 rows fails outright:

> read.csv.ffdf(file=csvfile, header=FALSE, skip=100,
+               colClasses=x.class, nrows=1e4)
Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  more columns than column names

The question is: why?
The number of columns does not change anywhere in the file. I would appreciate any help.

Best,
Robert