On 13-01-21 10:56 AM, Collins, Stephen wrote:
Hello,
I'm trying to read a file a few rows at a time, so as not to read the entire file into memory. From the
"connections" and "readLines" help pages and the R-help archive, it seems this should be possible with
read.csv and a file connection, using the "nrows" argument and checking whether nrow() of the
new batch is zero.
From certain posts, it seemed that read.csv should return "character(0)" when the end of
file is reached and there are no more rows to read. Instead, I get an error saying there are "no
lines available for input." Have I made a mistake with the file, or in calling read.csv?
What is the proper way to check for the end-of-file condition with read.csv, so
that I can break out of a while loop that reads the data in?
#example, make a test file
con <- file("test.csv","wt")
cat("a,b,c\n", "1,2,3\n", "4,5,6\n", "7,6,5\n", "4,3,2\n", "3,2,1\n",file=con)
unlink(con)
I don't think this is causing your problem, but unlink() seems like the
wrong function to use here. Don't you mean close()?
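To illustrate the difference, here is a minimal sketch. The temporary path and the variable names are made up for the example and are not part of the original post. close() acts on the connection, while unlink() deletes files by name.

```r
# Hypothetical temp file, used only for this illustration.
path <- tempfile(fileext = ".csv")

con <- file(path, "wt")
cat("a,b,c\n", "1,2,3\n", "4,5,6\n", sep = "", file = con)
close(con)                               # flushes and closes the connection

existed_after_close <- file.exists(path) # the file is still on disk

unlink(path)                             # removes the file itself
exists_after_unlink <- file.exists(path)
```

After close() the file is still there and can be reopened for reading; only unlink() on the file name removes it.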
#show the file is valid
con <- file("test.csv","rt")
read.csv(con,header=T)
unlink(con)
#show that readLines ends with "character(0)", as expected
con <- file("test.csv","rt")
readLines(con,n=10)
readLines(con,n=10)
unlink(con)
#show that read.csv ends with an error
con <- file("test.csv","rt")
read.csv(con,header=T,nrows=10)
read.csv(con,header=F,nrows=10)
unlink(con)
See the Value section of ?read.csv. In particular,
"Empty input is an error unless col.names is specified, when a 0-row
data frame is returned: similarly giving just a header line if header =
TRUE results in a 0-row data frame. Note that in either case the columns
will be logical unless colClasses was supplied."
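Based on that passage, one way to loop over the file in batches might look like the sketch below. It is only an illustration under a few assumptions: the temporary file, the batch size of 2, and the names cols, chunks and result are made up here. The idea is to pass col.names on every read after the first, so that hitting end-of-file yields a 0-row data frame instead of an error.

```r
# Test data matching the example in the original post (hypothetical temp path).
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5,6", "7,6,5", "4,3,2", "3,2,1"), path)

con <- file(path, "rt")

# First batch: consumes the header line plus up to 2 data rows.
batch <- read.csv(con, header = TRUE, nrows = 2)
cols <- names(batch)

chunks <- list()
while (nrow(batch) > 0) {
  chunks[[length(chunks) + 1]] <- batch
  # Later batches have no header. Because col.names is supplied,
  # empty input gives a 0-row data frame, which ends the loop.
  batch <- read.csv(con, header = FALSE, nrows = 2, col.names = cols)
}
close(con)

result <- do.call(rbind, chunks)
unlink(path)
```

As the help text notes, the columns of the final 0-row data frame are logical unless colClasses is supplied, so the empty batch is dropped here rather than combined with the others. Supplying colClasses on every read would also keep column types consistent between batches.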
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.