I meant you should close the file when you are done with it, not after every few lines. File descriptors are a limited resource.
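A minimal sketch of that pattern: open the connection once, read from it one line at a time, and close it only when the whole file has been processed. The helper name `count_nonempty_lines` and the sample file are illustrative assumptions, not something from this thread.

```r
# Hypothetical helper (not from the thread): count the non-empty lines
# of a file while holding only one line in memory at a time.
count_nonempty_lines <- function(path) {
  con <- file(path, "r")   # opened explicitly, so the read position persists
  on.exit(close(con))      # closed once, when we are done with the file
  n <- 0L
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0L) break   # character(0) signals end of file
    if (nzchar(line)) n <- n + 1L
  }
  n
}

# Illustrative input: a temporary file holding the letters a to j.
tf <- tempfile()
writeLines(letters[1:10], tf)
count_nonempty_lines(tf)
```

Using on.exit() means the connection is also closed if an error interrupts the loop, so the file descriptor is released either way.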
As for the rationale for the default behavior, there is a common use pattern of reading and parsing an entire file (or url, etc.), examining the results, and trying again with a different parsing scheme. In that case the default behavior works well. In any case, I assume the behavior is documented in help("file").

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Oct 29, 2014 at 9:51 AM, Thomas Nyberg <tomnyb...@gmail.com> wrote:
> Thanks for the response! I'd rather keep the file open than close it, since
> it would flush the internal buffer. The whole reason I'm doing this is to
> take advantage of the buffering and closing it would defeat the purpose.
>
> I actually just found a solution which is to open the files with the "r"
> flag explicitly. I.e. the following is what I want.
>
> -----
>
> bash $ echo 1 > testfile
> bash $ echo 2 >> testfile
> bash $ cat testfile
> 1
> 2
>
> bash $ R
> R > f <- file('testfile', 'r')
> R > readLines(f, n = 1)
> [1] "1"
> R > readLines(f, n = 1)
> [1] "2"
> R > readLines(f, n = 1)
> character(0)
>
> -----
>
> If you want to use writeLines in this same fashion you'll also need to open
> the original file with the "w" as well.
>
> It's very odd that file('filename') will let you read from it, but will not
> act the same as file('filename', 'r') when it comes to readLines. Is this a
> bug or is there some reasoning behind this? Regardless, it's certainly
> extremely unintuitive.
>
> Thanks again for the response!
>
> Cheers,
> Thomas
>
>
> On 10/29/2014 12:22 PM, William Dunlap wrote:
>>
>> Open your file object before calling readLines and close it when you
>> are done with a sequence of calls to readLines.
>>
>> > tf <- tempfile()
>> > cat(sep="\n", letters[1:10], file=tf)
>> > f <- file(tf)
>> > open(f)
>> > # or f <- file(tf, "r") instead of previous 2 lines
>> > readLines(f, n=1)
>> [1] "a"
>> > readLines(f, n=1)
>> [1] "b"
>> > readLines(f, n=2)
>> [1] "c" "d"
>> > close(f)
>>
>> I/O operations on an unopened connection generally open it, do the
>> operation, then close it.
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>>
>> On Wed, Oct 29, 2014 at 8:23 AM, Thomas Nyberg <tomnyb...@gmail.com>
>> wrote:
>>>
>>> Hi everyone,
>>>
>>> I would like to read a file line by line, but I would rather not load all
>>> lines into memory first. I've tried using readLines with n = 1, but that
>>> seems to reset the internal file descriptor's file offset after each call.
>>> I.e. this is the current behavior:
>>>
>>> -------
>>>
>>> bash $ echo 1 > testfile
>>> bash $ echo 2 >> testfile
>>> bash $ cat testfile
>>> 1
>>> 2
>>>
>>> bash > R
>>> R > f <- file('testfile')
>>> R > readLines(f, n = 1)
>>> [1] "1"
>>> R > readLines(f, n = 1)
>>> [1] "1"
>>>
>>> -------
>>>
>>> I would like the behavior to be:
>>>
>>> -------
>>>
>>> bash > R
>>> R > f <- file('testfile')
>>> R > readLines(f, n = 1)
>>> [1] "1"
>>> R > readLines(f, n = 1)
>>> [1] "2"
>>>
>>> -------
>>>
>>> I'm coming to R from a python background, where the default behavior is
>>> exactly the opposite. I.e. when you read a line from a file it is your
>>> responsibility to use seek explicitly to get back to the original position
>>> in the file (this is rarely necessary though). Is there some flag to turn
>>> off the default behavior of resetting the file offset in R?
>>>
>>> Cheers,
>>> Thomas
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.