Stavros Macrakis <macrakis <at> alum.mit.edu> writes:

> read.table gives idiosyncratic results when the input is formatted
> strangely, for example:
>
> read.table(textConnection("a'b\nc'd\n"), header=FALSE, fill=TRUE, sep="", quote="'")
> => "c'd" "a'b" "c'd"
>
> read.table(textConnection("a'b\nc'd\nf'\n'\n"), header=FALSE, fill=TRUE, sep="", quote="'")
> => "f'" "\na" "b" "c'd" "f'" "\n"
>
> Though read.table doesn't specify the syntax of its input precisely, these
> results don't seem particularly useful or consistent.
>
> Is there a stricter version of read.table (perhaps in a package) that gives
> errors or warnings if it finds quotation marks in the middle of fields or
> encounters other such peculiar situations?

I dissected this behavior a bit more here
<https://stat.ethz.ch/pipermail/r-devel/2010-November/059016.html>
(it is due to an inconsistency between the way that scan() and readLines()
handle lines with unterminated quotes, IIRC), and Martin Maechler said
<https://stat.ethz.ch/pipermail/r-devel/2010-November/059107.html>:

  "I think it can be defended to file as a bug, but it is tricky to
  pinpoint exactly what the issue is."

I don't know of a stricter version of read.table(), but if you had the time
and inclination to pick through the code and (i) provide a careful
definition of the desired behavior and (ii) supply patches, you could do
your little bit to make R better. (If I posted a bug report, would you
annotate it with a discussion of the desired behavior?)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
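
For what it's worth, a pre-check along the following lines might serve as a stopgap for the "stricter read.table" question. This is only a sketch under stated assumptions: check_quotes() is a hypothetical helper (not in base R or in any package I know of), it assumes whitespace-separated fields as with sep="", and it assumes the quote character is ' or " so that it can be pasted into a regular expression unescaped. It flags every line that still contains a quote character once well-formed quoted fields (opening quote at the start of a field, closing quote at the end, no quote in between) have been removed.

```r
# Hypothetical helper, not part of base R: report lines whose quoting looks
# suspicious before the text is handed to read.table().
# Assumes whitespace-separated fields and quote = "'" or "\"".
check_quotes <- function(lines, quote = "'") {
  stopifnot(is.character(lines), nchar(quote) == 1L)
  # A well-formed quoted field: starts at a field boundary, contains no
  # embedded quote, and is followed by whitespace or the end of the line.
  well_formed <- sprintf("(^|[[:space:]])%s[^%s]*%s(?=[[:space:]]|$)",
                         quote, quote, quote)
  stripped <- gsub(well_formed, "\\1", lines, perl = TRUE)
  # Any quote left over is mid-field or unterminated.
  bad <- which(grepl(quote, stripped, fixed = TRUE))
  if (length(bad) > 0L) {
    warning(sprintf("suspicious quoting on line(s): %s",
                    paste(bad, collapse = ", ")), call. = FALSE)
  }
  bad
}

# Usage on the first example from the original post.
con <- textConnection("a'b\nc'd\n")
lines <- readLines(con)
close(con)
bad <- check_quotes(lines)  # warns; both lines have a quote in mid-field
```

Because it only inspects the raw lines, this does not depend on how scan() or readLines() treat unterminated quotes; it merely refuses to vouch for input where that treatment would matter. Whether such lines should be an error rather than a warning is part of the "careful definition of desired behavior" that still needs writing down.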