Yes, your solution of setting quote="" would read the multi-word strings incorrectly. A more complicated version of your solution should work: First check which columns are identified as strings, and then apply your solution to the remaining columns.
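A minimal sketch of that more complicated version, in case it helps. It assumes the file uses an explicit separator (comma or tab) that never occurs inside a quoted field, so that reading with quote="" cannot split a multi-word string. The helper name read_quoted_factors is made up for illustration and is not an existing R function. The colClasses vector is deliberately left unnamed, because read.table matches a named colClasses against column names.

```r
# Hypothetical helper (not part of base R): read a delimited file so that
# columns of quoted numbers come back as factors.
# Assumption: 'sep' is an explicit separator that never appears inside a
# quoted field; this will not work for whitespace-separated files that
# contain quoted multi-word strings.
read_quoted_factors <- function(file, sep = ",", header = TRUE, ...) {
  # Pass 1: ordinary read, only to learn which columns are genuine strings.
  plain <- read.table(file, sep = sep, header = header,
                      stringsAsFactors = FALSE, ...)
  is_string <- unname(vapply(plain, is.character, logical(1)))

  # Pass 2: switch off quote handling so the quote characters survive as text.
  raw <- read.table(file, sep = sep, header = header, quote = "",
                    colClasses = "character", ...)
  is_quoted <- unname(vapply(raw, function(x) {
    x <- x[!is.na(x) & nzchar(x)]
    length(x) > 0 && all(grepl('^".*"$', x))
  }, logical(1)))

  # Quoted, but not a string column: treat these as the coded factors.
  col_classes <- rep(NA_character_, ncol(plain))
  col_classes[is_quoted & !is_string] <- "factor"

  # Pass 3: the real read, using the computed (unnamed) colClasses.
  read.table(file, sep = sep, header = header, colClasses = col_classes,
             stringsAsFactors = FALSE, ...)
}

# Round trip: write.table quotes the factor and character columns,
# but not the integer column.
df <- data.frame(id = 1:3,
                 grp = factor(c(2, 1, 2)),
                 city = c("New York", "Los Angeles", "Rome"),
                 stringsAsFactors = FALSE)
tmp <- tempfile(fileext = ".csv")
write.table(df, tmp, sep = ",", row.names = FALSE)
d <- read_quoted_factors(tmp)
str(d)
```

The file is read three times, so for a very large file one might pass nrows to the first two reads and detect the column types from a sample only.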
I'm a newbie at R, but it seems to me that there is a "logical inconsistency" in R: write.table puts quotes around numbers when they form a column of factors, but does not quote a column of integers. Since read.table is the "dual" of write.table, it seems that it should treat quoted and unquoted columns differently, analogously to write.table. However, there does not even seem to be an option to make read.table behave that way.

----- Original Message ----
From: peter dalgaard <pda...@gmail.com>
To: james hirschorn <j_hirsch...@yahoo.com>
Cc: r-help@r-project.org
Sent: Tue, October 5, 2010 7:25:52 AM
Subject: Re: [R] read columns of quoted numbers as factors

On Oct 4, 2010, at 18:39 , james hirschorn wrote:

> Suppose I have a data file (possibly with a huge number of columns), where the
> columns with factors are coded as "1", "2", "3", etc ... The default behavior of
> read.table is to convert these columns to integer vectors.
>
> Is there a way to get read.table to recognize that columns of quoted numbers
> represent factors (while unquoted numbers are interpreted as integers), without
> explicitly setting them with colClasses ?

I don't think there's a simple way, because the modus operandi of read.table is to read everything as character and then see whether it can be converted to numeric, and at that point any quotes will have been lost.

One possibility, somewhat dependent on the exact file format, would be to temporarily set quote="", see which columns contain quote characters, and, on a second pass, read those columns as factors, using a computed colClasses argument. It will break down if you have space-separated columns with quoted multi-word strings, though.

> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com