On Jan 25, 2013, at 1:37 PM, Tim Howard wrote:

> David,
> Thank you again for the reply. I'll try to make readLines() and strsplit()
> work. What bugs me is that I think it would import fine if the folks who
> created the csv had used double quotes "" rather than an escaped quote \" for
> those pesky internal quotes. Since that's the case, I'd think there would be
> a solution within read.csv() ... or perhaps scan()?, I just can't figure it
> out.

Can you pre-process with an editor? Replace all the ", " hits with something like '|'.

--
David.

> best,
> Tim
>
> >>> David Winsemius <dwinsem...@comcast.net> 1/25/2013 4:16 PM >>>
>
> On Jan 25, 2013, at 11:35 AM, Tim Howard wrote:
>
> > Great point, your fix (quote="") works for the example I gave.
> > Unfortunately, these text strings have commas in them as well(!). Throw a
> > few commas in any of the text strings and it breaks again. Sorry about not
> > including those in the example.
> >
> > So, I need to incorporate commas *and* quotes with the escape character
> > within a single string.
>
> Well you need to have _some_ delimiter. At the moment it sounds as though you
> might end up using readLines() and strsplit( . , split="\\'\\,\\s\\").
>
> --
> david.
>
> > Tim
> >
> > >>> David Winsemius <dwinsem...@comcast.net> 1/25/2013 2:27 PM >>>
> >
> > On Jan 25, 2013, at 10:42 AM, Tim Howard wrote:
> >
> > > All,
> > >
> > > I have some csv files I am trying to import. I am finding that quotes
> > > inside strings are escaped in a way R doesn't expect for csv files. The
> > > problem only seems to rear its ugly head when there are an uneven number
> > > of internal quotes. I'll try to recreate the problem:
> > >
> > > # set up a matrix, using escape-quote as the internal double quote mark.
> > >
> > > x <- data.frame(matrix(data=c("1", "string one", "another string", "2",
> > > "quotes escaped 10' 20\" 5' 30\" \"test string", "final string",
> > > "3","third row","last \" col"),ncol = 3, byrow=TRUE))
> > >
> > >> write.csv(x, "test.csv")
> > >
> > > # NOTE that write.csv correctly created the three internal quotes ' " '
> > > by using double quotes ' "" '.
> > > # here's what got written
> > >
> > > "","X1","X2","X3"
> > > "1","1","string one","another string"
> > > "2","2","quotes escaped 10' 20"" 5' 30"" ""test string","final string"
> > > "3","3","third row","last "" col"
> > >
> > > # Importing test.csv works fine.
> > >
> > >> read.csv("test.csv")
> > >   X X1                                         X2             X3
> > > 1 1  1                                 string one another string
> > > 2 2  2 quotes escaped 10' 20" 5' 30" "test string   final string
> > > 3 3  3                                  third row     last " col
> > > # this looks good.
> > > # now, please go and open "test.csv" with a text editor and replace all
> > > the double quotes '""' with the
> > > # quote escaped ' \" ' as is found in my data set. Like this:
> > >
> > > "","X1","X2","X3"
> > > "1","1","string one","another string"
> > > "2","2","quotes escaped 10' 20\" 5' 30\" \"test string","final string"
> > > "3","3","third row","last \" col"
> >
> > Use quote="":
> >
> > > read.csv(text='"","X1","X2","X3"
> > + "1","1","string one","another string"
> > + "2","2","quotes escaped 10\' 20"" 5\' 30"" ""test string","final string"
> > + "3","3","third row","last "" col"', sep=",", quote="")
> >
> > Not ...., quote="\""
> >
> >   X.. X.X1.                                           X.X2.            X.X3.
> > 1 "1"   "1"                                    "string one" "another string"
> > 2 "2"   "2" "quotes escaped 10' 20"" 5' 30"" ""test string"   "final string"
> > 3 "3"   "3"                                     "third row"    "last "" col"
> >
> > You will then be depending entirely on commas to separate.
> >
> > (Needed to use escaped single quotes to illustrate from a command line.)
> >
> > >
> > > # this breaks read.csv:
> > >
> > >> read.csv("test.csv")
> > > X X1
> > > X2 X3
> > > 1 1 1
> > > string one another string
> > > 2 2 2 quotes escaped 10' 20\\ 5' 30\\ \\test
> > > string,final string\n3,3,third row,last \\ col
> > >
> > > # we now have only two rows, with all the data captured in col2 row2
> > >
> > > Any suggestions on how to fix this behavior? I've tried fiddling with
> > > quote="\"" to no avail, obviously. Interestingly, an even number of
> > > escaped quotes within a field is loaded correctly, which certainly threw
> > > me for a while!
> > >
> > > Thank you in advance,
> > > Tim

David Winsemius
Alameda, CA, USA
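
One way to act on Tim's observation that the file would import fine with doubled quotes is to rewrite each \" as "" before handing the text to read.csv(). The following is only a sketch, not something proposed in the thread: it assumes a backslash appears in the file solely as a quote escape, the inline csv_lines vector is a stand-in for readLines("test.csv"), and a comma has been added to one field to mimic the commas Tim mentions.

```r
# Stand-in for readLines("test.csv"): the problem file from the thread, with
# internal quotes written as \" and (as an assumption) a comma inside a field.
# In these single-quoted R strings, \\ is one literal backslash and \' is '.
csv_lines <- c(
  '"","X1","X2","X3"',
  '"1","1","string one","another string"',
  '"2","2","quotes, escaped 10\' 20\\" 5\' 30\\" \\"test string","final string"',
  '"3","3","third row","last \\" col"'
)

# Replace every backslash-quote pair with a doubled quote, the form that
# read.csv() accepts inside quoted fields. fixed = TRUE makes the pattern a
# literal two-character string rather than a regular expression.
fixed_lines <- gsub('\\"', '""', csv_lines, fixed = TRUE)

# Parse the repaired text; commas inside quoted fields are now left alone.
dat <- read.csv(text = paste(fixed_lines, collapse = "\n"),
                stringsAsFactors = FALSE)

print(dat)
```

With a real file, csv_lines <- readLines("test.csv") would replace the inline vector. If the data can contain a genuine backslash immediately before a closing quote, this substitution would corrupt that field, so it is worth checking the row and column counts after import.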