On Jan 25, 2013, at 1:37 PM, Tim Howard wrote:

> David,
> Thank you again for the reply. I'll try to make readLines() and strsplit() 
> work.  What bugs me is that I think it would import fine if the folks who 
> created the csv had used double quotes "" rather than an escaped quote \" for 
> those pesky internal quotes. Since that's the case, I'd think there would be 
> a solution within read.csv() ... or perhaps scan()?, I just can't figure it 
> out. 

Can you pre-process with an editor? Replace all the "," hits (the field 
separators) with something like '|'.
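
The same kind of pre-processing can be done inside R rather than in an 
editor. What follows is only a minimal sketch, assuming that every \" pair 
in the file is an escaped internal quote (an assumption about the data, not 
something confirmed in this thread). It uses a hypothetical in-memory sample 
in place of the real file, converts each \" to the doubled "" that read.csv() 
already understands, and then parses the result:

```r
# Hypothetical sample mirroring the escaped-quote file described in the thread,
# with a comma added inside one string. For a real file, something like
# lines <- readLines("test.csv") would take the place of this vector.
lines <- c(
  '"","X1","X2","X3"',
  '"1","1","string one","another string"',
  '"2","2","quotes escaped 10\' 20\\" 5\' 30\\" \\"test string, with a comma","final string"',
  '"3","3","third row","last \\" col"'
)

# Replace each literal backslash-quote pair with a doubled quote.
# fixed = TRUE makes the pattern a plain string rather than a regex.
fixed_lines <- gsub('\\"', '""', lines, fixed = TRUE)

# Parse the repaired text; the fields stay quoted, so embedded commas survive.
dat <- read.csv(text = paste(fixed_lines, collapse = "\n"),
                stringsAsFactors = FALSE)
print(dat)
```

Because the fields remain quoted, commas inside the strings are not treated 
as separators. A field that genuinely ends in a backslash would be converted 
wrongly, so this only fits data where \" always means an internal quote.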

-- 
David.
> best,
> Tim
> 
> >>> David Winsemius <dwinsem...@comcast.net> 1/25/2013 4:16 PM >>>
> 
> On Jan 25, 2013, at 11:35 AM, Tim Howard wrote:
> 
> > Great point, your fix (quote="") works for the example I gave. 
> > Unfortunately, these text strings have commas in them as well(!).  Throw a 
> > few commas in any of the text strings and it breaks again.  Sorry about not 
> > including those in the example.
> >  
> > So, I need to incorporate commas *and* quotes with the escape character 
> > within a single string.
> 
> Well you need to have _some_ delimiter. At the moment it sounds as though you 
> might end up using readLines() and strsplit( . , split="\",\"").
> 
> -- 
> david.
> 
> >  
> > Tim
> >  
> > 
> > >>> David Winsemius <dwinsem...@comcast.net> 1/25/2013 2:27 PM >>>
> > 
> > On Jan 25, 2013, at 10:42 AM, Tim Howard wrote:
> > 
> > > All,
> > > 
> > > I have some csv files I am trying to import. I am finding that quotes 
> > > inside strings are escaped in a way R doesn't expect for csv files. The 
> > > > problem only seems to rear its ugly head when there is an odd number 
> > > of internal quotes. I'll try to recreate the problem:
> > > 
> > > # set up a matrix, using escape-quote as the internal double quote mark.
> > > 
> > > x <- data.frame(matrix(data=c("1", "string one", "another string", "2", 
> > > "quotes escaped 10' 20\" 5' 30\" \"test string", "final string", 
> > > "3","third row","last \" col"),ncol = 3, byrow=TRUE))
> > > 
> > >> write.csv(x, "test.csv")
> > > 
> > > > # NOTE that write.csv correctly wrote the three internal quotes ' " ' 
> > > > # as doubled quotes ' "" '. 
> > > > # Here's what got written:
> > > 
> > > "","X1","X2","X3"
> > > "1","1","string one","another string"
> > > "2","2","quotes escaped 10' 20"" 5' 30"" ""test string","final string"
> > > "3","3","third row","last "" col"
> > > 
> > > # Importing test.csv works fine.
> > > 
> > >> read.csv("test.csv")
> > >  X X1                                         X2             X3
> > > 1 1  1                                 string one another string
> > > 2 2  2 quotes escaped 10' 20" 5' 30" "test string   final string
> > > 3 3  3                                  third row     last " col
> > > # this looks good. 
> > > > # Now please open "test.csv" with a text editor and replace all the 
> > > > # doubled quotes ' "" ' with the escaped quote ' \" ', as is found in 
> > > > # my data set. Like this:
> > > 
> > > "","X1","X2","X3"
> > > "1","1","string one","another string"
> > > "2","2","quotes escaped 10' 20\" 5' 30\" \"test string","final string"
> > > "3","3","third row","last \" col"
> > 
> > Use quote="":
> > 
> > > read.csv(text='"","X1","X2","X3"
> > + "1","1","string one","another string"
> > + "2","2","quotes escaped 10\' 20"" 5\' 30"" ""test string","final string"
> > + "3","3","third row","last "" col"', sep=",", quote="")
> > 
> > Not ...., quote="\""
> > 
> > 
> >   X.. X.X1.                                           X.X2.            X.X3.
> > 1 "1"   "1"                                    "string one" "another string"
> > 2 "2"   "2" "quotes escaped 10' 20"" 5' 30"" ""test string"   "final string"
> > 3 "3"   "3"                                     "third row"    "last "" col"
> > 
> > You will then be depending entirely on commas to separate. 
> > 
> > (Needed to use escaped single quotes to illustrate from a command line.)
> > 
> > > 
> > > # this breaks read.csv:
> > > 
> > >> read.csv("test.csv")
> > >  X X1                                                                     
> > >                X2             X3
> > > 1 1  1                                                                    
> > >         string one another string
> > > > 2 2  2 quotes escaped 10' 20\\ 5' 30\\ \\test 
> > > string,final string\n3,3,third row,last \\ col      
> > > 
> > > # we now have only two rows, with all the data captured in col2 row2
> > > 
> > > Any suggestions on how to fix this behavior? I've tried fiddling with 
> > > quote="\"" to no avail, obviously. Interestingly, an even number of 
> > > escaped quotes within a field is loaded correctly, which certainly threw 
> > > me for a while!
> > > 
> > > Thank you in advance, 
> > > Tim
> > > 
> > > 
> > 

David Winsemius
Alameda, CA, USA



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
