Thanks Henrique... giving it a try now, but it'll take a good while, given the file size.
Cheers,
b

On 27 March 2012 02:35, Henrique Dallazuanna <www...@gmail.com> wrote:
> Benilton,
>
> Try this:
>
> read.table(textConnection(gsub('","', "','", gsub('^\"|\"$', "'",
> readLines('../teste.csv')))), sep = ',', quote = "'", header = TRUE)
>
> On Mon, Mar 26, 2012 at 8:09 PM, Benilton Carvalho
> <beniltoncarva...@gmail.com> wrote:
> > I need to read in csv files, created by 3rd party, with fields
> > containing single quotes (as shown below).
> >
> > "header1","header2","header3","header4"
> > "field1r1","field2r1","field3r1","field4r1"
> > "field1r2","field2r2","field3r2PartA), field3r2PartB Very" Long","field4r2"
> > "field1r3","field2r3","field3r3","field4r3"
> >
> >
> > read.csv(filename, quote="\"'", header=TRUE) won't read the file
> > represented above, unless the 3rd line has Very"" (double quotes)
> > instead of Very" (single quotes)... and this is documented (scan() man
> > page).
> >
> > Assuming that the creation of such csv files is something I'm not in a
> > position to interfere with, are there (preferably, "all in R")
> > suggestions on how to handle such task?
> >
> > For the moment, I'm using my poor man's solution (below), but any
> > tricks that would simplify this task would be great.
> >
> > Thank you very much,
> >
> > benilton
> >
> >
> > parser <- function(fname, header=TRUE, stringsAsFactors=FALSE){
> >   txt <- readLines(fname)
> >   txt <- gsub("^\"|\"$", "", txt)
> >   txt <- strsplit(txt, "\",\"")
> >   txt <- do.call(rbind, lapply(txt, function(x) gsub("\"", "\"\"", x)))
> >   if (header){
> >     nms <- txt[1,]
> >     txt <- txt[-1,]
> >   }
> >   txt <- as.data.frame(txt, stringsAsFactors=stringsAsFactors)
> >   if (header) names(txt) <- nms
> >   txt
> > }
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
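
For reference, here is a minimal sketch of a third "all in R" route, combining the two ideas quoted above: keep the double quote as the quoting character, but double every stray inner quote first so that read.csv() accepts the lines. The helper name repair_quotes is hypothetical (it is not from the thread or from any package), and the sketch assumes that every field is wrapped in double quotes, that fields are separated by exactly "," (quote, comma, quote), and that no field contains that three-character sequence or an embedded newline.

```r
# Hypothetical helper, not from the thread: double stray inner quotes so
# that read.csv() can parse the lines with the usual quote = "\"".
# Assumes every field is quoted and "," never occurs inside a field.
repair_quotes <- function(lines) {
  inner <- gsub('^"|"$', "", lines)                  # drop the outer quotes
  inner <- gsub('","', "\001", inner, fixed = TRUE)  # protect field separators
  inner <- gsub('"', '""', inner, fixed = TRUE)      # double the stray quotes
  inner <- gsub("\001", '","', inner, fixed = TRUE)  # restore the separators
  paste0('"', inner, '"')                            # put the outer quotes back
}

# Example data copied from the original post (third row has a stray quote).
lines <- c(
  '"header1","header2","header3","header4"',
  '"field1r1","field2r1","field3r1","field4r1"',
  '"field1r2","field2r2","field3r2PartA), field3r2PartB Very" Long","field4r2"',
  '"field1r3","field2r3","field3r3","field4r3"'
)

# For a real file, 'lines' would come from readLines(fname) instead.
dat <- read.csv(text = repair_quotes(lines), header = TRUE,
                stringsAsFactors = FALSE)
```

All four substitutions are vectorised over the lines, and three of them use fixed = TRUE, so this avoids the per-line strsplit()/lapply() work in the parser above. Whether that is enough of a saving on a very large file is untested here.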