On Friday, 18 October 2013 at 13:27 -0400, Earl Brown wrote:
> Thanks Duncan. However, now I can't get the Spanish and Portuguese accented
> vowels to come out correctly and still keep the indents in the saved
> document, even when I set encoding = "UTF-8":
>
> library("XML")
> concepts <- c("español", "português")
> info <- c("info about español", "info about português")
>
> doc <- newXMLDoc()
> root <- newXMLNode("tips", doc = doc)
> for (i in 1:length(concepts)) {
>     cur.concept <- concepts[i]
>     cur.info <- info[i]
>     cur.tip <- newXMLNode("tip", attrs = c(id = i), parent = root)
>     newXMLNode("h1", cur.concept, parent = cur.tip)
>     newXMLNode("p", cur.info, parent = cur.tip)
> }
>
> # accented vowels don't come through correctly, but the indents are correct:
> saveXML(doc, file = "test1.xml", indent = T)
>
> Resulting file looks like this:
> <?xml version="1.0"?>
> <tips>
>   <tip id="1">
>     <h1>español</h1>
>     <p>info about español</p>
>   </tip>
>   <tip id="2">
>     <h1>português</h1>
>     <p>info about português</p>
>   </tip>
> </tips>
>
> # accented vowels are correct, but the indents are no longer correct:
> saveXML(doc, file = "test2.xml", indent = T, encoding = "UTF-8")
>
> Resulting file:
> <?xml version="1.0" encoding="UTF-8"?>
> <tips><tip id="1"><h1>español</h1><p>info about español</p></tip><tip
> id="2"><h1>português</h1><p>info about português</p></tip></tips>
>
> I tried to workaround the problem by simply loading in that resulting
> file and saving it again:
> doc2 <- xmlInternalTreeParse(file = "test2.xml", asTree = T)
> saveXML(doc2, file = "test_word_around.xml", indent = T)
>
> but still don't get the indents.
>
> Does setting encoding = "UTF-8" override indents = TRUE in saveXML()?

I can reproduce the same issue here. Interestingly, without the 'file'
argument, the returned string does include the expected line breaks and
indentation. They only disappear when the output is written to a file.
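Since the string returned without 'file' is indented correctly, one possible workaround is to let saveXML() return the string and write it to disk yourself. The following is only a sketch, not a confirmed fix: the output file name is made up for illustration, and it assumes that the returned string already holds UTF-8 bytes (as requested by encoding = "UTF-8"), so it is written byte for byte without re-encoding.

```r
library("XML")

# Rebuild a small document with accented text, as in the example above.
concepts <- c("espa\u00f1ol", "portugu\u00eas")
doc <- newXMLDoc()
root <- newXMLNode("tips", doc = doc)
for (i in seq_along(concepts)) {
    cur.tip <- newXMLNode("tip", attrs = c(id = i), parent = root)
    newXMLNode("h1", concepts[i], parent = cur.tip)
}

# Without 'file', saveXML() returns the serialized document as a string.
xml_string <- saveXML(doc, encoding = "UTF-8", indent = TRUE)

# Write the string ourselves. useBytes = TRUE avoids re-encoding
# (assumption: the bytes are already UTF-8), and sep = "" avoids adding
# a second trailing newline.
out_file <- "test_workaround.xml"  # hypothetical file name
con <- file(out_file, open = "wb")
writeLines(xml_string, con, sep = "", useBytes = TRUE)
close(con)
```

If the indentation survives in the string, as it does in the output below, it should survive in the file too, because saveXML() is no longer the one writing to disk.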
> saveXML(doc, encoding="UTF-8", indent=T)
[1] "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<tips>\n  <tip id=\"1\">\n    <h1>español</h1>\n    <p>info about español</p>\n  </tip>\n  <tip id=\"2\">\n    <h1>português</h1>\n    <p>info about português</p>\n  </tip>\n</tips>\n"

> saveXML(doc, encoding="UTF-8", indent=T, file="test.xml")

Contents of test.xml:
<?xml version="1.0" encoding="UTF-8"?>
<tips><tip id="1"><h1>español</h1><p>info about español</p></tip><tip id="2"><h1>português</h1><p>info about português</p></tip></tips>

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=fr_FR.utf8       LC_NUMERIC=C
 [3] LC_TIME=fr_FR.utf8        LC_COLLATE=fr_FR.utf8
 [5] LC_MONETARY=fr_FR.utf8    LC_MESSAGES=fr_FR.utf8
 [7] LC_PAPER=C                LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] XML_3.96-1.1


Regards

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.