On Tue, Feb 4, 2014 at 4:18 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>
> On 14-02-04 5:49 AM, Majid Einian wrote:
>>
>> Dear R Helpers,
>>
>> See the Code:
>>
>> a <- intToUtf8(1777)
>> show(a)
>> zz <- file(description="test.txt",open="w",encoding="UTF-8")
>> cat(a, file = zz)
>> close(zz)
>>
>> in a Unicode aware environment (such as RGui console or RStudio Console)
>> you will see this as output:
>>
>> [1] "۱"
>>
>>
>> but the character is not written correctly in the file test.txt (which is
>> encoded in UTF-8 without BOM) :
>>
>> <U+06F1>
>>
>> The problem seems to be this: R changes text to the locale of system (for
>> me this is Arabic Windows (Codepage 1256) that does not have a relevant
>> code for U+06F1, then changes it back to UTF-8 and writes it into file.
>> What do I miss here?
>> How can I write a Unicode string into a text file correctly?
>
>
> There are a lot of places in R where it converts strings to the local
> encoding, perhaps too many. On the other hand, maybe Windows should be
> offering UTF-8 locales by now.
I would like to see that happen too! I have no such problem on Linux.

>
> I haven't tested in your locale, but I believe writeLines() to a connection
> declared to be in a UTF-8 encoding will maintain the encoding.

writeLines() does change the encoding to system encoding and then back
to unicode just like cat().

> You can declare a file to be in encoding "UTF-8-BOM" if you want to ignore a
> BOM on input; I forget whether it will write one on output. If it doesn't,
> you can always write one explicitly.
>

I have no problem with BOM being there or not.

> I was hoping to make some progress on this before R 3.1.0 so that more cases
> of writing strings to UTF-8 files would work, but time is running out.

I hope we see this happen soon :)

Majid Einian

>
> Duncan Murdoch
>
>>
>>
>> Majid Einian,
>> Economics Researcher, Monetary and Banking Research Institute, Central Bank
>> of Islamic Republic of Iran, Tehran, IRAN
>> and
>> PhD Candidate in "Economics", Graduate School of Management and
>> Economics, Sharif University of Technology, Tehran, IRAN
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
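
A possible workaround for the question at the top of the thread, offered only as a sketch and not as something either poster proposed: since both cat() and writeLines() are reported above to pass through the system encoding, one option is to skip text-mode output and write the UTF-8 bytes directly to a connection opened in binary mode. The file name "test.txt" is taken from the original example; the read-back check at the end is added here for illustration.

```r
# Sketch: write a UTF-8 string without a round trip through the locale encoding.
a <- intToUtf8(1777)                    # U+06F1, as in the original example

con <- file("test.txt", open = "wb")    # binary mode, no encoding= argument
writeBin(charToRaw(enc2utf8(a)), con)   # write the raw UTF-8 bytes as they are
close(con)

# Read the file back as raw bytes to see exactly what was written.
bytes <- readBin("test.txt", what = "raw", n = file.info("test.txt")$size)
print(bytes)
# [1] db b1
```

The two bytes db b1 are the UTF-8 encoding of U+06F1. Note that writeBin() adds no line terminator and no BOM, so either would have to be written explicitly if needed.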