On 14-02-04 5:49 AM, Majid Einian wrote:
Dear R Helpers,

See the Code:

a <- intToUtf8(1777)
show(a)
zz <- file(description="test.txt",open="w",encoding="UTF-8")
cat(a, file = zz)
close(zz)

In a Unicode-aware environment (such as the RGui console or the RStudio
console) you will see this output:

[1] "۱"


But the character is not written correctly to the file test.txt (which is
encoded in UTF-8 without a BOM). The file contains:

<U+06F1>

The problem seems to be this: R converts the text to the system locale (for
me this is Arabic Windows, codepage 1256, which has no code for U+06F1),
then converts it back to UTF-8 and writes it to the file.
What am I missing here?
How can I write a Unicode string to a text file correctly?

There are a lot of places in R where it converts strings to the local encoding, perhaps too many. On the other hand, maybe Windows should be offering UTF-8 locales by now.

I haven't tested in your locale, but I believe writeLines() to a connection declared to be in a UTF-8 encoding will maintain the encoding. You can declare a file to be in encoding "UTF-8-BOM" if you want to ignore a BOM on input; I forget whether it will write one on output. If it doesn't, you can always write one explicitly.
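
As a minimal sketch of writing the BOM explicitly and keeping the string's bytes intact: the code below is an assumption-laden workaround, not something tested in the poster's locale. It swaps in a binary-mode connection and useBytes = TRUE (neither is mentioned above) so that R does not re-encode the string through the native codepage. The temporary file path is hypothetical.

```r
# The character from the original example: U+06F1
a <- intToUtf8(1777)

# Hypothetical output path; a temporary file keeps the sketch self-contained
path <- tempfile(fileext = ".txt")

# Binary mode: no encoding conversion or newline translation on output
con <- file(path, open = "wb")

# Write the UTF-8 byte-order mark explicitly (omit this for BOM-less output)
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)

# useBytes = TRUE passes the string's bytes through unchanged;
# enc2utf8() makes sure those bytes are UTF-8 (assumed workaround)
writeLines(enc2utf8(a), con, sep = "\n", useBytes = TRUE)
close(con)

# Inspect the raw bytes that ended up in the file
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
print(bytes)
```

If this works as intended, the file holds the three BOM bytes, then DB B1 (the UTF-8 encoding of U+06F1), then a line feed. Reading it back with readLines(path, encoding = "UTF-8") should return the original character, possibly preceded by the BOM.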

I was hoping to make some progress on this before R 3.1.0 so that more cases of writing strings to UTF-8 files would work, but time is running out.

Duncan Murdoch



Majid Einian,
Economics Researcher, Monetary and Banking Research Institute, Central Bank
of Islamic Republic of Iran, Tehran, IRAN
and
PhD Candidate in "Economics", Graduate School of Management and
Economics, Sharif University of Technology, Tehran, IRAN


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

