Thanks, Jan.  Unfortunately, I have huge streams of data to transmit, and they
should remain mostly human readable, so escape-encoding the entire string
isn't an option.
One workaround I found is to round-trip the string through the rjson package:

> cat(toJSON(fromJSON("\"Unicode char: \ufffd\"")))
"Unicode char: \ufffd"

but this seems like an awful hack, and it suggests that there really ought to
be a better way to escape extended characters.
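
A middle ground between escaping everything and going through rjson can be sketched in base R: escape only the code points outside the ASCII range and leave the rest untouched. The helper name escape_non_ascii is hypothetical, and the sketch assumes the consumer accepts JSON-style UTF-16 surrogate pairs for characters above U+FFFF, which I have not checked against the Ruby side. It also does not escape literal backslashes already present in the input.

```r
# Hypothetical helper: escape only non-ASCII code points as \uXXXX,
# leaving ASCII characters as they are so the output stays readable.
escape_non_ascii <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    cp <- utf8ToInt(enc2utf8(s))          # Unicode code points of the string
    out <- character(length(cp))

    # ASCII: keep the character itself.
    ascii <- cp < 128L
    out[ascii] <- intToUtf8(cp[ascii], multiple = TRUE)

    # Basic Multilingual Plane: a single four-digit \u escape.
    bmp <- !ascii & cp <= 0xFFFF
    out[bmp] <- sprintf("\\u%04x", cp[bmp])

    # Above U+FFFF: assumed to be wanted as a UTF-16 surrogate pair.
    astral <- cp > 0xFFFF
    v <- cp[astral] - 0x10000
    hi <- as.integer(0xD800 + v %/% 0x400)
    lo <- as.integer(0xDC00 + v %% 0x400)
    out[astral] <- sprintf("\\u%04x\\u%04x", hi, lo)

    paste(out, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

cat(escape_non_ascii("Unicode char: \ufffd"), "\n")
# prints: Unicode char: \ufffd
```

This keeps the bloat proportional to the number of extended characters rather than to the length of the whole string.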


> From: Jan T Kim <jtt...@googlemail.com>
> Subject: Re: [R] Writing escaped unicode
> Date: December 11, 2012 5:49:18 AM EST
> To: r-help@r-project.org
> 
> 
> On Mon, Dec 10, 2012 at 11:46:40PM -0500, David Kulp wrote:
>> I'd like to write unicode strings using the "\u" escape syntax.  According 
>> to the documentation, print.default or encodeString will escape unicode 
>> using the \u convention.  In practice, I can't make it work.
>> 
>>> b="Unicode character: \ufffd"
>>> print.default(b)
>> [1] "Unicode character: ???"
>>> encodeString(b)
>> [1] "Unicode character: ???"
>> 
>> I want to write the string back out in the same escape formatting as I read 
>> it in.  This is because I'm interfacing with some Ruby code that requires 
>> unicode to be in this escaped format.
> 
> as I read the documentation, encodeString escapes control characters,
> but not "unicode characters". The notion of a "unicode character" is
> not entirely well defined, considering that the very mission of the
> unicode consortium is to make sure that there are no non-unicode
> characters...  ;-)
> 
> From this it follows that replacing all characters with their \uxxxx
> representation, e.g. by
> 
>    paste(sprintf("\\u%04x", utf8ToInt(b)), collapse = "");
> 
> should work with the Ruby client you try to talk to. Obviously, this
> bloats the string rather more than necessary (particularly if most of
> the characters are in the ASCII range), but if the volume you're
> piping into the client is small, this may be good enough.
> 
> Best regards, Jan
> -- 
> +- Jan T. Kim -------------------------------------------------------+
> |             email: jtt...@gmail.com                                |
> |             WWW:   http://www.jtkim.dreamhosters.com/              |
> *-----=<  hierarchical systems are for files, not for humans  >=-----*
> 



