[Rd] Unicode whitespace

hadley wickham Fri, 04 Jan 2008 10:13:58 -0800

It would be nice if R ignored more Unicode whitespace characters.
For example, if I have "\u2028" in a command (which I get from a
line break in Keynote), I get the following error:


> qplot(carat, price, data = diamonds,    colour=clarity)
Error: unexpected input in "qplot(carat, price, data = diamonds, ?"

I also occasionally run into this problem when copying and pasting
from emails.

Wikipedia lists the following code points as whitespace (I'm sure there
is a more definitive reference, but I could not find one with some
quick googling):

U+0009-U+000D (control characters, including TAB, LF and CR)
U+0020 SPACE
U+0085 NEXT LINE (NEL)
U+00A0 NO-BREAK SPACE (NBSP)
U+1680 OGHAM SPACE MARK
U+180E MONGOLIAN VOWEL SEPARATOR
U+2000-U+200A (different sorts of spaces)
U+2028 LINE SEPARATOR
U+2029 PARAGRAPH SEPARATOR
U+202F NARROW NO-BREAK SPACE
U+205F MEDIUM MATHEMATICAL SPACE
U+3000 IDEOGRAPHIC SPACE

Would it be possible for R to treat all of these in the same way? (Or
does it already, and my R is misconfigured?)
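
In the meantime, a workaround might be to normalise pasted text before
parsing it. The following is only a rough sketch, not a proposal for how
R's parser should do it: normalize_ws() and unicode_ws are names made up
for illustration, the code points are the non-ASCII ones from the list
above, and mapping every one of them (including the line and paragraph
separators) to a plain space is a simplifying assumption. It also
assumes a session where UTF-8 strings are handled correctly.

```r
# Non-ASCII code points from the list above (ASCII whitespace is left alone).
# 'unicode_ws' is a hypothetical name used only for this sketch.
unicode_ws <- c(0x0085, 0x00A0, 0x1680, 0x180E, 0x2000:0x200A,
                0x2028, 0x2029, 0x202F, 0x205F, 0x3000)

# Replace each listed character with an ordinary space.
# Assumption: a plain space is an acceptable stand-in even for
# U+2028/U+2029, which arguably should become newlines instead.
normalize_ws <- function(code) {
  old <- intToUtf8(unicode_ws)            # one string containing all listed characters
  new <- strrep(" ", length(unicode_ws))  # the same number of plain spaces
  chartr(old, new, enc2utf8(code))
}

# Example: a command containing a line separator and a no-break space.
cmd <- "sum(1,\u2028 2,\u00a0 3)"
eval(parse(text = normalize_ws(cmd)))
```

This only helps for code that is passed through parse(text = ...) or
cleaned up before pasting; it does not change what the console accepts.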

Hadley

-- 
http://had.co.nz/
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
