On 04/12/2014, 2:00 PM, Richard Cotton wrote:
> If I type a character using \U syntax that has more than 4 digits, I
> get the wrong character. For example,
>
> "\U1d4d0"
>
> should print a mathematical bold script capital A. See
> http://www.fileformat.info/info/unicode/char/1d4d0/index.htm
>
> On my machine, it prints the Hangul character corresponding to
>
> "\Ud4d0"
> http://www.fileformat.info/info/unicode/char/d4d0/index.htm
>
> It seems that the hex-digit part is overflowing at 16^4.
>
> I tested this on R3.1.2 and devel (2014-12-03 r67101) x64 under
> Windows. I played around with Sys.setlocale and options("encoding"),
> but couldn't get the expected value.
>
> Can others reproduce this? It feels like a bug, but experience tells
> me I probably have something silly going on with my setup.
>
I see this on Windows, but not on OSX.

On Windows:

 > as.hexmode(utf8ToInt("\U1d4d0"))
[1] "d4d0"

On OSX:

 > as.hexmode(utf8ToInt("\U1d4d0"))
[1] "1d4d0"

I'll see if I can find where the truncation is happening on Windows.

Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
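
The two values reported in this thread differ exactly as they would if the code point were reduced modulo 16^4. The sketch below only checks that arithmetic. It does not show where the truncation happens in R, and the variable names `requested` and `truncated` are illustrative, not taken from the thread.

```r
# Code point the user asked for: U+1D4D0.
requested <- 0x1d4d0L

# Keep only the low 16 bits, i.e. reduce modulo 16^4 = 0x10000.
# This is an assumption about the kind of truncation involved,
# matching the "overflowing at 16^4" guess in the original report.
truncated <- bitwAnd(requested, 0xFFFFL)

# Show the result in hexadecimal for comparison with the Windows output.
print(format(as.hexmode(truncated)))
```

If the result matches the "d4d0" seen on Windows, that is consistent with the value passing through a 16-bit quantity somewhere along the way, though the check alone does not establish the cause.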