Hi Tomas,

Sorry for the delayed response. I have tested the problem on the latest R-devel build (2018-04-27 r74651), and it still exists. RGui is always fine with Chinese characters, but some IDEs rely on the CLI version of R (e.g. Visual Studio Code with the R plugin).
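
For reference, the bytes that a CP936 console should receive for the test string print("ABC\u4f60\u597dDEF") can be cross-checked outside R. The following is only a minimal sketch: Python and its built-in "cp936" codec are assumptions here and were not part of the debugging session, and the helper names cp936_hex and describe are made up for illustration.

```python
# Cross-check of the code points and CP936 bytes discussed in this thread.
# Assumption: Python's built-in "cp936" codec stands in for the Windows
# code page 936; nothing here touches R or the Windows console.

TEST_STRING = "ABC\u4f60\u597dDEF"  # same literal as the R example


def cp936_hex(text):
    """Return the CP936 encoding of `text` as upper-case hex digits."""
    return text.encode("cp936").hex().upper()


def describe(text):
    """Pair each character's Unicode code point with its CP936 bytes."""
    return [("U+%04X" % ord(ch), cp936_hex(ch)) for ch in text]


if __name__ == "__main__":
    # One line per character: code point, then its CP936 bytes in hex.
    for code_point, encoded in describe(TEST_STRING):
        print(code_point, encoded)
    # The complete byte stream for the test string, in hex.
    print(cp936_hex(TEST_STRING))
```

If the bytes captured from the fputc calls in the debugger match this sequence, the encoding step is probably fine and the problem lies in how the bytes reach the console.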

>Your example print("ABC\u4f60\u597dDEF") is printing two Chinese characters,
>right?

Yes. U+4F60 and U+597D, or C4E3 and BAC3 in CP936.

>Could you reproduce the problem with printing just one of the characters, say
>print("ABC\u4f60DEF") ?

Yes. The console output is pasted at [ https://paste.ubuntu.com/p/TYgZWhdgXK/ ] (to avoid gibberish in e-mail). The Active Code Page is 936 before and after running Rterm.

>As a sanity check - does this display the correct characters in RGui?

Yes.

>If you take the sequence of the "fputc" commands you captured by the debugger,
>and create a trivial console application to just run them - would the
>characters display correctly in the same terminal from which you run R.exe?

Yes. I created a Win32 Console Application in VS [ https://paste.ubuntu.com/p/h3NFV6nQvs/ ], and all the characters were displayed correctly in both variants. The WriteConsoleA variant uses the current console code page settings, so it should behave like fputc.

I guess that Rterm uses its own console I/O mechanism, so the second parameter of fputc is not the stdout handle. (I tried to read the source but was unable to figure out how it works.) The crash in mbcs_get_next, which was also mentioned in the previous post, may be related to this mechanism.

If you need further information, please let me know.

Thanks,
i...@azurefx.name

Tomas Kalibera <tomas.kalib...@gmail.com> 2018/4/5 22:42
>
>Thank you for the report and initial debugging. I am not sure what is going
>wrong, we may have to rely on your help to debug this (I do not have a system
>to reproduce on). A user-targeted advice would be to use RGui (Rgui.exe).
>
>Does the problem also exist in R-devel?
>https://cran.r-project.org/bin/windows/base/rdevel.html
>
>Your example print("ABC\u4f60\u597dDEF") is printing two Chinese characters,
>right? The first one is C4E3 in CP936 (4F60 in Unicode) and the second one is
>BAC3 in CP936 (597D in Unicode)?
>Could you reproduce the problem with printing
>just one of the characters, say print("ABC\u4f60DEF") ?
>
>As a sanity check - does this display the correct characters in RGui? It
>should, and does on my system, as RGui uses Unicode internally. By correct I
>mean the characters shown e.g. here
>
>https://msdn.microsoft.com/en-us/library/cc194923.aspx
>https://msdn.microsoft.com/en-us/library/cc194920.aspx
>
>What is the output of "chcp" in the terminal, before you run R.exe? It may be
>different from what Sys.getlocale() gives in R.
>
>If you take the sequence of the "fputc" commands you captured by the debugger,
>and create a trivial console application to just run them - would the
>characters display correctly in the same terminal from which you run R.exe?
>
>Thanks
>Tomas

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel