Hi Tomas,

Sorry for the delayed response. I have tested the problem on the latest R-devel 
build (2018-04-27 r74651), and it still exists. RGui is always fine with 
Chinese characters, but some IDEs rely on the CLI version of R (e.g. Visual 
Studio Code with the R plugin).

>Your example  print("ABC\u4f60\u597dDEF") is printing two Chinese characters, 
>right?
Yes: U+4F60 and U+597D, which are C4E3 and BAC3 in CP936.

>Could you reproduce the problem with printing just one of the characters, say 
>print("ABC\u4f60DEF") ?
Yes. The console output is pasted in [ https://paste.ubuntu.com/p/TYgZWhdgXK/ ] 
(to avoid gibberish in e-mail).
The Active Code Page is 936 before and after running Rterm.

>As a sanity check - does this display the correct characters in RGui?
Yes.

>If you take the sequence of the "fputc" commands you captured by the debugger, 
>and create a trivial console application to just run them - would the 
>characters display correctly in the same terminal from which you run R.exe?
Yes. I created a Win32 Console Application in VS [ 
https://paste.ubuntu.com/p/h3NFV6nQvs/ ], and all the characters were displayed 
correctly both ways. The WriteConsoleA variant uses the current console CP 
settings, and it should behave like fputc.
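
For readers who cannot open the paste, a minimal sketch of the fputc variant 
might look like the following. It is not the code from the paste, only an 
illustration under stated assumptions: the names SAMPLE and put_bytes are 
hypothetical, and the byte values are the CP936 encoding of U+4F60 quoted 
above (C4 E3) surrounded by "ABC" and "DEF".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sample: "ABC", then U+4F60 as CP936 bytes C4 E3, then "DEF". */
const unsigned char SAMPLE[] = { 'A', 'B', 'C', 0xC4, 0xE3, 'D', 'E', 'F' };

/*
 * Write len bytes to out one at a time with fputc, the way the debugger
 * trace showed single-byte writes. Returns the number of bytes written,
 * which is less than len if fputc reports EOF.
 */
size_t put_bytes(FILE *out, const unsigned char *bytes, size_t len)
{
    size_t i;

    assert(out != NULL && bytes != NULL);
    for (i = 0; i < len; i++) {
        if (fputc(bytes[i], out) == EOF)
            break;
    }
    return i;
}
```

Calling put_bytes(stdout, SAMPLE, sizeof SAMPLE) from main sends the eight 
bytes to stdout unchanged; whether the two middle bytes show up as one 
Chinese character depends on the console's active code page being 936.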

I guess that Rterm uses its own console I/O mechanism, so the 2nd parameter of 
fputc is not the stdout handle. (I tried to read the source but was unable to 
figure out how it works.) The crash in mbcs_get_next, which was also mentioned 
in the previous post, may be related to this mechanism.

If you need further information, please let me know.

Thanks,
i...@azurefx.name


Tomas Kalibera <tomas.kalib...@gmail.com> 2018/4/5 22:42
> 
>
>Thank you for the report and initial debugging. I am not sure what is going 
>wrong, we may have to rely on your help to debug this (I do not have a system 
>to reproduce on). A user-targeted advice would be to use RGui (Rgui.exe).
>
>Does the problem also exist in R-devel?
>https://cran.r-project.org/bin/windows/base/rdevel.html
>
>Your example  print("ABC\u4f60\u597dDEF") is printing two Chinese characters, 
>right? The first one is C4E3 in CP936 (4F60 in Unicode) and the second one is 
>BAC3 in CP936 (597D in Unicode)? Could you reproduce the problem with printing 
>just one of the characters, say print("ABC\u4f60DEF") ?
>
>As a sanity check - does this display the correct characters in RGui? It 
>should, and does on my system, as RGui uses Unicode internally. By correct I 
>mean the characters shown e.g. here
>
>https://msdn.microsoft.com/en-us/library/cc194923.aspx
>https://msdn.microsoft.com/en-us/library/cc194920.aspx
>
>What is the output of "chcp" in the terminal, before you run R.exe? It may be 
>different from what Sys.getlocale() gives in R.
>
>If you take the sequence of the "fputc" commands you captured by the debugger, 
>and create a trivial console application to just run them - would the 
>characters display correctly in the same terminal from which you run R.exe?
>
>Thanks
>Tomas
>
>
>
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
