Hello Rolf Turner,

On Sat, 3 Jul 2021 14:02:59 +1200
Rolf Turner <r.tur...@auckland.ac.nz> wrote:
> Can anyone suggest how I might get my plot_ascii() function working
> again? Basically, it seems to me, the question is: how do I persuade
> R to read in "\260" as "\ub0" rather than "\xb0"?

Part of the problem is that the "\xb0" byte is not in ASCII, which
covers only the lower half of possible 8-bit bytes. I guess that the
strings containing bytes with highest bit set used to be interpreted as
Latin-1 on your machine, but now get interpreted as UTF-8, which
changes their meaning (in UTF-8, the highest bit being set indicates
that there will be more bytes to follow, making the string invalid if
there is none).

The good news is, since it's Latin-1, which is natively supported by R,
there are even multiple options:

1. Mark the string as Latin-1 by setting Encoding(a) <- 'latin1' and
   let R do the re-encoding if and when Pango asks it for a
   UTF-8-encoded string.

2. Decode Latin-1 into the locale encoding by using
   iconv(a, 'latin1', '') (or set the third parameter to 'UTF-8', which
   would give almost the same result on a machine with a UTF-8 locale).
   The result is, again, a string where Encoding(a) matches the truth.

   Explicitly setting UTF-8 may be preferable on Windows machines
   running pre-UCRT builds of R where the locale encoding may not
   contain all Latin-1 characters, but that's not a problem for you, as
   far as I know.

For any encoding other than Latin-1 or UTF-8, option (2) is still valid.

I have verified that your example works on my GNU/Linux system with a
UTF-8 locale if I use either option.

--
Best regards,
Ivan
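
For reference, the two options described above can be sketched as
follows. This is only a minimal illustration, assuming the string was
entered as "\260" (the degree sign in Latin-1); the names a, a1 and a2
are placeholders and are not taken from plot_ascii().

```r
# A single byte 0xb0 with no declared encoding (assumed input).
a <- "\260"

# Option 1: declare the existing byte to be Latin-1.
# The bytes stay the same; only the encoding mark changes, and R
# re-encodes on demand (enc2utf8() forces that here for inspection).
a1 <- a
Encoding(a1) <- "latin1"
a1_utf8 <- enc2utf8(a1)

# Option 2: convert the bytes from Latin-1 to UTF-8.
# Passing "" as the third argument would target the locale encoding.
a2 <- iconv(a, "latin1", "UTF-8")

# Both results should now denote the code point U+00B0.
utf8ToInt(a1_utf8) == 0xB0
utf8ToInt(a2) == 0xB0
```

Option 1 leaves the stored bytes alone and relies on R to translate
later, while option 2 changes the bytes immediately, which may be
easier to reason about when the string is passed on to other code.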