On Sat, 3 Jul 2021 09:40:28 +0200 Ivan Krylov <krylov.r...@gmail.com> wrote:
> Hello Rolf Turner,
>
> On Sat, 3 Jul 2021 14:02:59 +1200
> Rolf Turner <r.tur...@auckland.ac.nz> wrote:
>
> > Can anyone suggest how I might get my plot_ascii() function working
> > again?  Basically, it seems to me, the question is:  how do I
> > persuade R to read in "\260" as "\ub0" rather than "\xb0"?
>
> Part of the problem is that the "\xb0" byte is not in ASCII, which
> covers only the lower half of possible 8-bit bytes. I guess that the
> strings containing bytes with highest bit set used to be interpreted
> as Latin-1 on your machine, but now get interpreted as UTF-8, which
> changes their meaning (in UTF-8, the highest bit being set indicates
> that there will be more bytes to follow, making the string invalid if
> there is none).
>
> The good news is, since it's Latin-1, which is natively supported by
> R, there are even multiple options:
>
> 1. Mark the string as Latin-1 by setting Encoding(a) <- 'latin1' and
> let R do the re-encoding if and when Pango asks it for a UTF-8-encoded
> string.
>
> 2. Decode Latin-1 into the locale encoding by using iconv(a, 'latin1',
> '') (or set the third parameter to 'UTF-8', which would give almost
> the same result on a machine with a UTF-8 locale). The result is,
> again, a string where Encoding(a) matches the truth. Explicitly
> setting UTF-8 may be preferable on Windows machines running pre-UCRT
> builds of R where the locale encoding may not contain all Latin-1
> characters, but that's not a problem for you, as far as I know.
>
> For any encoding other than Latin-1 or UTF-8, option (2) is still
> valid.
>
> I have verified that your example works on my GNU/Linux system with a
> UTF-8 locale if I use either option.

Thanks Ivan.  That solves most of the problem, but there are still
glitches.
I get a plot OK, but a substantial number of the characters are
displayed as a wee rectangle containing a 2 x 2 array of digits such as

    0 0
    8 0

Also note that there is a bit of difference between the results of
using Encoding() and the results of using iconv().  E.g. if I do

    a <- "\x80"
    b <- iconv(a,"latin1","UTF-8")
    Encoding(a) <- "latin1"

then when I type "a" I get the Euro symbol "€", but when I type "b" I
get the string "\u0080".  But that doesn't really matter.

More problematic is the fact that if I do either

    plot(0,0,type="n",xlim=c(0,1),ylim=c(0,1),ann=FALSE,axes=FALSE)
    text(0.5,0.5,labels=a,cex=6)

or

    plot(0,0,type="n",xlim=c(0,1),ylim=c(0,1),ann=FALSE,axes=FALSE)
    text(0.5,0.5,labels=b,cex=6)

then I get a wee rectangle with 0 0 8 0 arranged in a 2 x 2 array
inside.  (Setting cex=6 makes it easier for my ageing eyes to see what
the digits are.)

Is there any way that I can get the Euro symbol to display correctly in
such a graphic?

Thanks.

cheers,

Rolf

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
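
The "0 0 8 0" box and the differing results for "a" and "b" both come down to which code point the byte 0x80 is decoded to. The following is a minimal sketch, not a verified fix for the plot above, that decodes that one byte under two encodings and prints the resulting code points. The variable names are made up for illustration, and it assumes the local iconv accepts the names "ISO-8859-1" and "CP1252".

```r
# Sketch only: how the single byte 0x80 decodes under two encodings.
# Assumption: the local iconv knows the names "ISO-8859-1" and "CP1252".
byte80 <- rawToChar(as.raw(0x80))

as_latin1 <- iconv(byte80, from = "ISO-8859-1", to = "UTF-8")
as_cp1252 <- iconv(byte80, from = "CP1252", to = "UTF-8")

# Unicode code points of the two decoded strings.
cp_latin1 <- utf8ToInt(as_latin1)  # ISO-8859-1 maps 0x80 to U+0080 (a C1 control)
cp_cp1252 <- utf8ToInt(as_cp1252)  # Windows-1252 maps 0x80 to U+20AC (Euro sign)

print(sprintf("ISO-8859-1: U+%04X", cp_latin1))
print(sprintf("CP1252:     U+%04X", cp_cp1252))
```

If the Euro sign is what is wanted, the CP1252 reading (or writing "\u20ac" directly) would be the string to try passing to text(). Whether a glyph is then drawn still depends on the font used by the graphics device, which this sketch does not check.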