This can be narrowed down to

Sys.setlocale("LC_CTYPE","C")
x2 <- "\u00e7"
x1 <- iconv(x2, from="UTF-8", to="latin1")
x1 < x2 # FALSE or NA

In R 4.0 it returns NA; in R-devel it returns FALSE (when running in a CP1252 locale on Windows).
It is the same character, only the encoding is different, so the two strings are equal and x1 < x2 should be FALSE. The R-devel return value is therefore correct and the previous behavior was a bug. It should not matter what the current native encoding is when doing the comparison. Also, when the encodings are known, the collation order should apply only after the characters have been converted to a common encoding, so in this case the collation order of the locale should have no impact, and it seems it doesn't. I don't think R should preserve bug-compatibility in this case; code depending on this buggy behavior should be fixed.
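
A minimal sketch of what "convert to a common encoding, then compare" means at user level, assuming both strings carry a correct encoding mark. This only illustrates the idea and is not R's internal comparison code; the names u1, u2 and same_bytes are made up for the example.

```r
# Build the same character in two encodings, starting from raw bytes so the
# example does not depend on the source file encoding or the current locale.
x1 <- rawToChar(as.raw(0xe7))                    # c-cedilla as one latin1 byte
Encoding(x1) <- "latin1"                         # declare the encoding explicitly
x2 <- iconv(x1, from = "latin1", to = "UTF-8")   # same character, UTF-8 bytes

# Convert both strings to a common encoding (UTF-8 here) before comparing.
u1 <- enc2utf8(x1)
u2 <- enc2utf8(x2)

# After conversion the byte sequences can be compared directly.
same_bytes <- identical(charToRaw(u1), charToRaw(u2))

print(Encoding(c(x1, x2)))
print(same_bytes)
```

Once both strings are in the same encoding they are byte-for-byte identical, so neither can sort before the other, whatever the collation locale is.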
I don't immediately see which NEWS entry this corresponds to. Please keep in mind that NEWS doesn't cover all changes; for that you need to look at the svn commits. Even then it may be hard to trace a concrete change in behavior back to a commit; to do that you need to debug the code or bisect.
Changes to _documented_ behavior should be more visible and, of course, reflected by changes in the documentation. If they are not, that is a bug worth reporting, and the report should come with a reference to the concrete parts of the documentation that are violated.
Best
Tomas

On 5/23/20 12:03 PM, Jan Gorecki wrote:
Hi R developers,

There seems to be breaking change in base::order on Windows in R-devel. Code below yields different results on R 4.0.0 and R-devel (2020-05-22 r78545). I haven't found any info about that change in NEWS. Was the change intentional?

Sys.setlocale("LC_CTYPE","C")
Sys.setlocale("LC_COLLATE","C")
x1 = "fa\xE7ile"
Encoding(x1) = "latin1"
x2 = iconv(x1, "latin1", "UTF-8")
base::order(c(x2,x1,x1,x2))
Encoding(x2) = "unknown"
base::order(c(x2,x1,x1,x2))

# R 4.0.0
base::order(c(x2,x1,x1,x2))
#[1] 1 4 2 3
Encoding(x2) = "unknown"
base::order(c(x2,x1,x1,x2))
#[1] 2 3 1 4

# R-devel
base::order(c(x2,x1,x1,x2))
#[1] 1 2 3 4
Encoding(x2) = "unknown"
base::order(c(x2,x1,x1,x2))
#[1] 1 4 2 3

Best Regards,
Jan Gorecki

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel