Hi Paul,

> I do have a qualm in that coreutils (and I assume others) interpret
> !hard_locale (LC_COLLATE) as meaning that the locale is unibyte and
> uses native byte comparison.

Isn't this warranted by section "LC_COLLATE Category in the POSIX Locale"
in <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html> ?

> As I recall on some platforms (macOS maybe?), the C locale uses
> UTF-8 so this interpretation isn't correct.

UTF-8 has the nice property that byte-per-byte comparison and
codepoint-per-codepoint comparison are equivalent. If the encoding was
not UTF-8, but e.g. GB18030, I would agree that there is a problem. But
there is no C locale with GB18030 encoding on any platform.

Bruno
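
P.S. To illustrate the UTF-8 property mentioned above, here is a minimal
sketch in C. It is only an illustration, not code from coreutils or gnulib:
the names decode_utf8, cmp_by_codepoint, cmp_by_byte and orders_agree are
made up for this example, and the decoder assumes its input is valid UTF-8
(no overlong forms, no truncated sequences). It compares a handful of sample
strings both ways and checks that the two orderings have the same sign.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one code point from valid UTF-8 at S, store it in *CP,
   and return the number of bytes consumed.  Hypothetical helper;
   it performs no validation.  */
size_t
decode_utf8 (const unsigned char *s, uint32_t *cp)
{
  if (s[0] < 0x80)
    {
      *cp = s[0];
      return 1;
    }
  if (s[0] < 0xE0)
    {
      *cp = ((uint32_t) (s[0] & 0x1F) << 6) | (s[1] & 0x3F);
      return 2;
    }
  if (s[0] < 0xF0)
    {
      *cp = ((uint32_t) (s[0] & 0x0F) << 12)
            | ((uint32_t) (s[1] & 0x3F) << 6)
            | (s[2] & 0x3F);
      return 3;
    }
  *cp = ((uint32_t) (s[0] & 0x07) << 18)
        | ((uint32_t) (s[1] & 0x3F) << 12)
        | ((uint32_t) (s[2] & 0x3F) << 6)
        | (s[3] & 0x3F);
  return 4;
}

/* Compare two valid UTF-8 strings code point by code point.
   Returns -1, 0, or 1.  */
int
cmp_by_codepoint (const char *a, const char *b)
{
  const unsigned char *p = (const unsigned char *) a;
  const unsigned char *q = (const unsigned char *) b;
  while (*p && *q)
    {
      uint32_t cp, cq;
      p += decode_utf8 (p, &cp);
      q += decode_utf8 (q, &cq);
      if (cp != cq)
        return cp < cq ? -1 : 1;
    }
  /* The string that still has characters left sorts after the other.  */
  return (*p != 0) - (*q != 0);
}

/* Compare byte by byte (strcmp compares as unsigned char),
   normalized to -1, 0, or 1.  */
int
cmp_by_byte (const char *a, const char *b)
{
  int r = strcmp (a, b);
  return (r > 0) - (r < 0);
}

/* Return 1 if both comparisons agree on every pair of samples,
   0 otherwise.  The samples cover 1-, 2-, 3- and 4-byte sequences.  */
int
orders_agree (void)
{
  static const char *const samples[] = {
    "", "a", "ab", "z", "\x7F",
    "\xC2\x80",         /* U+0080 */
    "\xC3\xA9",         /* U+00E9 */
    "a\xC3\xA9",        /* "a" followed by U+00E9 */
    "\xDF\xBF",         /* U+07FF */
    "\xE0\xA0\x80",     /* U+0800 */
    "\xE2\x82\xAC",     /* U+20AC */
    "\xEF\xBF\xBD",     /* U+FFFD */
    "\xF0\x90\x80\x80", /* U+10000 */
    "\xF0\x9F\x98\x80"  /* U+1F600 */
  };
  size_t n = sizeof samples / sizeof samples[0];
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      if (cmp_by_byte (samples[i], samples[j])
          != cmp_by_codepoint (samples[i], samples[j]))
        return 0;
  return 1;
}
```

Calling orders_agree () from a small main should return 1 for these
samples. This is a spot check on a few strings, not a proof; the general
argument is that in UTF-8 a larger code point never encodes to a
lexicographically smaller byte sequence.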