Hi Paul,

> I do have a qualm in that coreutils (and I assume others) interpret
> !hard_locale (LC_COLLATE) as meaning that the locale is unibyte and uses
> native byte comparison.

Isn't this warranted by the section "LC_COLLATE Category in the POSIX Locale" in
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html> ?

> As I recall on some platforms (macOS maybe?), the C locale uses
> UTF-8 so this interpretation isn't correct.

UTF-8 has the nice property that byte-by-byte comparison and code point by
code point comparison yield the same ordering. If the encoding were not
UTF-8 but, e.g., GB18030, I would agree that there is a problem. But there
is no C locale with GB18030 encoding on any platform.
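
To illustrate the difference, here is a small Python sketch (not taken from
coreutils or gnulib; the helper names `sign` and `same_order` are made up for
this example). It compares two strings once by code point and once by their
encoded bytes, and reports whether both comparisons agree. It relies on
Python comparing `str` values by code point and `bytes` values by unsigned
byte value.

```python
def sign(x, y):
    """Return -1, 0, or 1 depending on how x compares to y."""
    return (x > y) - (x < y)


def same_order(a: str, b: str, encoding: str) -> bool:
    """Check whether code point order and encoded-byte order agree.

    Hypothetical helper for illustration: Python compares str values
    code point by code point and bytes values byte by byte.
    """
    return sign(a, b) == sign(a.encode(encoding), b.encode(encoding))


if __name__ == "__main__":
    # U+4E00 sorts before U+554A by code point.
    a, b = "\u4e00", "\u554a"
    # UTF-8 bytes E4 B8 80 vs E5 95 8A keep that order.
    print("utf-8  :", same_order(a, b, "utf-8"))
    # GB18030 bytes D2 BB vs B0 A1 reverse it.
    print("gb18030:", same_order(a, b, "gb18030"))
```

For this one pair, the two orders agree under UTF-8 and disagree under
GB18030, which is the kind of mismatch that would matter if a C locale ever
used such an encoding.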

Bruno

