On Sun, Feb 04, 2024 at 12:07:17PM +0100, Andreas Schwab wrote:
> On Feb 04 2024, Eli Zaretskii wrote:
>
> > If we want collation which uses only codepoints, disregarding any
> > collation weights defined by the Unicode TR10, we could use
> > en_US.utf-8, but then, as Gavin says, using glibc collation function
> > you get more than you asked, because weights are not ignored.  So we
> > need to use something else in the C variant of collation code, AFAIU.
>
> If you want pure codepoint order, you can just use strcmp on the UTF-8
> encoded strings.

We already do that if USE_UNICODE_COLLATION is set to 0.  We also do it
as a fallback in Perl if Unicode::Collate is not found, and as a
fallback in C if strxfrm_l or newlocale is not present, or if the call
to newlocale (LC_COLLATE_MASK, "en_US.utf-8", 0) fails.

-- 
Pat
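
For illustration, the C fallback described above could be sketched
roughly as follows.  This is a minimal sketch, not the project's actual
code: the names open_collation_locale and compare_utf8 are hypothetical,
and it assumes a POSIX.1-2008 system where newlocale and strxfrm_l are
declared in <locale.h> and <string.h> (some systems need another
header).  When no locale object is available, it falls back to strcmp,
which orders UTF-8 strings by codepoint.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: try to get a collation locale.  Returns
   (locale_t) 0 if the locale cannot be created.  */
locale_t
open_collation_locale (void)
{
  return newlocale (LC_COLLATE_MASK, "en_US.utf-8", (locale_t) 0);
}

/* Hypothetical helper: compare two UTF-8 strings.  With a valid locale
   object, compare the strxfrm_l sort keys; otherwise fall back to
   strcmp, which gives pure codepoint order on UTF-8.  */
int
compare_utf8 (const char *a, const char *b, locale_t loc)
{
  assert (a != NULL && b != NULL);

  if (loc == (locale_t) 0)
    return strcmp (a, b);

  /* A first call with size 0 returns the length needed for the key.  */
  size_t na = strxfrm_l (NULL, a, 0, loc) + 1;
  size_t nb = strxfrm_l (NULL, b, 0, loc) + 1;
  char *ka = malloc (na);
  char *kb = malloc (nb);
  int result;

  if (ka == NULL || kb == NULL)
    result = strcmp (a, b);     /* out of memory: codepoint order */
  else
    {
      strxfrm_l (ka, a, na, loc);
      strxfrm_l (kb, b, nb, loc);
      result = strcmp (ka, kb);
    }

  free (ka);
  free (kb);
  return result;
}
```

A caller would obtain the locale once with open_collation_locale (),
pass the result to compare_utf8 () whether or not it is (locale_t) 0,
and call freelocale () on it at the end if it is valid.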