Bruno Haible via Gnulib discussion list <[email protected]> writes:

> [Redirecting to bug-gnulib. This is a question about 'mcel' from gnulib.]

Yes, but I was also thinking that maybe Coreutils wanted to handle
normalization.

> Collin Funk wrote in
> <https://lists.gnu.org/archive/html/coreutils/2025-08/msg00066.html>:
>> I noticed that mcel does not see the following characters as equal in a
>> UTF-8 locale:
>>
>>    è (U+0065 + U+0300)
>>    è (U+00E8)
>>
>> This is because mcel_isbasic (U+0065) sees an ASCII character and does
>> not normalize it using the following U+0300.
>>
>> Is this intentional or not?
>
> Yes, it is intentional. The reason is that Unicode text that is exchanged
> between programs is supposed to be in NFC normalization form [1].

Thanks, that makes life much easier.
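
For anyone following along, here is a minimal sketch of the difference
being discussed. It uses Python's standard unicodedata module purely as
a stand-in; it is not how mcel or Gnulib do anything, and the helper
name nfc_equal is made up for illustration.

```python
import unicodedata

# The two spellings from the original report.
decomposed = "e\u0300"   # U+0065 followed by U+0300 COMBINING GRAVE ACCENT
precomposed = "\u00e8"   # U+00E8 LATIN SMALL LETTER E WITH GRAVE


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after converting both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain code-point comparison treats the two spellings as different.
print(decomposed == precomposed)           # False
# After NFC normalization they compare equal.
print(nfc_equal(decomposed, precomposed))  # True
```

If input is already in NFC, as Bruno describes, the plain comparison is
enough and the normalization step can be skipped.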

> For many years, one exception to this rule were file names on macOS HFS+
> file systems, which are in NFD. This caused lots of trouble with non-ASCII
> file names on macOS. But fortunately, Apple has phased out HFS+.

I have never used HFS+, but it looks like it has other problems as well,
such as the lack of nanosecond timestamps and, more importantly, no
support for dates past 2040 [1]. So I don't think we need to worry about
that file system.
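
As a rough check of the 2040 figure, here is a small sketch. It assumes
HFS+ stores timestamps as unsigned 32-bit counts of seconds since
1904-01-01 00:00:00 GMT; the names HFS_EPOCH and hfs_plus_max_date are
made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: HFS+ dates are unsigned 32-bit seconds since this epoch.
HFS_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)
MAX_SECONDS = 2**32 - 1  # largest value an unsigned 32-bit field can hold


def hfs_plus_max_date() -> datetime:
    """Return the latest date representable under the assumption above."""
    return HFS_EPOCH + timedelta(seconds=MAX_SECONDS)


print(hfs_plus_max_date().isoformat())  # 2040-02-06T06:28:15+00:00
```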

Collin

[1] https://en.wikipedia.org/wiki/HFS_Plus#Criticisms
