Bruno Haible via Gnulib discussion list <[email protected]> writes:

> [Redirecting to bug-gnulib. This is a question about 'mcel' from gnulib.]

Yes, but I was also thinking that maybe Coreutils wanted to handle
normalization.

> Collin Funk wrote in
> <https://lists.gnu.org/archive/html/coreutils/2025-08/msg00066.html>:
>> I noticed that mcel does not see the following characters as equal in a
>> UTF-8 locale:
>>
>> è (U+0065 + U+0300)
>> è (U+00E8)
>>
>> This is because mcel_isbasic (U+0065) sees an ASCII character and does
>> not normalize it using the following U+0300.
>>
>> Is this intentional or not?
>
> Yes, it is intentional. The reason is that Unicode text that is exchanged
> between programs is supposed to be in NFC normalization form [1].

Thanks, that makes life much easier.

> For many years, one exception to this rule were file names on macOS HFS+
> file systems, which are in NFD. This caused lots of trouble with non-ASCII
> file names on macOS. But fortunately, Apple has phased out HFS+.

I have never used HFS+, but it looks like it has other problems as well,
such as the lack of nanosecond timestamps and, more importantly, no
support for dates past 2040 [1]. So I don't think we need to worry about
that file system.

Collin

[1] https://en.wikipedia.org/wiki/HFS_Plus#Criticisms
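
P.S. For anyone following along, here is a minimal sketch of the two
spellings discussed above. It uses Python's standard unicodedata module
rather than mcel or anything from gnulib, and the helper name
equal_after_nfc is made up for illustration. It only shows why a
code-point-wise comparison sees the strings as different, and that they
compare equal once both are in NFC.

```python
import unicodedata

# The two spellings from the original report.
decomposed = "e\u0300"   # U+0065 followed by U+0300 (NFD form)
precomposed = "\u00e8"   # U+00E8 (NFC form)

def equal_after_nfc(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Compared code point by code point, the strings differ.
print(decomposed == precomposed)

# After normalizing both to NFC, they are the same string.
print(equal_after_nfc(decomposed, precomposed))

# NFD of the precomposed form gives back the two-code-point sequence.
print(unicodedata.normalize("NFD", precomposed) == decomposed)
```

If input is already in NFC, as it is supposed to be, the first comparison
is all a program needs, which is the point made in the quoted reply.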
