[Redirecting to bug-gnulib. This is a question about 'mcel' from gnulib.]

Collin Funk wrote in
<https://lists.gnu.org/archive/html/coreutils/2025-08/msg00066.html>:
> I noticed that mcel does not see the following characters as equal in a
> UTF-8 locale:
>
>     è (U+0065 + U+0300)
>     è (U+00E8)
>
> This is because mcel_isbasic (U+0065) sees an ASCII character and does
> not normalize it using the following U+0300.
>
> Is this intentional or not?

Yes, it is intentional. The reason is that Unicode text that is exchanged
between programs is supposed to be in NFC normalization form [1].

For many years, one exception to this rule was file names on macOS HFS+
file systems, which are in NFD. This caused lots of trouble with non-ASCII
file names on macOS. But fortunately, Apple has phased out HFS+.

See also [2].

Bruno

[1] https://www.unicode.org/faq/normalization.html#2
[2] https://www.w3.org/TR/charmod-norm/
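
To illustrate the point about NFC with the two spellings from the question,
here is a minimal sketch. It uses Python's standard unicodedata module, not
gnulib or mcel, and the helper name equal_after_nfc is made up for this
example. It only shows that the two sequences differ as raw code points and
become identical once both are normalized to NFC, which is what a program
would have to do itself if it wants them to compare equal.

```python
import unicodedata

# The two spellings from the question.
decomposed = "e\u0300"   # U+0065 followed by U+0300 (combining grave accent)
precomposed = "\u00e8"   # U+00E8


def equal_after_nfc(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain code point comparison treats the two spellings as different.
raw_equal = decomposed == precomposed

# After normalizing both sides to NFC, they are the same string.
nfc_equal = equal_after_nfc(decomposed, precomposed)

print(raw_equal, nfc_equal)
```

Whether such a normalization step belongs in the program that produces the
text or in the one that compares it is a design decision; the sketch does
not take a position on that.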
