https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116872

--- Comment #4 from Michael Kenzel <michael.kenzel at gmail dot com> ---
Quoting [lex.name]/1 (https://eel.is/c++draft/lex.name#1.sentence-2):

The program is ill-formed if an identifier does not conform to Normalization
Form C as specified in the Unicode Standard.

According to UAX #15 (https://unicode.org/reports/tr15/#Norm_Forms),
Normalization Form C is formed by Canonical Decomposition followed by Canonical
Composition.

According to UnicodeData.txt
(https://www.unicode.org/Public/reconstructed/1.0.1/UnicodeData.txt) U+00B5
(MICRO SIGN) decomposes to U+03BC (GREEK SMALL LETTER MU). And U+03BC does not
recompose to anything else.

So unless I'm missing anything, U+00B5 should not be allowed in an identifier.
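
One way to double-check the decomposition step is to query the Unicode
Character Database from Python's standard unicodedata module. This is only a
sketch: the helper name describe() is made up for illustration, and the results
reflect whichever Unicode version the interpreter bundles, not the 1.0.1 file
linked above. Note that the decomposition field in UnicodeData.txt marks
compatibility mappings with a tag such as <compat>, and that NFC applies only
canonical (untagged) mappings, so the tag matters for the argument.

```python
import unicodedata


def describe(ch):
    """Collect the raw decomposition field and the NFC/NFKC forms of one character.

    Hypothetical helper for illustration; not part of GCC or the C++ standard.
    """
    return {
        # Raw decomposition mapping; compatibility mappings carry a <tag> prefix.
        "decomposition": unicodedata.decomposition(ch),
        # NFC: canonical decomposition followed by canonical composition.
        "nfc": unicodedata.normalize("NFC", ch),
        # NFKC: compatibility decomposition followed by canonical composition.
        "nfkc": unicodedata.normalize("NFKC", ch),
    }


micro = describe("\u00b5")      # MICRO SIGN
angstrom = describe("\u212b")   # ANGSTROM SIGN, a canonical mapping for contrast

for name, info in (("U+00B5", micro), ("U+212B", angstrom)):
    print(name, repr(info["decomposition"]),
          [f"U+{ord(c):04X}" for c in info["nfc"]],
          [f"U+{ord(c):04X}" for c in info["nfkc"]])
```

If the decomposition field for U+00B5 comes back with a <compat> tag, the
mapping to U+03BC is a compatibility one and would affect NFKC rather than NFC.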
