https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116872
--- Comment #4 from Michael Kenzel <michael.kenzel at gmail dot com> ---
Quoting [lex.name]/1 (https://eel.is/c++draft/lex.name#1.sentence-2):

"The program is ill-formed if an identifier does not conform to Normalization Form C as specified in the Unicode Standard."

According to UAX #15 (https://unicode.org/reports/tr15/#Norm_Forms), Normalization Form C is formed by Canonical Decomposition followed by Canonical Composition. According to UnicodeData.txt (https://www.unicode.org/Public/reconstructed/1.0.1/UnicodeData.txt), U+00B5 (MICRO SIGN) decomposes to U+03BC (GREEK SMALL LETTER MU), and U+03BC does not recompose to anything else.

So unless I'm missing anything, U+00B5 should not be allowed in an identifier.
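
The decomposition step in this argument can be checked against a Unicode library. The sketch below uses Python's standard `unicodedata` module; the helper name `describe` is only illustrative, and the results reflect whatever Unicode version the Python build ships, not necessarily the one the C++ draft references. One detail to watch: the UnicodeData.txt mapping for U+00B5 carries a `<compat>` tag, and Normalization Form C applies canonical mappings only, so the check separates the NFC result from the NFKC result.

```python
import unicodedata

MICRO = "\u00b5"  # MICRO SIGN
MU = "\u03bc"     # GREEK SMALL LETTER MU


def describe(ch):
    """Report the raw decomposition field and whether ch is NFC/NFKC stable."""
    return {
        # Decomposition field from UnicodeData.txt; a leading <tag> marks
        # a compatibility mapping, no tag marks a canonical one.
        "decomposition": unicodedata.decomposition(ch),
        # True if normalizing to NFC leaves the character unchanged.
        "nfc_stable": unicodedata.normalize("NFC", ch) == ch,
        # True if normalizing to NFKC leaves the character unchanged.
        "nfkc_stable": unicodedata.normalize("NFKC", ch) == ch,
    }


info = describe(MICRO)
print(info["decomposition"])  # "<compat> 03BC"
print(info["nfc_stable"])     # True: NFC ignores compatibility mappings
print(info["nfkc_stable"])    # False: NFKC maps U+00B5 to U+03BC
```

If the first line printed had no `<compat>` tag, the mapping would be canonical and the reasoning above would go through unchanged; with the tag present, only NFKC (and NFKD) rewrite U+00B5.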