https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224
Neil Booth <neilb at protonmail dot ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |neilb at protonmail dot ch

--- Comment #38 from Neil Booth <neilb at protonmail dot ch> ---
@jsm, I'm curious about your statement:

"You need to test cases such as that if a macro is defined twice, once with a UCN in its expansion and once with the equivalent character written in UTF-8, the difference in the expansion is diagnosed (whichever of all the valid UCNs for that character is the one used)."

My reading of the standards is that a UCN names a character, and a spelling is a sequence of characters. Hence there is no difference in spelling between a UCN naming, say, an emoji and that emoji written directly in the source: the spelling of both is a single character.

This is clear in the wording of the C++ standards. For example, C++23 says "The universal-character-name construct provides a way to name other characters.", where it is referring to characters in the translation character set. The wording in the C standards is a little ambiguous, but I would be surprised if the intent were different.

After all, there is nothing to be gained by preserving source-form differences in the preprocessor or compiler. Form differences can only be distinguished when stringized, and there a UCN and the actual character are indeed the same (and IMO always were).

Clang seems to do a better job in its UCN implementation because it treats a UCN and the character in names as the same in all ways.
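
To make the stringizing point concrete, here is a minimal sketch rather than a test from the GCC testsuite. It assumes a UTF-8 source file, a UTF-8 execution character set, and a compiler that accepts UCNs in identifiers. The names STR, ucn_and_utf8_stringize_equal and check_ucn_stringize are made up for illustration. Under those assumptions I would expect the two stringized forms to compare equal on current GCC and Clang. The macro redefinition case from the quoted statement is left in a comment, since whether it should be diagnosed is the question being discussed.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stringizing helper (hypothetical name). */
#define STR(x) #x

/* The case quoted from jsm, shown only as a comment: should the second
   definition be diagnosed as an incompatible redefinition, or accepted
   because both expansions spell the same character?

   #define NAME \u00e9
   #define NAME é
*/

/* Returns 1 if stringizing the identifier written as a UCN yields the same
   bytes as stringizing it written directly in UTF-8, 0 otherwise. */
int ucn_and_utf8_stringize_equal(void)
{
    const char *from_ucn  = STR(\u00e9); /* U+00E9 named by a UCN */
    const char *from_utf8 = STR(é);      /* U+00E9 written as UTF-8 */
    return strcmp(from_ucn, from_utf8) == 0;
}

/* Assumption-laden check: expected to hold with the setup described above. */
void check_ucn_stringize(void)
{
    assert(ucn_and_utf8_stringize_equal());
}
```

Calling check_ucn_stringize() from a small main is enough to try this on a given compiler. A failing assertion would mean that compiler lets the source form leak through stringizing.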