SebastianPerta added a comment.

>> Additionally, the type of a character constant in C is int.

This means that char32_t c4 = U'\U00064321'; is invalid in C. I know that clang is stricter about the standard than GCC. However, I would like to mention that GCC does not truncate the value to 16 bits, which is how I found this problem originally. I suppose we want to stick with the standard in clang.

>> My reading of https://eel.is/c++draft/lex.ccon#2 is that a multi-char char
>> literal with a L/u8/u/U prefix is not int but the respective character types

As explained by @tahonermann, the int type applies only in C; in C++ the literals have their respective character types. I checked char8_t, char16_t and char32_t with the u8, u and U prefixes respectively, and the following line of code by @tahonermann works in all three cases:

  unsigned BitWidth = getCharWidth(Kind, PP.getTargetInfo());

Kind will be utf8_char_constant, utf16_char_constant and utf32_char_constant respectively, and since L is not supported, I think all cases are accounted for. Or am I missing something? If not, should I go ahead and put together another patch with the suggested changes?

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D127363
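
For reference, the C++ side of the point about literal types can be checked at compile time. This is only a minimal sketch, assuming a C++11 or later compiler. It does not exercise clang's getCharWidth or the patch itself. The variable name c4 is taken from the example in the comment; the char8_t check is guarded because that type only exists where the compiler defines __cpp_char8_t.

```cpp
#include <type_traits>

// In C++, a prefixed single-character literal has the matching character type.
static_assert(std::is_same<decltype(u'a'), char16_t>::value,
              "u-prefixed literal should be char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value,
              "U-prefixed literal should be char32_t");

#if defined(__cpp_char8_t)
// Only checked when the compiler provides char8_t (C++20 feature macro).
static_assert(std::is_same<decltype(u8'a'), char8_t>::value,
              "u8-prefixed literal should be char8_t");
#endif

// The literal from the discussion: the full code point should be kept,
// not truncated to 16 bits.
constexpr char32_t c4 = U'\U00064321';
static_assert(c4 == 0x64321, "value should not be truncated");
static_assert(c4 > 0xFFFF, "value should not fit in 16 bits");
```

If any of these assertions failed, the translation unit would not compile, so a clean build is the whole check.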