SebastianPerta added a comment.

>> Additionally, the type of a character constant in C is int.

This means that char32_t c4 = U'\U00064321'; is invalid in C. I know that 
clang is stricter about the standard than GCC; however, I would like to mention 
that in GCC the value is not truncated to 16 bits, which is how I found this 
problem originally. I suppose we want to stick with the standard in clang.
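
To make the truncation concrete, here is a small standalone C++ sketch (not 
clang code). The helper name truncateToWidth is made up for this illustration; 
it only shows which bits of 0x64321 survive at 16 versus 32 bits.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the low `BitWidth` bits of a code point.
constexpr std::uint32_t truncateToWidth(std::uint32_t Value, unsigned BitWidth) {
  return BitWidth >= 32 ? Value
                        : (Value & ((std::uint32_t{1} << BitWidth) - 1));
}

// At 16 bits the leading 0x6 is dropped; at 32 bits the value is preserved.
static_assert(truncateToWidth(0x64321, 16) == 0x4321,
              "upper bits are lost at 16 bits");
static_assert(truncateToWidth(0x64321, 32) == 0x64321,
              "value fits in 32 bits");

// In C++, a U-prefixed literal has type char32_t and keeps the full value.
static_assert(U'\U00064321' == 0x64321, "char32_t literal keeps all bits");
```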

>> My reading of https://eel.is/c++draft/lex.ccon#2 is that a multi-char char 
>> literal with a L/u8/u/U prefix is not int but the respective character types

As explained by @tahonermann, the type is int only in C; in C++, literals have 
their respective character types.
I checked char8_t, char16_t and char32_t with u8, u and U respectively, and the 
following line of code by @tahonermann works in all three cases:
unsigned BitWidth = getCharWidth(Kind, PP.getTargetInfo());
This is because Kind will be utf8_char_constant, utf16_char_constant and 
utf32_char_constant respectively.
And since L is not supported, I think all cases are accounted for.
Or am I missing something?
If not, should I go ahead and put together another patch with the suggested 
changes?
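
To illustrate the point about the three prefixes, here is a standalone C++ 
sketch. It is not the clang implementation: CharKind and charWidthFor are 
hypothetical stand-ins for the token kinds and for getCharWidth, and the 
8/16/32 widths are assumed for a typical target rather than queried from 
TargetInfo.

```cpp
#include <type_traits>

// Stand-in for the utf8/utf16/utf32 char constant token kinds (not clang's enum).
enum class CharKind { Utf8, Utf16, Utf32 };

// Assumed widths for a typical target; clang asks TargetInfo instead.
constexpr unsigned charWidthFor(CharKind Kind) {
  return Kind == CharKind::Utf8 ? 8 : Kind == CharKind::Utf16 ? 16 : 32;
}

// In C++, each prefixed character literal has its own character type.
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u is char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U is char32_t");
#if defined(__cpp_char8_t)
// char8_t only exists when the compiler enables it (C++20 and later).
static_assert(std::is_same<decltype(u8'a'), char8_t>::value, "u8 is char8_t");
#endif

static_assert(charWidthFor(CharKind::Utf32) == 32, "U literals get 32 bits");
```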


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127363/new/

https://reviews.llvm.org/D127363

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits