tahonermann accepted this revision.
tahonermann added a comment.

Looks good to me! Thank you for filing the separate issue.



================
Comment at: clang/test/Lexer/utf8-char-literal.cpp:23
+char f = u8'ab';            // expected-error {{Unicode character literals may not contain multiple characters}}
+char g = u8'\x80';          // expected-warning {{implicit conversion from 'int' to 'char' changes value from 128 to -128}}
 #endif
----------------
aaron.ballman wrote:
> tahonermann wrote:
> > aaron.ballman wrote:
> > > One more test I'd like to see added, just to make sure we're covering 
> > > 6.4.4.4p9 properly:
> > > ```
> > > _Static_assert(
> > >   _Generic(u8'a',
> > >            default: 0,
> > >            unsigned char : 1),
> > >   "Surprise!");  
> > > ```
> > > We expect the type of a u8 character literal to be `unsigned char` at the 
> > > moment, which is different from a u8 string literal, which uses `char`.
> > > 
> > > However, WG14 is also going to be considering 
> > > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm for C2x at our 
> > > meeting next week.
> > Good suggestion. I believe the following update will be needed 
> > to `Sema::ActOnCharacterConstant()` in `clang/lib/Sema/SemaExpr.cpp`:
> >   ...
> >   else if (Literal.isUTF8() && getLangOpts().C2x)
> >     Ty = Context.UnsignedCharTy; // u8'x' -> unsigned char in c2x.
> >   else if (Literal.isUTF8() && getLangOpts().Char8)
> >     Ty = Context.Char8Ty; // u8'x' -> char8_t when it exists.
> >   ...
> 
> I have an update on this. We discussed the paper and took a straw poll:
> ```
> Does WG14 wish to adopt N2653 in C23? 18/0/2 (consensus)
> ```
> So we should make sure that we all agree this patch is in line with the 
> changes from that paper. I believe your changes agree, but it'd be nice for 
> @tahonermann to confirm.
Confirmed. N2653 technically changes the type of `u8` character literals to 
`char8_t`, but since that is just a typedef of `unsigned char`, these changes 
still align with the semantic intent. Ideally, Clang would reflect the typedef 
in the literal's type, but 1) the typedef isn't necessarily available, 2) Clang 
doesn't do so for any of the other character (or string) literals, and 3) no 
one is likely to care anyway.
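
For anyone who wants to check the resulting type locally, here is a minimal 
sketch in C. It assumes a compiler that implements `u8` character literals 
when targeting C2x (for example via `-std=c2x`); in older language modes the 
check is skipped. The names `LITERAL_KIND`, `u8_literal_kind`, and 
`plain_literal_kind` are made up for illustration and are not part of the 
patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical classification of the type selected by _Generic. */
enum literal_kind {
  KIND_UNAVAILABLE,   /* u8 character literals not supported in this mode */
  KIND_UNSIGNED_CHAR,
  KIND_CHAR,
  KIND_INT,
  KIND_OTHER
};

/* Hypothetical helper: map the type of an expression to a literal_kind. */
#define LITERAL_KIND(x) _Generic((x),   \
    unsigned char: KIND_UNSIGNED_CHAR,  \
    char: KIND_CHAR,                    \
    int: KIND_INT,                      \
    default: KIND_OTHER)

/* Assumption: u8 character literals are only usable past C17. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L
#define HAVE_U8_CHAR_LITERAL 1
#else
#define HAVE_U8_CHAR_LITERAL 0
#endif

/* Reports the type of u8'a', or KIND_UNAVAILABLE before C2x. */
enum literal_kind u8_literal_kind(void) {
#if HAVE_U8_CHAR_LITERAL
  return LITERAL_KIND(u8'a');
#else
  return KIND_UNAVAILABLE;
#endif
}

/* For comparison: an unprefixed character constant has type int in C. */
enum literal_kind plain_literal_kind(void) {
  return LITERAL_KIND('a');
}
```

Because `char8_t` in N2653 is a typedef of `unsigned char`, the 
`unsigned char` association should be the one selected for `u8'a'` whether 
or not the typedef is visible, which matches the `_Static_assert` test 
requested above.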


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119221/new/

https://reviews.llvm.org/D119221

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
