tahonermann added a comment. This looks pretty good to me. I added a few comments. I mostly just reviewed the lexer related code; I didn't dive into the name matching code.
================ Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:130-131 def ext_delimited_escape_sequence : Extension< - "delimited escape sequences are a Clang extension">, + "%select{delimited|named}0 escape sequences are a Clang extension">, InGroup<DiagGroup<"delimited-escape-sequence-extension">>; ---------------- I don't see much value in combining these diagnostics since these are distinct features. The `ext_delimited_escape_sequence` name seems odd for named escape sequences too (even if both features use `{` and `}` as delimiters). ================ Comment at: clang/lib/Lex/Lexer.cpp:3255-3260 + if (!isAlphanumeric(C) && C != '_' && C != '-' && C != ' ') + break; + + if ((C < 'A' || C > 'Z') && !llvm::isDigit(C) && C != ' ' && C != '-') { + Invalid = true; + } ---------------- It isn't clear to me why there are two separate `if` statements here. I would expect the following to suffice. If I'm misunderstanding something, then comments might be helpful. if (!isUppercase(C) && !isDigit(C) && C != '-' && C != ' ') { Invalid = true; break; } I'm not sure why there is a test for '_' since that character is not present in the grammar for `n-char` in P2071. Is it intentional that there is no `break` statement in the second `if` statement? ================ Comment at: clang/lib/Lex/LiteralSupport.cpp:487-495 + auto Res = llvm::sys::unicode::nameToCodepointLooseMatching(Name); + if (!Res) + return false; + Diag(Diags, Features, Loc, TokBegin, TokRangeBegin, TokRangeEnd, + diag::note_invalid_ucn_name_loose_matching) + << FixItHint::CreateReplacement( + MakeCharSourceRange(Features, Loc, TokBegin, TokRangeBegin, ---------------- I like this use of loose matching to generate a fixit! ================ Comment at: clang/test/Lexer/char-escapes-delimited.c:81 + char d = '\N{}'; // expected-error {{delimited escape sequence cannot be empty}} + char e = '\N{'; // expected-error {{delimited escape sequence cannot be empty}} + ---------------- Does this test imply that the `warn_delimited_ucn_incomplete` diagnostic cannot be issued for named escape sequences? I don't see a test that exercises a missing '}' otherwise. Perhaps add a test like: char x = '\N{LOTUS'; Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D123064/new/ https://reviews.llvm.org/D123064 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits