aaron.ballman created this revision. aaron.ballman added reviewers: tahonermann, cor3ntin, clang-language-wg. Herald added a project: All. aaron.ballman requested review of this revision. Herald added a project: clang.
WG21 P1949 <https://reviews.llvm.org/P1949> and WG14 N2836 were both adopted without any deprecation period for users who were making use of now-invalid characters in their identifiers. This caused some amount of pain for people, especially folks using mathematical symbols in their identifiers, which makes it hard to upgrade to newer Clang versions with this diagnostic as an error. This patch downgrades the error to be a warning which defaults to an error. This continues to signal to users that the identifiers are in fact invalid, but it gives a grace period for people to update their code bases to the new rules (or, alternatively, time for the Unicode consortium to determine whether some of these characters should be allowed in an identifier in the immutable set). I am tentatively documenting that grace period as being until we ship Clang 18 (which should give users a year or two to migrate their code). Note, because we implemented the UAX31 rules in Clang 14, I would like to try to land this patch and backport it to Clang 15 so that we avoid having *two* releases with no easy migration path for this subset of users. I think this is reasonable because we're not altering the implementation logic at all, just modifying what diagnostic kind is emitted by default. If someone has a concern about wanting to cherry pick this so late in the cycle, please share those concerns ASAP. Fixes #54732 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D132877 Files: clang/docs/ReleaseNotes.rst clang/include/clang/Basic/DiagnosticLexKinds.td clang/test/Preprocessor/utf8-allowed-chars.c
Index: clang/test/Preprocessor/utf8-allowed-chars.c =================================================================== --- clang/test/Preprocessor/utf8-allowed-chars.c +++ clang/test/Preprocessor/utf8-allowed-chars.c @@ -1,58 +1,32 @@ -// RUN: %clang_cc1 %s -fsyntax-only -std=c99 -verify -// RUN: %clang_cc1 %s -fsyntax-only -std=c11 -Wc99-compat -verify -// RUN: %clang_cc1 %s -fsyntax-only -x c++ -std=c++03 -Wc++11-compat -verify -// RUN: %clang_cc1 %s -fsyntax-only -x c++ -std=c++11 -Wc++98-compat -verify +// RUN: %clang_cc1 %s -fsyntax-only -std=c99 -verify=expected,c99 +// RUN: %clang_cc1 %s -fsyntax-only -std=c11 -Wc99-compat -verify=expected,c11 +// RUN: %clang_cc1 %s -fsyntax-only -x c++ -std=c++03 -Wc++11-compat -verify=expected,cpp +// RUN: %clang_cc1 %s -fsyntax-only -x c++ -std=c++11 -Wc++98-compat -verify=expected,cpp +// RUN: %clang_cc1 %s -fsyntax-only -Wno-invalid-identifier-character -std=c2x -verify=off +// RUN: %clang_cc1 %s -fsyntax-only -Wno-invalid-identifier-character -x c++ -std=c++2b -verify=off // Note: This file contains Unicode characters; please do not remove them! // Identifier characters -extern char aǶ; // C11, C++11 +extern char aǶ; // c11-warning {{using this character in an identifier is incompatible with C99}} \ + c99-error {{not allowed in an identifier}} extern char aª; // C99, C11, C++11 -extern char aÎ; // C++03, C11, C++11 +extern char aÎ; // cpp-error {{not allowed in an identifier}} \ + c11-warning {{using this character in an identifier is incompatible with C99}} \ + c99-error {{not allowed in an identifier}} extern char aà¹; // C99, C++03, C11, C++11 -extern char aï¹ ; // none -extern char xÌ; // C11, C++11. Note that this does not have a composed form. - - +extern char aï¹ ; // expected-error {{not allowed in an identifier}} +// Note that this does not have a composed form. +extern char xÌ; // c11-warning {{using this character in an identifier is incompatible with C99}} \ + c99-error {{not allowed in an identifier}} // Identifier initial characters -extern char à¹; // C++03, C11, C++11 -extern char Ì; // disallowed initially in C11/C++, always in C99 - - - - -#if __cplusplus -// expected-error@11 {{not allowed in an identifier}} -// expected-error@13 {{not allowed in an identifier}} -// expected-error@20 {{character <U+0E50> not allowed at the start of an identifier}} -// expected-error@21 {{character <U+0300> not allowed at the start of an identifier}} -// expected-warning@20 {{declaration does not declare anything}} -// expected-warning@21 {{declaration does not declare anything}} - -#else -# if __STDC_VERSION__ >= 201112L -// C11 -// expected-warning@9 {{using this character in an identifier is incompatible with C99}} -// expected-warning@11 {{using this character in an identifier is incompatible with C99}} -// expected-error@13 {{not allowed in an identifier}} -// expected-warning@14 {{using this character in an identifier is incompatible with C99}} -// expected-warning@20 {{starting an identifier with this character is incompatible with C99}} -// expected-warning@21 {{declaration does not declare anything}} -// expected-error@21 {{not allowed at the start of an identifier}} - -# else -// C99 -// expected-error@9 {{not allowed in an identifier}} -// expected-error@11 {{not allowed in an identifier}} -// expected-error@13 {{not allowed in an identifier}} -// expected-error@14 {{not allowed in an identifier}} -// expected-error@20 {{character <U+0E50> not allowed at the start of an identifier}} -// expected-error@21 {{unexpected character <U+0300>}} -// expected-warning@20 {{declaration does not declare anything}} -// expected-warning@21 {{declaration does not declare anything}} - - -# endif -#endif +extern char à¹a; // cpp-error {{character <U+0E50> not allowed at the start of an identifier}} \ + c11-warning {{starting an identifier with this character is incompatible with C99}} \ + c99-error {{character <U+0E50> not allowed at the start of an identifier}} +extern char Ì; // cpp-error {{character <U+0300> not allowed at the start of an identifier}} \ + c11-error {{character <U+0300> not allowed at the start of an identifier}} \ + c99-error {{unexpected character <U+0300>}} \ + expected-warning {{declaration does not declare anything}} \ + off-warning {{declaration does not declare anything}} Index: clang/include/clang/Basic/DiagnosticLexKinds.td =================================================================== --- clang/include/clang/Basic/DiagnosticLexKinds.td +++ clang/include/clang/Basic/DiagnosticLexKinds.td @@ -121,8 +121,13 @@ "invalid UTF-8 in comment">, InGroup<DiagGroup<"invalid-utf8">>; def err_character_not_allowed : Error< "unexpected character <U+%0>">; -def err_character_not_allowed_identifier : Error< - "character <U+%0> not allowed %select{in|at the start of}1 an identifier">; +// This warning defaults to an error because there was no deprecation period +// in either C or C++ for invalid UAX31 characters; we'll let users downgrade +// this error for a few releases, but we anticipate turning this back into a +// hard error by Clang 18. +def err_character_not_allowed_identifier : Warning< + "character <U+%0> not allowed %select{in|at the start of}1 an identifier">, + InGroup<DiagGroup<"invalid-identifier-character">>, DefaultError; def ext_unicode_whitespace : ExtWarn< "treating Unicode character as whitespace">, InGroup<DiagGroup<"unicode-whitespace">>; Index: clang/docs/ReleaseNotes.rst =================================================================== --- clang/docs/ReleaseNotes.rst +++ clang/docs/ReleaseNotes.rst @@ -116,6 +116,16 @@ - Correctly diagnose a future keyword if it exist as a keyword in the higher language version and specifies in which version it will be a keyword. This supports both c and c++ language. +- Added the ``-Winvalid-identifier-character`` warning group to identify + identifier characters which are invalid according to UAX31 but were + previously allowed. This warning defaults to an error because these + characters are invalid according to both the C and C++ language + specifications, but because there was no deprecation period for those + characters, we allow users to downgrade the error back into a warning with + ``-Wno-error=invalid-identifier-character``. It is possible to disable the + warning entirely with ``-Wno-invalid-identifier-character`` but this is not + advised because we intend to turn the warning back into a hard error in Clang + 18 after giving users a chance to update their code. Non-comprehensive list of changes in this release -------------------------------------------------
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits