https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100977
--- Comment #4 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
On Wed, 4 Aug 2021, jakub at gcc dot gnu.org via Gcc-bugs wrote:

> plus various changes in the check_nfc function.
> So, the first question is if the C11/N11/C99 etc. stuff should use Unicode 4.1
> (or what was used when it was generated) tables and only CXX20/NXX20 should use
> Unicode 13.0 tables (what about NFC/NKC?), or if it is ok to just regenerate
> everything using Unicode 13.0 files, add parsing of the
> DerivedCoreProperties.txt file too (and pick XID_Start and XID_Continue
> properties there, throw away everything < 0x80 and otherwise compute CXX20 flag
> as XID_Continue and NXX20 flag as XID_Continue \ XID_Start.

I think it's fine for the normalization tests for older standard versions 
to use the latest Unicode version, so changing each time we update from 
newer Unicode data (as per 
<https://gcc.gnu.org/legacy-ml/gcc-patches/2013-11/msg01901.html> I used 
Unicode 6.3.0 at that time).

A trickier question is whether the XID_Start and XID_Continue sets of 
characters used for C++23 are meant to be fixed to a particular Unicode 
version (possibly updated for future C++ versions) or whether the set used 
for C++23 is meant to be updated for each future Unicode release as it 
comes out.

(Note also that identifiers not in NFC become ill-formed, i.e. 
-Wnormalized=nfc needs to be a pedwarn for C++23.)
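
For illustration only, the flag computation proposed in the quoted text might look roughly like the sketch below. It is not the actual GCC table generator: the function names `parse_property` and `compute_flags` are hypothetical, and `SAMPLE` is a tiny hand-written excerpt in the DerivedCoreProperties.txt line format rather than the real Unicode 13.0 file. The rule itself (drop everything < 0x80, CXX20 = XID_Continue, NXX20 = XID_Continue \ XID_Start) is taken from the quote.

```python
import re

# One data line looks like "00C0..00D6    ; XID_Start # comment".
LINE_RE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)")


def parse_property(lines, prop):
    """Collect all code points that carry the derived property `prop`."""
    cps = set()
    for line in lines:
        line = line.split("#", 1)[0]  # strip trailing comment
        m = LINE_RE.match(line)
        if not m or m.group(3) != prop:
            continue
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        cps.update(range(first, last + 1))
    return cps


def compute_flags(lines):
    """Map each code point >= 0x80 to its set of flag names.

    CXX20: the character is in XID_Continue.
    NXX20: the character is in XID_Continue but not in XID_Start.
    """
    xid_start = parse_property(lines, "XID_Start")
    xid_continue = parse_property(lines, "XID_Continue")
    flags = {}
    for cp in xid_continue:
        if cp < 0x80:  # throw away everything < 0x80
            continue
        names = {"CXX20"}
        if cp not in xid_start:
            names.add("NXX20")
        flags[cp] = names
    return flags


# Hand-written excerpt for demonstration, not the full data file.
SAMPLE = """\
0041..005A    ; XID_Start # ASCII capitals, discarded below
00C0..00D6    ; XID_Start
0041..005A    ; XID_Continue
00B7          ; XID_Continue
00C0..00D6    ; XID_Continue
0300..036F    ; XID_Continue
""".splitlines()

flags = compute_flags(SAMPLE)
```

With this sample, U+00C0 ends up with CXX20 only, while U+00B7 and the combining marks at U+0300..U+036F get both CXX20 and NXX20. Whether the input file is pinned to one Unicode version or regenerated for each release is exactly the open question above; the sketch takes no position on it.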
