[Bug c++/100977] [C++23] Implement C++ Identifier Syntax using Unicode Standard Annex 31

joseph at codesourcery dot com via Gcc-bugs Wed, 04 Aug 2021 11:35:09 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100977

--- Comment #4 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
On Wed, 4 Aug 2021, jakub at gcc dot gnu.org via Gcc-bugs wrote:

> plus various changes in the check_nfc function.
> So, the first question is if the C11/N11/C99 etc. stuff should use Unicode 4.1
> (or what was used when it was generated) tables and only CXX20/NXX20 should
> use Unicode 13.0 tables (what about NFC/NKC?), or if it is ok to just
> regenerate everything using Unicode 13.0 files, add parsing of the
> DerivedCoreProperties.txt file too (and pick XID_Start and XID_Continue
> properties there, throw away everything < 0x80 and otherwise compute CXX20
> flag as XID_Continue and NXX20 flag as XID_Continue \ XID_Start.
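
The flag computation proposed in the quoted text can be sketched in Python as
follows. This is only an illustration under stated assumptions, not GCC's
actual table generator: the function names parse_xid_properties and
compute_flags are hypothetical, and the sample lines are hand-written in the
"range ; property # comment" layout that DerivedCoreProperties.txt uses.

```python
def parse_xid_properties(lines):
    """Collect XID_Start and XID_Continue code points from lines in the
    DerivedCoreProperties.txt layout (hypothetical helper)."""
    props = {"XID_Start": set(), "XID_Continue": set()}
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] not in props:
            continue  # some other property; not needed here
        first, _, last = fields[0].partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo  # single code point or range
        props[fields[1]].update(range(lo, hi + 1))
    return props


def compute_flags(props):
    """Map each code point >= 0x80 to its flags: CXX20 for XID_Continue,
    plus NXX20 for XID_Continue \\ XID_Start (hypothetical helper)."""
    flags = {}
    for cp in props["XID_Continue"]:
        if cp < 0x80:
            continue  # throw away everything below 0x80
        names = {"CXX20"}
        if cp not in props["XID_Start"]:
            names.add("NXX20")
        flags[cp] = names
    return flags


# Hand-written sample lines in the file's layout (an assumption, not a
# verbatim excerpt of any particular Unicode release).
sample = [
    "0041..005A    ; XID_Start # LATIN CAPITAL LETTER A..Z",
    "00C0..00D6    ; XID_Start # accented Latin capitals",
    "0030..0039    ; XID_Continue # DIGIT ZERO..DIGIT NINE",
    "0041..005A    ; XID_Continue # LATIN CAPITAL LETTER A..Z",
    "00C0..00D6    ; XID_Continue # accented Latin capitals",
    "0300..036F    ; XID_Continue # combining diacritical marks",
]

flags = compute_flags(parse_xid_properties(sample))
```

With this sample, U+00C0 would carry only CXX20 (it may start an identifier),
while U+0300 would carry both CXX20 and NXX20 (it may continue an identifier
but not start one).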

I think it's fine for the normalization tests for older standard versions 
to use the latest Unicode version, and so to change each time we regenerate 
the tables from newer Unicode data (as per 
<https://gcc.gnu.org/legacy-ml/gcc-patches/2013-11/msg01901.html>, I used 
Unicode 6.3.0 at that time).

A trickier question is whether the XID_Start and XID_Continue sets of 
characters used for C++23 are meant to be fixed to a particular Unicode 
version (possibly updated for future C++ versions) or whether the set used 
for C++23 is meant to be updated for each future Unicode release as it 
comes out.

(Note also that identifiers not in NFC become ill-formed, i.e. 
-Wnormalized=nfc needs to be a pedwarn for C++23.)
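
To illustrate what "not in NFC" means for an identifier, here is a small
Python sketch. It relies on Python's own unicodedata module (and whatever
Unicode version that module bundles), not on GCC's check_nfc; the helper name
is_nfc_identifier is hypothetical.

```python
import unicodedata


def is_nfc_identifier(name):
    """Return True if `name` is already in Normalization Form C
    (hypothetical helper; requires Python 3.8+ for is_normalized)."""
    return unicodedata.is_normalized("NFC", name)


# Two spellings that render identically but differ in code points.
composed = "caf\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT

composed_ok = is_nfc_identifier(composed)
decomposed_ok = is_nfc_identifier(decomposed)
renormalized = unicodedata.normalize("NFC", decomposed)
```

Only the composed spelling is in NFC; the decomposed one is the kind of
identifier that the note above says becomes ill-formed in C++23.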
