https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117162
Bug ID: 117162
Summary: Universal character names designating members of the
basic character set or control characters should be
allowed in string literals or character constants
(C23)
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: luigighiron at gmail dot com
Target Milestone: ---
> A universal character name shall not specify a character whose short
> identifier is less than 00A0 other than 0024 ($), 0040 (@), or 0060 (`), nor
> one in the range D800 through DFFF inclusive.
Section 6.4.3 "Universal character names", paragraph 2, ISO/IEC 9899:2018
In C23, this text was updated to:
> A universal character name shall not designate a code point where the
> hexadecimal value is:
>
> - in the range D800 through DFFF inclusive; or
> - greater than 10FFFF.
>
> A universal character name outside the c-char sequence of a character
> constant, or the s-char sequence of a string literal shall not designate a
> control character or a character in the basic character set.
Section 6.4.3 "Universal character names", paragraph 2, N3220
For example, the following program should be valid in C23:
int main(){
    "\u009F";
}
Clang accepts this in C23 mode.