https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117162
Bug ID: 117162
Summary: Universal character names designating members of the basic character set or control characters should be allowed in string literals or character constants (C23)
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: luigighiron at gmail dot com
Target Milestone: ---

> A universal character name shall not specify a character whose short
> identifier is less than 00A0 other than 0024 ($), 0040 (@), or 0060 (`), nor
> one in the range D800 through DFFF inclusive.
Section 6.4.3 "Universal character names" Paragraph 2 ISO/IEC 9899:2018

In C23, this text was updated to:

> A universal character name shall not designate a code point where the
> hexadecimal value is:
>
> - in the range D800 through DFFF inclusive; or
> - greater than 10FFFF.
>
> A universal character name outside the c-char sequence of a character
> constant, or the s-char sequence of a string literal shall not designate a
> control character or a character in the basic character set.
Section 6.4.3 "Universal character names" Paragraph 2 N3220

For example, the following program should be valid in C23:

int main(){
    "\u009F";
}

Clang accepts this in C23 mode.
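
To make the difference between the two quoted wordings concrete, here is a rough sketch of both constraints as predicates over a code point. The function names are hypothetical and are not taken from GCC or Clang. The sketch also assumes that "control character or a character in the basic character set" covers exactly the code points below 00A0 (C0 controls, the printable ASCII range, DEL, and C1 controls); that reading is an assumption, not something the quoted text states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not GCC or Clang code. */

/* Models only the C17 sentence quoted above: nothing below 00A0 except
   $, @ and `, and no surrogates. The context (inside or outside a
   literal) does not matter in this wording. */
bool ucn_valid_c17(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return false;
    if (cp < 0xA0)
        return cp == 0x24 || cp == 0x40 || cp == 0x60;
    return true;
}

/* Models the C23 wording quoted above. in_literal is true when the UCN
   appears in the c-char sequence of a character constant or the s-char
   sequence of a string literal. */
bool ucn_valid_c23(uint32_t cp, bool in_literal)
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return false;
    if (cp > 0x10FFFF)
        return false;
    /* ASSUMPTION: every code point below 00A0 is either a control
       character or a member of the basic character set. */
    if (!in_literal && cp < 0xA0)
        return false;
    return true;
}
```

Under these assumptions, ucn_valid_c17(0x9F) is false while ucn_valid_c23(0x9F, true) is true, which matches the example program: \u009F inside a string literal is rejected by the old wording and permitted by the new one.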