Re: [PATCH 2/9]: C++ P0482R5 char8_t: Core language support
On 12/3/18 5:01 PM, Jason Merrill wrote:
On 12/3/18 4:51 PM, Jason Merrill wrote:
On 11/5/18 2:39 PM, Tom Honermann wrote:

This patch adds support for the P0482R5 core language changes. This includes:
- The -fchar8_t and -fno-char8_t command line options.
- char8_t as a keyword.
- The char8_t builtin type as a non-aliasing unsigned integral character type of size 1.
- Use of char8_t as a simple type specifier.
- u8 character literals with type char8_t.
- u8 string literals with type array of const char8_t.
- User defined literal operators that accept char8_t and char8_t pointer types.
- New __cpp_char8_t predefined feature test macro.
- New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macros.
- Name mangling and demangling for char8_t (using Du).

gcc/ChangeLog:

2018-11-04  Tom Honermann

        * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann

        * c-family/c-common.c (c_common_reswords): Add char8_t.
        (fix_string_type): Use char8_t for the type of u8 string literals.
        (c_common_get_alias_set): char8_t doesn't alias.
        (c_common_nodes_and_builtins): Define char8_t as a builtin type in C++.
        (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
        (keyword_begins_type_specifier): Add RID_CHAR8.
        * gcc/c-family/c-common.h (rid): Add RID_CHAR8.
        (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
        Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
        Define char8_type_node and char8_array_type_node.
        * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
        (c_cpp_builtins): Predefine __cpp_char8_t.
        * c-family/c-lex.c (lex_string): Use char8_array_type_node as the type of CPP_UTF8STRING.
        (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
        * c-family/c.opt: Add the -fchar8_t command line option.

gcc/c/ChangeLog:

2018-11-04  Tom Honermann

        * c/c-typeck.c (char_type_p): Add char8_type_node.
        (digest_init): Handle initialization by a u8 string literal of char8_t type.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann

        * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
        * cp/decl.c (grokdeclarator): Handle invalid type specifier combinations involving char8_t.
        * cp/lex.c (init_reswords): Add char8_t as a reserved word.
        * cp/mangle.c (write_builtin_type): Add name mangling for char8_t (Du).
        * cp/parser.c (cp_keyword_starts_decl_specifier_p, cp_parser_simple_type_specifier): Recognize char8_t as a simple type specifier.
        (cp_parser_string_literal): Use char8_array_type_node for the type of CPP_UTF8STRING.
        (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system headers.
        * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
        * cp/tree.c (char_type_p): Recognize char8_t as a character type.
        * cp/typeck.c (string_conv_p): Handle conversions of u8 string literals of char8_t type.
        (check_literal_operator_args): Handle UDLs with u8 string literals of char8_t type.
        * cp/typeck2.c (digest_init_r): Disallow initializing a char array with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann

        * cp-demangle.c (cplus_demangle_builtin_types, cplus_demangle_type): Add name demangling for char8_t (Du).
        * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the new char8_t type.

@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
     return -1;
+  /* Unlike char, char8_t doesn't alias. */
+  if (flag_char8_t && t == char8_type_node)
+    return -1;

This seems unnecessary; doesn't the existing code have the same effect? I think we could do with just an adjustment to the existing comment.

I'm not sure. I had concerns about unintended matching due to char8_t having an underlying type of unsigned char.

+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+      || (flag_char8_t && type == char8_type_node)
+  bool char8_array = (flag_char8_t && !!comptypes (typ1, char8_type_node));
+      || (flag_char8_t && type == char8_type_node

In many places you check the flag and then for one of the char8 types. Since the types won't be used without the flag, checking the flag seems redundant?

This was again protection against unintended matching of the underlying unsigned char type, particularly when compiling as C. char8_type_node is constructed (in c_common_nodes_and_builtins) following the pattern in place for char16_t and char32_t with the following code:

+  char8_type_node = get_identifier (CHAR8_TYPE);
+  char8_type_node = TREE_TYPE (identifier_global_value (char8_type_node));
+  char8_type_size = TYPE_PRECISION (char8_type_node);
+  if (c_dialect_cxx ())
+    {
+      ch
Re: [PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates
On 12/3/18 2:59 PM, Jason Merrill wrote:
On 11/5/18 2:39 PM, Tom Honermann wrote:

This patch adds documentation for new -fchar8_t and -fno-char8_t options.

gcc/ChangeLog:

2018-11-04  Tom Honermann

        * doc/invoke.texi (-fchar8_t): Document new option.

+Enable support for the P0482 proposal including the addition of a
+new @code{char8_t} fundamental type, changes to the types of UTF-8

Now that the proposal has been accepted, I'd refer to C++2a instead.

Agreed. I also need to make the changes to implicitly enable -fchar8_t with -std=c++2a. The list of impacted standard library features was incomplete and I suspect it isn't worth mentioning them specifically. Perhaps mentioning the feature test macros would be helpful as well? How does the following sound?

Enable support for @code{char8_t} as adopted for C++2a. This includes the addition of a new @code{char8_t} fundamental type, changes to the types of UTF-8 string and character literals, new signatures for user defined literals, associated standard library updates, and new @code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature test macros.

Tom.

Jason
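
Since the proposed wording mentions the two feature test macros, here is a small sketch of how user code might query them. It is an illustration only, not text from the patch: the helper names `core_char8_t_version` and `lib_char8_t_version` are hypothetical, and the sketch assumes that including a standard library header such as `<string>` is enough for the library macro to be visible on implementations that define it.

```cpp
#include <string>  // a standard library header; may define __cpp_lib_char8_t

// Hypothetical helper (not from the patch): value of the core language
// feature test macro, or 0 when the macro is not defined.
constexpr long core_char8_t_version() {
#ifdef __cpp_char8_t
  return __cpp_char8_t;
#else
  return 0;
#endif
}

// Hypothetical helper (not from the patch): value of the library feature
// test macro, or 0 when the macro is not defined.
constexpr long lib_char8_t_version() {
#ifdef __cpp_lib_char8_t
  return __cpp_lib_char8_t;
#else
  return 0;
#endif
}
```

Code that must build both with and without -fchar8_t can branch on these values (or on the macros directly) rather than on the -std= level.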
Re: [PATCH 2/9]: C++ P0482R5 char8_t: Core language support
On 12/17/18 4:02 PM, Jason Merrill wrote:
On 12/5/18 11:16 AM, Jason Merrill wrote:
On 12/5/18 2:09 AM, Tom Honermann wrote:
On 12/3/18 5:01 PM, Jason Merrill wrote:
On 12/3/18 4:51 PM, Jason Merrill wrote:
On 11/5/18 2:39 PM, Tom Honermann wrote:

This patch adds support for the P0482R5 core language changes.
[...]

@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
     return -1;
+  /* Unlike char, char8_t doesn't alias. */
+  if (flag_char8_t && t == char8_type_node)
+    return -1;

This seems unnecessary; doesn't the existing code have the same effect? I think we could do with just an adjustment to the existing comment.

I'm not sure. I had concerns about unintended matching due to char8_t having an underlying type of unsigned char.

That shouldn't be a problem: if char8_t is a distinct type, it won't match unsigned char, and if it's the same as unsigned char, flag_char8_t will be false.

+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+      || (flag_char8_t && type == char8_type_node)
+  bool char8_array = (flag_char8_t && !!comptypes (typ1, char8_type_node));
+      || (flag_char8_t && type == char8_type_node

In many places you check the flag and then for one of the char8 types. Since the types won't be used without the flag, checking the flag seems redundant?

This was again protection against unintended matching of the underlying unsigned char type, particularly when compiling as C. char8_type_node is constructed (in c_common_nodes_
Re: [REVISED PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates
Attached is a revised patch that addresses feedback provided by Jason and Sandra. Changes from the prior patch include:
- Updates to the -fchar8_t option documentation as requested by Jason.
- Corrections for indentation, spacing, hyphenation, and wrapping as requested by Sandra.

Tested on x86_64-linux.

gcc/ChangeLog:

2018-11-04  Tom Honermann

        * doc/invoke.texi (-fchar8_t): Document new option.

Tom.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 57491f1033c..95374951d98 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -206,7 +206,7 @@ in the following sections.
 @item C++ Language Options
 @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
 @gccoptlist{-fabi-version=@var{n} -fno-access-control @gol
--faligned-new=@var{n} -fargs-in-order=@var{n} -fcheck-new @gol
+-faligned-new=@var{n} -fargs-in-order=@var{n} -fchar8_t -fcheck-new @gol
 -fconstexpr-depth=@var{n} -fconstexpr-loop-limit=@var{n} @gol
 -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
@@ -2432,6 +2432,60 @@ but few users will need to override the default of
 This flag is enabled by default for @option{-std=c++17}.
 
+@item -fchar8_t
+@itemx -fno-char8_t
+@opindex fchar8_t
+@opindex fno-char8_t
+Enable support for @code{char8_t} as adopted for C++2a. This includes
+the addition of a new @code{char8_t} fundamental type, changes to the
+types of UTF-8 string and character literals, new signatures for
+user-defined literals, associated standard library updates, and new
+@code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature test macros.
+
+This option enables functions to be overloaded for ordinary and UTF-8
+strings:
+
+@smallexample
+int f(const char *);    // #1
+int f(const char8_t *); // #2
+int v1 = f("text");     // Calls #1
+int v2 = f(u8"text");   // Calls #2
+@end smallexample
+
+@noindent
+and introduces new signatures for user-defined literals:
+
+@smallexample
+int operator""_udl1(char8_t);
+int v3 = u8'x'_udl1;
+int operator""_udl2(const char8_t*, std::size_t);
+int v4 = u8"text"_udl2;
+template<typename T, T...> int operator""_udl3();
+int v5 = u8"text"_udl3;
+@end smallexample
+
+@noindent
+The change to the types of UTF-8 string and character literals introduces
+incompatibilities with ISO C++11 and later standards. For example, the
+following code is well-formed under ISO C++11, but is ill-formed when
+@option{-fchar8_t} is specified.
+
+@smallexample
+char ca[] = u8"xx";      // error: char-array initialized from wide
+                         //        string
+const char *cp = u8"xx"; // error: invalid conversion from
+                         //        `const char8_t*' to `const char*'
+int f(const char*);
+auto v = f(u8"xx");      // error: invalid conversion from
+                         //        `const char8_t*' to `const char*'
+std::string s@{u8"xx"@};  // error: no matching function for call to
+                         //        `std::basic_string<char>::basic_string()'
+using namespace std::literals;
+s = u8"xx"s;             // error: conversion from
+                         //        `basic_string<char8_t>' to non-scalar
+                         //        type `basic_string<char>' requested
+@end smallexample
+
 @item -fcheck-new
 @opindex fcheck-new
 Check that the pointer returned by @code{operator new} is non-null
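
The overloading example in the documentation hunk above can be turned into something compilable in both modes. What follows is a sketch under stated assumptions rather than code from the patch: the function bodies and the wrappers `call_ordinary` and `call_utf8` are hypothetical, and overload #2 is guarded by `__cpp_char8_t` so the same source should build with either -fchar8_t or -fno-char8_t.

```cpp
inline int f(const char *) { return 1; }     // #1

#ifdef __cpp_char8_t
// #2 can only be declared when the char8_t type exists.
inline int f(const char8_t *) { return 2; }  // #2
#endif

// Hypothetical wrappers (not from the patch) that record which overload
// each kind of literal selects.
inline int call_ordinary() { return f("text"); }
inline int call_utf8() { return f(u8"text"); }
```

With char8_t support enabled, `call_utf8()` selects #2; without it, the u8 literal has type array of const char and #1 is selected, which is the behavior change the documentation warns about.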
Re: [REVISED PATCH 2/9]: C++ P0482R5 char8_t: Core language support
Attached is a revised patch that addresses changes in P0482R6 as well as feedback provided by Jason. Changes from the prior patch include:
- Updated the value of the __cpp_char8_t feature test macro to 201811 per P0482R6.
- Enable char8_t support with -std=c++2a per adoption of P0482R6 in San Diego.
- Reverted the unnecessary changes to gcc/gcc/c/c-typeck.c as requested by Jason.
- Removed unnecessary checks of 'flag_char8_t' within the C++ front end as requested by Jason.
- Corrected the regression spotted by Jason regarding initialization of signed char and unsigned char arrays with string literals.
- Made minor changes to the error message emitted for ill-formed initialization of char arrays with UTF-8 string literals. These changes do not yet implement Jason's suggestion; I'll follow up with a separate patch for that due to additional test impact.

Tested on x86_64-linux.

gcc/ChangeLog:

2018-11-04  Tom Honermann

        * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann

        * c-family/c-common.c (c_common_reswords): Add char8_t.
        (fix_string_type): Use char8_t for the type of u8 string literals.
        (c_common_get_alias_set): char8_t doesn't alias.
        (c_common_nodes_and_builtins): Define char8_t as a builtin type in C++.
        (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
        (keyword_begins_type_specifier): Add RID_CHAR8.
        * c-family/c-common.h (rid): Add RID_CHAR8.
        (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
        Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
        Define char8_type_node and char8_array_type_node.
        * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
        (c_cpp_builtins): Predefine __cpp_char8_t.
        * c-family/c-lex.c (lex_string): Use char8_array_type_node as the type of CPP_UTF8STRING.
        (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
        * c-family/c-opts.c: If not otherwise specified, enable -fchar8_t when targeting C++2a.
        * c-family/c.opt: Add the -fchar8_t command line option.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann

        * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
        * cp/decl.c (grokdeclarator): Handle invalid type specifier combinations involving char8_t.
        * cp/lex.c (init_reswords): Add char8_t as a reserved word.
        * cp/mangle.c (write_builtin_type): Add name mangling for char8_t (Du).
        * cp/parser.c (cp_keyword_starts_decl_specifier_p, cp_parser_simple_type_specifier): Recognize char8_t as a simple type specifier.
        (cp_parser_string_literal): Use char8_array_type_node for the type of CPP_UTF8STRING.
        (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system headers.
        * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
        * cp/tree.c (char_type_p): Recognize char8_t as a character type.
        * cp/typeck.c (string_conv_p): Handle conversions of u8 string literals of char8_t type.
        (check_literal_operator_args): Handle UDLs with u8 string literals of char8_t type.
        * cp/typeck2.c (digest_init_r): Disallow initializing a char array with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann

        * cp-demangle.c (cplus_demangle_builtin_types, cplus_demangle_type): Add name demangling for char8_t (Du).
        * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the new char8_t type.

Tom.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index f10cf89c3a7..b387daca137 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -79,6 +79,7 @@ machine_mode c_default_pointer_mode = VOIDmode;
 	tree signed_char_type_node;
 	tree wchar_type_node;
+	tree char8_type_node;
 	tree char16_type_node;
 	tree char32_type_node;
@@ -128,6 +129,11 @@ machine_mode c_default_pointer_mode = VOIDmode;
 	tree wchar_array_type_node;
+	Type `char8_t[SOMENUMBER]' or something like it.
+	Used when a UTF-8 string literal is created.
+
+	tree char8_array_type_node;
+
 	Type `char16_t[SOMENUMBER]' or something like it.
 	Used when a UTF-16 string literal is created.
@@ -450,6 +456,7 @@ const struct c_common_resword c_common_reswords[] =
   { "case", RID_CASE, 0 },
   { "catch", RID_CATCH, D_CXX_OBJC | D_CXXWARN },
   { "char", RID_CHAR, 0 },
+  { "char8_t", RID_CHAR8, D_CXX_CHAR8_T_FLAGS | D_CXXWARN },
   { "char16_t", RID_CHAR16, D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "char32_t", RID_CHAR32, D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "class", RID_CLASS, D_CXX_OBJC | D_CXXWARN },
@@ -746,6 +753,11 @@ fix_string_type (tree value)
       nchars = length;
       e_type = char_type_node;
     }
+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+    {
+      nchars = length / (TYPE_P
Re: [REVISED PATCH 3/9]: C++ P0482R5 char8_t: New core language tests
Attached is a revised patch that addresses changes in P0482R6 as well as feedback provided by Jason for patch 2/9. Changes from the prior patch include:
- New tests to ensure -fchar8_t is implicitly enabled when targeting C++2a per adoption of P0482R6 in San Diego.
  - gcc/testsuite/g++.dg/cpp2a/char8_t1.C
  - gcc/testsuite/g++.dg/cpp2a/char8_t2.C
- Updated the value of the __cpp_char8_t feature test macro to 201811 per P0482R6.
- Updated tests to exercise initialization of signed char and unsigned char arrays with ordinary and UTF-8 string literals.
  - gcc/testsuite/g++.dg/ext/char8_t-init-1.C
  - gcc/testsuite/g++.dg/ext/char8_t-init-2.C

Tested on x86_64-linux.

gcc/testsuite/ChangeLog:

2018-11-04  Tom Honermann

        * g++.dg/cpp0x/udlit-implicit-conv-neg-char8_t.C: New test cloned from udlit-implicit-conv-neg.C. Validates handling of ill-formed uses of char8_t based user defined literals.
        * g++.dg/cpp0x/udlit-resolve-char8_t.C: New test cloned from udlit-resolve.C. Validates handling of well-formed uses of char8_t based user defined literals.
        * g++.dg/cpp2a/char8_t1.C: New test; validates char8_t support is implicitly enabled when targeting C++2a.
        * g++.dg/cpp2a/char8_t2.C: New test; validates char8_t support is disabled when -fno-char8_t is specified when targeting C++2a.
        * g++.dg/ext/char8_t-aliasing-1.C: New test; validates warnings for type punning with char8_t types. Illustrates that char8_t does not alias.
        * g++.dg/ext/char8_t-char-literal-1.C: New test; validates u8 character literals have type char if char8_t support is not enabled.
        * g++.dg/ext/char8_t-char-literal-2.C: New test; validates u8 character literals have type char8_t if char8_t support is enabled.
        * g++.dg/ext/char8_t-deduction-1.C: New test; validates char is deduced for u8 character and string literals if char8_t support is not enabled.
        * g++.dg/ext/char8_t-deduction-2.C: New test; validates char8_t is deduced for u8 character and string literals if char8_t support is enabled.
        * g++.dg/ext/char8_t-feature-test-macro-1.C: New test; validates that the __cpp_char8_t feature test macro is not defined if char8_t support is not enabled.
        * g++.dg/ext/char8_t-feature-test-macro-2.C: New test; validates that the __cpp_char8_t feature test macro is defined with the correct value if char8_t support is enabled.
        * g++.dg/ext/char8_t-init-1.C: New test; validates initialization by u8 character and string literals when support for char8_t is not enabled.
        * g++.dg/ext/char8_t-init-2.C: New test; validates initialization by u8 character and string literals when support for char8_t is enabled.
        * g++.dg/ext/char8_t-keyword-1.C: New test; validates that char8_t is not a keyword if support for char8_t is not enabled.
        * g++.dg/ext/char8_t-keyword-2.C: New test; validates that char8_t is a keyword if support for char8_t is enabled.
        * g++.dg/ext/char8_t-limits-1.C: New test; validates that char8_t is unsigned and sufficiently large to store the required range of char8_t values.
        * g++.dg/ext/char8_t-overload-1.C: New test; validates overload resolution for u8 character and string literal arguments when support for char8_t is not enabled.
        * g++.dg/ext/char8_t-overload-2.C: New test; validates overload resolution for u8 character and string literal arguments when support for char8_t is enabled.
        * g++.dg/ext/char8_t-predefined-macros-1.C: New test; validates that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macros are not defined when support for char8_t is not enabled.
        * g++.dg/ext/char8_t-predefined-macros-2.C: New test; validates that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macros are defined when support for char8_t is enabled.
        * g++.dg/ext/char8_t-sizeof-1.C: New test; validates that the size of char8_t and u8 character literals is 1.
        * g++.dg/ext/char8_t-specialization-1.C: New test; validates template specialization for u8 character literal template arguments when support for char8_t is not enabled.
        * g++.dg/ext/char8_t-specialization-2.C: New test; validates template specialization for char8_t and u8 character literal template arguments when support for char8_t is enabled.
        * g++.dg/ext/char8_t-string-literal-1.C: New test; validates the type of u8 string literals when support for char8_t is not enabled.
        * g++.dg/ext/char8_t-string-literal-2.C: New test; validates the type of u8 string literals when support for char8_t is enabled.
        * g++.dg/ext/char8_t-type-specifier-1.C: New test; validates that char8_t is not recognized as a type specifier when support for char8_t is not enabled.
        * g++.dg/ext/char8_t-type-specifier-2.C: New test; validate
Re: [REVISED PATCH 4/9]: C++ P0482R5 char8_t: Updates to existing core language tests
Attached is a revised patch that addresses changes in P0482R6 and adoption of P0482R6 for C++20 in San Diego. Changes from the prior patch include:
- Updated a test to validate the value of the __cpp_char8_t feature test macro when targeting C++2a.

Tested on x86_64-linux.

gcc/testsuite/ChangeLog:

2018-11-04  Tom Honermann

        * c-c++-common/raw-string-13.c: Added test cases for u8 raw string literals.
        * c-c++-common/raw-string-15.c: Likewise.
        * g++.dg/cpp0x/constexpr-wstring2.C: Added test cases for u8 literals.
        * g++.dg/cpp2a/feat-cxx2a.C: Added test cases for the __cpp_char8_t feature test macro.
        * g++.dg/ext/utf-array-short-wchar.C: Likewise.
        * g++.dg/ext/utf-array.C: Likewise.
        * g++.dg/ext/utf-cxx98.C: Likewise.
        * g++.dg/ext/utf-dflt.C: Likewise.
        * g++.dg/ext/utf-gnuxx98.C: Likewise.
        * gcc.dg/utf-array-short-wchar.c: Likewise.
        * gcc.dg/utf-array.c: Likewise.

Tom.

diff --git a/gcc/testsuite/c-c++-common/raw-string-13.c b/gcc/testsuite/c-c++-common/raw-string-13.c
index 1b37405cee9..fa11edaa7aa 100644
--- a/gcc/testsuite/c-c++-common/raw-string-13.c
+++ b/gcc/testsuite/c-c++-common/raw-string-13.c
@@ -62,6 +62,47 @@ const char s16[] = R"??(??)??";
 const char s17[] = R"?(?)??)?";
 const char s18[] = R"??(??)??)??)??";
 
+const char u800[] = u8R"??=??()??'??!??-\
+(a)#[{}]^|~";
+)??=??";
+const char u801[] = u8R"a(
+)\
+a"
+)a";
+const char u802[] = u8R"a(
+)a\
+"
+)a";
+const char u803[] = u8R"ab(
+)a\
+b"
+)ab";
+const char u804[] = u8R"a??/(x)a??/";
+const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??";
+const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/";
+const char u807[] = u8R"abc(??)\
+abc";)abc";
+const char u808[] = u8R"def(de)\
+def";)def";
+const char u809[] = u8R"a(??)\
+a"
+)a";
+const char u810[] = u8R"a(??)a\
+"
+)a";
+const char u811[] = u8R"ab(??)a\
+b"
+)ab";
+const char u812[] = u8R"a#(a#)a??=)a#";
+const char u813[] = u8R"a#(??)a??=??)a#";
+const char u814[] = u8R"??/(x)??/
+";)??/";
+const char u815[] = u8R"??/(??)??/
+";)??/";
+const char u816[] = u8R"??(??)??";
+const char u817[] = u8R"?(?)??)?";
+const char u818[] = u8R"??(??)??)??)??";
+
 const char16_t u00[] = uR"??=??()??'??!??-\
 (a)#[{}]^|~";
 )??=??";
@@ -211,6 +252,25 @@ main (void)
   TEST (s16, "??");
   TEST (s17, "?)??");
   TEST (s18, "??"")??"")??");
+  TEST (u800, u8"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n");
+  TEST (u801, u8"\n)\\\na\"\n");
+  TEST (u802, u8"\n)a\\\n\"\n");
+  TEST (u803, u8"\n)a\\\nb\"\n");
+  TEST (u804, u8"x");
+  TEST (u805, u8"abc");
+  TEST (u806, u8"abc");
+  TEST (u807, u8"??"")\\\nabc\";");
+  TEST (u808, u8"de)\\\ndef\";");
+  TEST (u809, u8"??"")\\\na\"\n");
+  TEST (u810, u8"??"")a\\\n\"\n");
+  TEST (u811, u8"??"")a\\\nb\"\n");
+  TEST (u812, u8"a#)a??""=");
+  TEST (u813, u8"??"")a??""=??");
+  TEST (u814, u8"x)??""/\n\";");
+  TEST (u815, u8"??"")??""/\n\";");
+  TEST (u816, u8"??");
+  TEST (u817, u8"?)??");
+  TEST (u818, u8"??"")??"")??");
   TEST (u00, u"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n");
   TEST (u01, u"\n)\\\na\"\n");
   TEST (u02, u"\n)a\\\n\"\n");
diff --git a/gcc/testsuite/c-c++-common/raw-string-15.c b/gcc/testsuite/c-c++-common/raw-string-15.c
index 9dfdaabd87d..1d101dc8393 100644
--- a/gcc/testsuite/c-c++-common/raw-string-15.c
+++ b/gcc/testsuite/c-c++-common/raw-string-15.c
@@ -62,6 +62,47 @@ const char s16[] = R"??(??)??";
 const char s17[] = R"?(?)??)?";
 const char s18[] = R"??(??)??)??)??";
 
+const char u800[] = u8R"??=??()??'??!??-\
+(a)#[{}]^|~";
+)??=??";
+const char u801[] = u8R"a(
+)\
+a"
+)a";
+const char u802[] = u8R"a(
+)a\
+"
+)a";
+const char u803[] = u8R"ab(
+)a\
+b"
+)ab";
+const char u804[] = u8R"a??/(x)a??/";
+const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??";
+const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/";
+const char u807[] = u8R"abc(??)\
+abc";)abc";
+const char u808[] = u8R"def(de)\
+de
Re: [REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support
Attached is a revised patch that addresses changes in P0482R6. Changes from the prior patch include:
- Updated the value of the __cpp_char8_t feature test macro to 201811.

Tested on x86_64-linux.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann

        * name-lookup.c (get_std_name_hint): Added u8string as a name hint.

libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann

        * config/abi/pre/gnu-versioned-namespace.ver (CXXABI_2.0): Add typeinfo symbols for char8_t.
        * config/abi/pre/gnu.ver: Add CXXABI_1.3.12.
        (GLIBCXX_3.4.26): Add symbols for specializations of numeric_limits and codecvt that involve char8_t.
        (CXXABI_1.3.12): Add typeinfo symbols for char8_t.
        * include/bits/atomic_base.h: Add atomic_char8_t.
        * include/bits/basic_string.h: Add std::hash<u8string> and operator""s(const char8_t*, size_t).
        * include/bits/c++config: Define _GLIBCXX_USE_CHAR8_T and __cpp_lib_char8_t.
        * include/bits/char_traits.h: Add char_traits<char8_t>.
        * include/bits/codecvt.h: Add codecvt<char16_t, char8_t, mbstate_t>, codecvt<char32_t, char8_t, mbstate_t>, codecvt_byname<char16_t, char8_t, mbstate_t>, and codecvt_byname<char32_t, char8_t, mbstate_t>.
        * include/bits/cpp_type_traits.h: Add __is_integer<char8_t> to recognize char8_t as an integral type.
        * include/bits/fs_path.h: (path::__is_encoded_char): Recognize char8_t.
        (path::u8string): Return std::u8string when char8_t support is enabled.
        (path::generic_u8string): Likewise.
        (path::_S_convert): Handle conversion from char8_t input.
        (path::_S_str_convert): Likewise.
        * include/bits/functional_hash.h: Add hash<char8_t>.
        * include/bits/locale_conv.h (__str_codecvt_out): Add overloads for char8_t.
        * include/bits/locale_facets.h (_GLIBCXX_NUM_UNICODE_FACETS): Bump for new char8_t specializations.
        * include/bits/localefwd.h: Add missing declarations of codecvt<char16_t, char, mbstate_t> and codecvt<char32_t, char, mbstate_t>. Add char8_t declarations codecvt<char16_t, char8_t, mbstate_t> and codecvt<char32_t, char8_t, mbstate_t>.
        * include/bits/postypes.h: Add u8streampos
        * include/bits/stringfwd.h: Add declarations of char_traits<char8_t> and u8string.
        * include/c_global/cstddef: Add __byte_operand<char8_t>.
        * include/experimental/bits/fs_path.h (path::__is_encoded_char): Recognize char8_t.
        (path::u8string): Return std::u8string when char8_t support is enabled.
        (path::generic_u8string): Likewise.
        (path::_S_convert): Handle conversion from char8_t input.
        (path::_S_str_convert): Likewise.
        * include/experimental/string: Add u8string.
        * include/experimental/string_view: Add u8string_view, hash<u8string_view>, and operator""sv(const char8_t*, size_t).
        * include/std/atomic: Add atomic<char8_t> and atomic_char8_t.
        * include/std/charconv (__is_int_to_chars_type): Recognize char8_t as a character type.
        * include/std/limits: Add numeric_limits<char8_t>.
        * include/std/string_view: Add u8string_view, hash<u8string_view>, and operator""sv(const char8_t*, size_t).
        * include/std/type_traits: Add __is_integral_helper<char8_t>, __make_unsigned<char8_t>, and __make_signed<char8_t>.
        * libsupc++/atomic_lockfree_defines.h: Define ATOMIC_CHAR8_T_LOCK_FREE.
        * src/c++11/Makefile.am: Compile with -fchar8_t when compiling codecvt.cc and limits.cc so that char8_t specializations of numeric_limits and codecvt are emitted.
        * src/c++11/Makefile.in: Likewise.
        * src/c++11/codecvt.cc: Define members of codecvt<char16_t, char8_t, mbstate_t>, codecvt<char32_t, char8_t, mbstate_t>, codecvt_byname<char16_t, char8_t, mbstate_t>, and codecvt_byname<char32_t, char8_t, mbstate_t>.
        * src/c++11/limits.cc: Define members of numeric_limits<char8_t>.
        * src/c++98/Makefile.am: Compile with -fchar8_t when compiling locale_init.cc and localename.cc.
        * src/c++98/Makefile.in: Likewise.
        * src/c++98/locale_init.cc: Add initialization for the codecvt<char16_t, char8_t, mbstate_t> and codecvt<char32_t, char8_t, mbstate_t> facets.
        * src/c++98/localename.cc: Likewise.
        * testsuite/util/testsuite_abi.cc: Validate ABI bump.

Tom.

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 08632c382b7..5f2f8e865ca 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5543,6 +5543,7 @@ get_std_name_hint (const char *name)
     {"basic_string", "<string>", cxx98},
     {"string", "<string>", cxx98},
     {"wstring", "<string>", cxx98},
+    {"u8string", "<string>", cxx2a},
     {"u16string", "<string>", cxx11},
     {"u32string", "<string>", cxx11},
     /* . */
diff --git a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
index c448b813331..b26cf1dc8ac 100644
--- a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
+++ b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
@@ -301,6 +301,11 @@ CXXABI_2.0 {
     _ZTSN10__cxxabiv120__si_class_type_infoE;
     _ZTSN10__cxxabiv121__vmi_class_type_infoE;
 
+    # typeinfo for char8_t
+    _ZTIDu;
+    _ZTIPDu;
+    _ZTIPKDu;
+
     # typeinfo for char16_t and char32_t
     _ZTIDs;
     _ZTIPDs;
diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b
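
To illustrate what the library side of this patch makes available to users, here is a hedged sketch of `std::u8string` in use. It is not code from the patch: the function `utf8_code_units` is a hypothetical name, and the sketch assumes that `<string>` exposes `__cpp_lib_char8_t` when the library support is present. The preprocessor branches are there so the same source should build with or without char8_t support.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the patch): number of UTF-8 code units in
// the literal u8"caf\u00e9". U+00E9 encodes as two code units in UTF-8.
inline std::size_t utf8_code_units() {
#if defined(__cpp_char8_t) && defined(__cpp_lib_char8_t)
  // Core and library support enabled: the literal is an array of const
  // char8_t and std::u8string is available to hold it.
  const std::u8string s = u8"caf\u00e9";
  return s.size();
#elif !defined(__cpp_char8_t)
  // char8_t support disabled: the literal is an array of const char.
  const std::string s = u8"caf\u00e9";
  return s.size();
#else
  // Core support without library support: count code units directly,
  // excluding the terminating null.
  return sizeof(u8"caf\u00e9") - 1;
#endif
}
```

In every branch the result is 5 (three ASCII code units plus two for U+00E9); only the type that carries the data changes.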
Re: [PATCH 2/9]: C++ P0482R5 char8_t: Core language support
Thanks, Jason. I just sent a revised set of patches addressing most of your feedback with exceptions as described inline below.

On 12/17/18 4:47 PM, Tom Honermann wrote:
On 12/17/18 4:02 PM, Jason Merrill wrote:
On 12/5/18 11:16 AM, Jason Merrill wrote:
On 12/5/18 2:09 AM, Tom Honermann wrote:
On 12/3/18 5:01 PM, Jason Merrill wrote:
On 12/3/18 4:51 PM, Jason Merrill wrote:
On 11/5/18 2:39 PM, Tom Honermann wrote:

This patch adds support for the P0482R5 core language changes.
[...]

@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
     return -1;
+  /* Unlike char, char8_t doesn't alias. */
+  if (flag_char8_t && t == char8_type_node)
+    return -1;

This seems unnecessary; doesn't the existing code have the same effect? I think we could do with just an adjustment to the existing comment.

I'm not sure. I had concerns about unintended matching due to char8_t having an underlying type of unsigned char.
That shouldn't be a problem: if char8_t is a distinct type, it won't match unsigned char, and if it's the same as unsigned char, flag_char8_t will be false. I tried removing this check and that resulted in test gcc/testsuite/g++.dg/ext/char8_t-aliasing-1.C (added in patch 3/9) failing. It seems this change is needed. If you believe that implies that something is wrong elsewhere, please let me know. + else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node) + || (flag_char8_t && type == char8_type_node) + bool char8_array = (flag_char8_t && !!com
Re: [REVISED PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests
Attached is a revised patch that addresses changes in P0482R6. Changes from the prior patch include: - Updated the value of the __cpp_char8_t feature test macro to 201811. Tested on x86_64-linux. libstdc++-v3/ChangeLog: 2018-11-04 Tom Honermann * testsuite/18_support/numeric_limits/char8_t.cc: New test cloned from char16_32_t.cc; validates numeric_limits. * testsuite/21_strings/basic_string/literals/types-char8_t.cc: New test cloned from types.cc; validates operator""s for char8_t returns u8string. * testsuite/21_strings/basic_string/literals/values-char8_t.cc: New test cloned from values.cc; validates construction and comparison of u8string values. * testsuite/21_strings/basic_string/requirements/ /explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of basic_string. * testsuite/21_strings/basic_string_view/literals/types-char8_t.cc: New test cloned from types.cc; validates operator""sv for char8_t returns u8string_view. * testsuite/21_strings/basic_string_view/literals/ values-char8_t.cc: New test cloned from values.cc; validates construction and comparison of u8string_view values. * testsuite/21_strings/basic_string_view/requirements/ explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of basic_string_view. * testsuite/21_strings/char_traits/requirements/char8_t/65049.cc: New test cloned from char16_t/65049.cc; validates that char_traits is not vulnerable to the concerns in PR65049. * testsuite/21_strings/char_traits/requirements/char8_t/ typedefs.cc: New test cloned from char16_t/typedefs.cc; validates that char_traits member typedefs are present and correct. * testsuite/21_strings/char_traits/requirements/ explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of char_traits. * testsuite/22_locale/codecvt/char16_t-char8_t.cc: New test cloned from char16_t.cc: validates codecvt. 
* testsuite/22_locale/codecvt/char32_t-char8_t.cc: New test cloned from char32_t.cc: validates codecvt. * testsuite/22_locale/codecvt/utf8-char8_t.cc: New test cloned from utf8.cc; validates codecvt and codecvt. * testsuite/27_io/filesystem/path/native/string-char8_t.cc: New test cloned from string.cc; validates filesystem::path construction from char8_t input. * testsuite/experimental/feat-char8_t.cc: New test; validates that the __cpp_lib_char8_t feature test macro is defined with the correct value. * testsuite/experimental/filesystem/path/native/string-char8_t.cc: New test cloned from string.cc; validates filesystem::path construction from char8_t input. * testsuite/experimental/string_view/literals/types-char8_t.cc: New test cloned from types.cc; validates operator""sv for char8_t returns u8string_view. * testsuite/experimental/string_view/literals/values-char8_t.cc: New test cloned from values.cc; validates construction and comparison of u8string_view values. * testsuite/experimental/string_view/requirements/ explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of basic_string_view. * testsuite/ext/char8_t/atomic-1.cc: New test; validates that ATOMIC_CHAR8_T_LOCK_FREE is not defined if char8_t support is not enabled. Tom. diff --git a/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc new file mode 100644 index 000..346463d7244 --- /dev/null +++ b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc @@ -0,0 +1,71 @@ +// { dg-do run { target c++11 } } +// { dg-require-cstdint "" } +// { dg-options "-fchar8_t" } + +// Copyright (C) 2017 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. 
This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +#include +#include +#include + +// Test specializations for char8_t. +template + void + do_test() + { +typedef std::numeric_limits char_type; +typedef std::numeric_li
Re: [PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates
Thank you, Sandra! I just sent a revised patch to the list that I believe addresses all of your comments. Thanks for the suggestion to generate and check the pdf, that was helpful to ensure the changes rendered correctly. Tom. On 12/11/18 6:35 PM, Sandra Loosemore wrote: On 11/5/18 12:39 PM, Tom Honermann wrote: This patch adds documentation for new -fchar8_t and -fno-char8_t options. gcc/ChangeLog: 2018-11-04 Tom Honermann * doc/invoke.texi (-fchar8_t): Document new option. My comments are all about nitpicky formatting things. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 57491f1033c..cd3a2a715db 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -206,7 +206,7 @@ in the following sections. @item C++ Language Options @xref{C++ Dialect Options,,Options Controlling C++ Dialect}. @gccoptlist{-fabi-version=@var{n} -fno-access-control @gol --faligned-new=@var{n} -fargs-in-order=@var{n} -fcheck-new @gol +-faligned-new=@var{n} -fargs-in-order=@var{n} -fchar8_t -fcheck-new @gol Please consistently use 2 spaces (not just 1) to separate options on the same line in a @gccoptlist environment. -fconstexpr-depth=@var{n} -fconstexpr-loop-limit=@var{n} @gol -fno-elide-constructors @gol -fno-enforce-eh-specs @gol @@ -2432,6 +2432,53 @@ but few users will need to override the default of This flag is enabled by default for @option{-std=c++17}. +@item -fchar8_t +@itemx -fno-char8_t +@opindex fchar8_t +@opindex fno-char8_t +Enable support for the P0482 proposal including the addition of a +new @code{char8_t} fundamental type, changes to the types of UTF-8 +string and character literals, new signatures for user defined +literals, and new specializations of standard library class templates +@code{std::numeric_limits}, @code{std::char_traits}, +and @code{std::hash}. 
+ +This option enables functions to be overloaded for ordinary and UTF-8 +strings: + +@smallexample +int f(const char *); // #1 +int f(const char8_t *); // #2 +int v1 = f("text"); // Calls #1 +int v2 = f(u8"text"); // Calls #2 +@end smallexample + +and introduces new signatures for user defined literals: @noindent immediately before the continued sentence of the paragraph before the example. Also please hyphenate "user-defined" here. + +@smallexample +int operator""_udl1(char8_t); +int v3 = u8'x'_udl1; +int operator""_udl2(const char8_t*, std::size_t); +int v4 = u8"text"_udl2; +template int operator""_udl3(); +int v5 = u8"text"_udl3; +@end smallexample + +The change to the types of UTF-8 string and character literals introduces +incompatibilities with ISO C++11 and later standards. For example, the +following code is well-formed under ISO C++11, but is ill-formed when +@option{-fchar8_t} is specified. + +@smallexample +char ca[] = u8"text"; // error: char-array initialized from wide string +const char *cp = u8"text"; // error: invalid conversion from 'const char8_t*' to 'const char*' +int f(const char*); +auto v = f(u8"text"); // error: invalid conversion from 'const char8_t*' to 'const char*' +std::string s1@{u8"text"@}; // error: no matching function for call to 'std::basic_string::basic_string()' +using namespace std::literals; +std::string s2 = u8"text"s; // error: conversion from 'basic_string' to non-scalar type 'basic_string' requested +@end smallexample The formatting of this code example is way too wide to fit on the page of the printed/PDF manual. I suggest putting the comments on separate lines from the code and breaking them across multiple lines where necessary. If you format the example for <80 columns it will probably fit, although you should check the PDF if at all possible. + @item -fcheck-new @opindex fcheck-new Check that the pointer returned by @code{operator new} is non-null -Sandra
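For code that has to build both with and without -fchar8_t, the overloads shown in the documentation example can be guarded on the __cpp_char8_t feature test macro added by patch 2/9. The following is only a minimal sketch and is not taken from the patches: f mirrors the function name in the documentation example, while ordinary and utf8 are illustrative names introduced here.

```cpp
// Sketch only: overload selection for ordinary and UTF-8 string literals.
// `f`, `ordinary`, and `utf8` are illustrative names, not part of the patch.

constexpr int f(const char *) { return 1; }      // #1
#ifdef __cpp_char8_t
constexpr int f(const char8_t *) { return 2; }   // #2, only when char8_t is enabled
#endif

constexpr int ordinary = f("text");    // always selects #1
constexpr int utf8     = f(u8"text");  // selects #2 with char8_t enabled, #1 otherwise
```

The guard matters because, without char8_t support, u8"text" still has type array of const char, so declaring #2 unconditionally would fail to compile on older dialects.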
PATCH: Updated error messages for ill-formed cases of array initialization by string literal
As requested by Jason in the review of the P0482 (char8_t) core language changes, this patch includes updates to the error messages emitted for ill-formed cases of array initialization with a string literal. With these changes, error messages that previously looked something like these: - "char-array initialized from wide string" - "wide character array initialized from non-wide string" - "wide character array initialized from incompatible wide string" now look like: - "cannot initialize array of type 'char' from a string literal with type array of 'short unsigned int'" - "cannot initialize array of type 'short unsigned int' from a string literal with type array of 'char'" - "cannot initialize array of type 'short unsigned int' from a string literal with type array of 'unsigned int'" These changes affect both the C and C++ front ends. These changes have dependencies on the (revised) set of patches submitted for P0482 (char8_t) and will not apply cleanly without them. Tested on x86_64-linux. gcc/c/ChangeLog: 2018-12-26 Tom Honermann * c-typeck.c (digest_init): Revised the error message produced for ill-formed cases of array initialization with a string literal. gcc/cp/ChangeLog: 2018-12-26 Tom Honermann * typeck2.c (digest_init_r): Revised the error message produced for ill-formed cases of array initialization with a string literal. gcc/testsuite/ChangeLog: 2018-12-26 Tom Honermann * gcc/testsuite/g++.dg/ext/char8_t-init-2.C: Updated the expected error messages for ill-formed cases of array initialization with a string literal. * gcc/testsuite/g++.dg/ext/utf-array-short-wchar.C: Likewise. * gcc/testsuite/g++.dg/ext/utf-array.C: Likewise. * gcc/testsuite/g++.dg/ext/utf8-2.C: Likewise. * gcc/testsuite/gcc.dg/init-string-2.c: Likewise. * gcc/testsuite/gcc.dg/pr61096-1.c: Likewise. * gcc/testsuite/gcc.dg/utf-array-short-wchar.c: Likewise. * gcc/testsuite/gcc.dg/utf-array.c: Likewise. * gcc/testsuite/gcc.dg/utf8-2.c: Likewise. Tom. 
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 9d09b8d65fd..4d2129dff2f 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -7447,6 +7447,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype, { struct c_expr expr; tree typ2 = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (inside_init))); + bool incompat_string_cst = false; expr.value = inside_init; expr.original_code = (strict_string ? STRING_CST : ERROR_MARK); expr.original_type = NULL; @@ -7464,27 +7465,22 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype, { if (typ2 != char_type_node) { - error_init (init_loc, "char-array initialized from wide " - "string"); - return error_mark_node; + incompat_string_cst = true; } } - else + else if (!comptypes(typ1, typ2)) { - if (typ2 == char_type_node) - { - error_init (init_loc, "wide character array initialized " - "from non-wide string"); - return error_mark_node; - } - else if (!comptypes(typ1, typ2)) - { - error_init (init_loc, "wide character array initialized " - "from incompatible wide string"); - return error_mark_node; - } + incompat_string_cst = true; } + if (incompat_string_cst) +{ + error_at (init_loc, "cannot initialize array of type %qT from " + "a string literal with type array of %qT", + typ1, typ2); + return error_mark_node; +} + if (TYPE_DOMAIN (type) != NULL_TREE && TYPE_SIZE (type) != NULL_TREE && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST) diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c index 782fd7f9cd5..ae3b53dc001 100644 --- a/gcc/cp/typeck2.c +++ b/gcc/cp/typeck2.c @@ -1060,46 +1060,43 @@ digest_init_r (tree type, tree init, int nested, int flags, && TREE_CODE (init) == STRING_CST) { tree char_type = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (init))); + bool incompat_string_cst = false; - if (TYPE_PRECISION (typ1) == BITS_PER_UNIT) + if (typ1 != char_type) { - if (typ1 != char8_type_node && char_type == char8_type_node) + /* The array element type does not match the initializing string + literal 
element type. This is only allowed when initializing + an array of signed char or unsigned char. */ + if (TYPE_PRECISION (typ1) == BITS_PER_UNIT) { - if (complain & tf_error) - error_at (loc, "char-array initialized from UTF-8 string"); - return error_mark_node; - } - else if (typ1 == char8_type_node && char_type == char_type_node) - { - if (complain
[PATCH]: Fix PR c++/88095, class template argument deduction for literal operator templates per P0732 for C++2a
This patch fixes PR c++/88095:
- Bug 88095 - class nontype template parameter UDL string literals doesn't accepts deduction placeholder
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88095.

It also addresses a latent issue; literal operator templates with template parameter packs of literal class type were previously accepted. The patch corrects this and adds a test (udlit-class-nttp-neg.C).

In the change to gcc/cp/parser.c, it is not clear to me whether the 'TREE_CODE (TREE_TYPE (parm)) == TEMPLATE_TYPE_PARM' comparison is necessary; it might be that 'CLASS_PLACEHOLDER_TEMPLATE' suffices on its own.

If accepted, I'd like to request this change be applied to gcc 9 as it is needed for one of the char8_t remediation approaches documented in P1423, and may be helpful for existing code bases impacted by the char8_t changes adopted via P0482 for C++20.
- https://wg21.link/p1423#emulate

Tested on x86_64-linux.

Thanks to Jeff Snyder for providing an initial patch in the 88095 PR.

gcc/cp/ChangeLog:

2019-08-02 Tom Honermann

* parser.c (cp_parser_template_declaration_after_parameters): Enable class template argument deduction for non-type template parameters in literal operator templates.

gcc/testsuite/ChangeLog:

2019-08-02 Tom Honermann

PR c++/88095
* g++.dg/cpp2a/udlit-class-nttp-ctad.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C: New test.
* g++.dg/cpp2a/udlit-class-nttp.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-neg.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-neg2.C: New test.

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index c9091f523c5..a406bba41c5 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,9 @@
+2019-08-02 Tom Honermann
+
+ * parser.c (cp_parser_template_declaration_after_parameters): Enable
+ class template argument deduction for non-type template parameters in
+ literal operator templates.
+ 2019-07-16 Jason Merrill * parser.c (make_location): Add overload taking cp_lexer* as last diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 1a5da1dd8e8..86f895e96a3 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -28105,7 +28105,10 @@ cp_parser_template_declaration_after_parameters (cp_parser* parser, { tree parm_list = TREE_VEC_ELT (parameter_list, 0); tree parm = INNERMOST_TEMPLATE_PARMS (parm_list); - if (CLASS_TYPE_P (TREE_TYPE (parm))) + if ((CLASS_TYPE_P (TREE_TYPE (parm)) + || (TREE_CODE (TREE_TYPE (parm)) == TEMPLATE_TYPE_PARM + && CLASS_PLACEHOLDER_TEMPLATE (TREE_TYPE (parm + && !TEMPLATE_PARM_PARAMETER_PACK (DECL_INITIAL (parm))) /* OK, C++20 string literal operator template. We don't need to warn in lower dialects here because we will have already warned about the template parameter. */; diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 0f47604da85..c8613deaae6 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,13 @@ +2019-08-02 Tom Honermann + + PR c++/88095 + * g++.dg/cpp2a/udlit-class-nttp-ctad.C: New test. + * g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C: New test. + * g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C: New test. + * g++.dg/cpp2a/udlit-class-nttp.C: New test. + * g++.dg/cpp2a/udlit-class-nttp-neg.C: New test. + * g++.dg/cpp2a/udlit-class-nttp-neg2.C: New test. + 2019-07-18 Jan Hubicka * g++.dg/lto/alias-5_0.C: New testcase. diff --git a/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C new file mode 100644 index 000..437fa9b5ab8 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C @@ -0,0 +1,24 @@ +// PR c++/88095 +// Test class non-type template parameters for literal operator templates. +// Validate handling of failed class template argument deduction. 
+// { dg-do compile { target c++2a } } + +namespace std { +using size_t = decltype(sizeof(int)); +} + +template +struct fixed_string { + constexpr static std::size_t length = N; + constexpr fixed_string(...) { } + // auto operator<=> (const fixed_string&) = default; +}; +// Missing deduction guide. + +template +constexpr std::size_t operator"" _udl() { + return decltype(fs)::length; +} + +static_assert("test"_udl == 5); // { dg-error "15:no matching function for call to" } +// { dg-error "15:class template argument deduction failed" "" { target *-*-* } .-1 } diff --git a/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C new file mode 100644 index 000..89bb5d39d7d --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C @@ -0,0 +1,20 @@ +// PR c++/88095 +// T
Re: [PATCH]: Fix PR c++/88095, class template argument deduction for literal operator templates per P0732 for C++2a
On 8/5/19 3:05 PM, Jason Merrill wrote:
On 8/2/19 9:59 AM, Tom Honermann wrote:

This patch fixes PR c++/88095:
- Bug 88095 - class nontype template parameter UDL string literals doesn't accepts deduction placeholder
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88095.

It also addresses a latent issue; literal operator templates with template parameter packs of literal class type were previously accepted. The patch corrects this and adds a test (udlit-class-nttp-neg.C).

In the change to gcc/cp/parser.c, it is not clear to me whether the 'TREE_CODE (TREE_TYPE (parm)) == TEMPLATE_TYPE_PARM' comparison is necessary; it might be that 'CLASS_PLACEHOLDER_TEMPLATE' suffices on its own.

template_placeholder_p would be a shorter way to write these, but I think even better would be to just change CLASS_TYPE_P to MAYBE_CLASS_TYPE_P. I'll make that change and commit the patch, since it looks like you don't have commit access yet.

Thanks, and correct, I don't have commit access yet (and I'm not sure that I should! :) )

If accepted, I'd like to request this change be applied to gcc 9 as it is needed for one of the char8_t remediation approaches documented in P1423, and may be helpful for existing code bases impacted by the char8_t changes adopted via P0482 for C++20.
- https://wg21.link/p1423#emulate

Seems reasonable. It may be too late to make 9.2 at this point, though.

Is there anything I can/should do to request inclusion?

Tom.

Jason
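To make the discussion concrete, here is a rough sketch of the kind of code the fix is meant to accept: a string literal operator template whose non-type template parameter is declared with a class template placeholder, so that the argument type is found by class template argument deduction. It is not one of the tests from the patch. The fixed_string name follows the truncated test above, but its members and the _len suffix are assumptions made for illustration. The block is guarded on __cpp_nontype_template_args so that it compiles to nothing where class-type non-type template parameters are unavailable; it needs C++20 (-std=c++2a on the GCC releases under discussion).

```cpp
// Sketch only: a string literal operator template whose class-type non-type
// template parameter is deduced through class template argument deduction.
// `fixed_string` members and the `_len` suffix are illustrative assumptions.
#include <cstddef>

#if defined(__cpp_nontype_template_args) && __cpp_nontype_template_args >= 201911L

template<std::size_t N>
struct fixed_string {
  char data[N] = {};
  static constexpr std::size_t length = N;

  // The implicit deduction guide from this constructor deduces N from the
  // array bound of the literal, terminating null included.
  constexpr fixed_string(const char (&s)[N]) {
    for (std::size_t i = 0; i < N; ++i)
      data[i] = s[i];
  }
};

// The parameter type is the placeholder `fixed_string`, not `fixed_string<N>`.
template<fixed_string fs>
constexpr std::size_t operator""_len() {
  return decltype(fs)::length;
}

static_assert("test"_len == 5, "four characters plus the terminating null");

#endif
```

Without a usable constructor or deduction guide, deduction fails and the literal is rejected, which is the situation the -neg tests in the patch exercise.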
[PATCH 0/4]: C++ P1423R3 char8_t remediation implementation
This series of patches provides an implementation of the changes for C++ proposal P1423R3 [1].

These changes do not impact default libstdc++ behavior for C++17 and earlier; they are only active for C++2a or when the -fchar8_t option is specified.

Tested x86_64-linux.

Patch 1: Decouple constraints for u8path from path constructors.
Patch 2: Update __cpp_lib_char8_t feature test macro value, add deleted operators, update u8path.
Patch 3: Updates to existing tests.
Patch 4: New tests.

Tom.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html
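As a rough illustration of what the deleted operators in patch 2 change for user code, the sketch below detects whether a value of some type can be inserted into a std::ostream. It is not part of the series; is_streamable, char8_insertable, and u8string_insertable are hypothetical names used only here. Selecting a deleted overload makes the insertion expression ill-formed, so the detection reports false for such types wherever the deleted overloads are active. Whether they are active depends on the library version and language mode, so the sketch records the result for char8_t rather than asserting it.

```cpp
// Sketch only: detect whether `std::ostream << T` is usable.
// `is_streamable` is an illustrative helper, not part of libstdc++.
#include <ostream>
#include <type_traits>
#include <utility>

template<typename T, typename = void>
struct is_streamable : std::false_type { };

// Selected only when the insertion expression is well-formed; choosing a
// deleted overload makes it ill-formed, so substitution fails quietly.
template<typename T>
struct is_streamable<T,
    decltype(void(std::declval<std::ostream&>() << std::declval<T>()))>
  : std::true_type { };

static_assert(is_streamable<char>::value, "narrow characters remain insertable");
static_assert(is_streamable<const char*>::value, "narrow strings remain insertable");

#ifdef __cpp_char8_t
// Not asserted: the values depend on the library and language mode in use.
// With the deleted overloads active, both are expected to be false.
constexpr bool char8_insertable    = is_streamable<char8_t>::value;
constexpr bool u8string_insertable = is_streamable<const char8_t*>::value;
#endif
```

A trait like this lets generic code pick a transcoding path for char8_t data instead of failing with a hard error at the point of insertion.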
[PATCH 1/4]: C++ P1423R3 char8_t remediation: Decouple constraints for u8path from path constructors
This patch moves helper classes and functions for std::filesystem::path out of the class definition to a detail namespace so that they are available to the implementations of std::filesystem::u8path. Prior to this patch, the SFINAE constraints for those implementations were specified via delegation to the overloads of path constructors with a std::locale parameter; it just so happened that those overloads had the same constraints. As of P1423R3, u8path and those overloads no longer have the same constraints, so this dependency must be broken. This patch also updates the experimental implementation of the filesystem TS to add SFINAE constraints to its implementations of u8path. These functions were previously unconstrained and marked with a TODO comment. This patch does not provide any intentional behavioral changes other than the added constraints to the experimental filesystem TS implementation of u8path. I recommend applying the patch and viewing the diff with white space ignored when reviewing; there will be many fewer differences this way. Alternatives to this refactoring would have been to make the u8path overloads friends of class path, or to make the helpers public members. Both of those approaches struck me as less desirable than this approach, though this approach does require more code changes and will affect implementation detail portions of mangled names for path constructors and inline member functions (mostly function template specializations). libstdc++-v3/ChangeLog: 2019-09-15 Tom Honermann * include/bits/fs_path.h: Moved helper utilities out of std::filesystem::path into a detail namespace to make them available for use by u8path. * include/experimental/bits/fs_path.h: Moved helper utilities out of std::experimental::filesystem::v1::path into a detail namespace to make them available for use by u8path. Tom. 
diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h index e1083acf30f..71354515403 100644 --- a/libstdc++-v3/include/bits/fs_path.h +++ b/libstdc++-v3/include/bits/fs_path.h @@ -59,103 +59,114 @@ namespace filesystem { _GLIBCXX_BEGIN_NAMESPACE_CXX11 - /** @addtogroup filesystem + class path; + +namespace __detail +{ + /** @addtogroup filesystem-detail * @{ */ - /// A filesystem path. - class path - { -template - using __is_encoded_char = __is_one_of, - char, + template +using __is_encoded_char = __is_one_of, + char, #ifdef _GLIBCXX_USE_CHAR8_T - char8_t, + char8_t, #endif #if _GLIBCXX_USE_WCHAR_T - wchar_t, + wchar_t, #endif - char16_t, char32_t>; + char16_t, char32_t>; -template> - using __is_path_iter_src - = __and_<__is_encoded_char, - std::is_base_of>; + template> +using __is_path_iter_src + = __and_<__is_encoded_char, + std::is_base_of>; -template - static __is_path_iter_src<_Iter> - __is_path_src(_Iter, int); - -template - static __is_encoded_char<_CharT> - __is_path_src(const basic_string<_CharT, _Traits, _Alloc>&, int); - -template - static __is_encoded_char<_CharT> - __is_path_src(const basic_string_view<_CharT, _Traits>&, int); + template +static __is_path_iter_src<_Iter> +__is_path_src(_Iter, int); -template - static std::false_type - __is_path_src(const _Unknown&, ...); - -template - struct __constructible_from; - -template - struct __constructible_from<_Iter, _Iter> - : __is_path_iter_src<_Iter> - { }; + template +static __is_encoded_char<_CharT> +__is_path_src(const basic_string<_CharT, _Traits, _Alloc>&, int); -template - struct __constructible_from<_Source, void> - : decltype(__is_path_src(std::declval<_Source>(), 0)) - { }; + template +static __is_encoded_char<_CharT> +__is_path_src(const basic_string_view<_CharT, _Traits>&, int); -template - using _Path = typename - std::enable_if<__and_<__not_, path>>, - __not_>>, - __constructible_from<_Tp1, _Tp2>>::value, - path>::type; + template +static std::false_type 
+__is_path_src(const _Unknown&, ...); -template - static _Source - _S_range_begin(_Source __begin) { return __begin; } + template +struct __constructible_from; -struct __null_terminated { }; + template +struct __constructible_from<_Iter, _Iter> +: __is_path_iter_src<_Iter> +{ }; -template - static __null_terminated - _S_range_end(_Source) { return {}; } + template +struct __constructible_from<_Source, void> +: decltype(__is_path_src(std::declval<_Source>(), 0)) +{ }; -template - static const _CharT* - _S_range_beg
[PATCH 2/4]: C++ P1423R3 char8_t remediation: Update feature test macro, add deleted operators, update u8path
This patch increments the __cpp_lib_char8_t feature test macro, adds deleted operator<< overloads for basic_ostream, and modifies u8path to accept sequences of char8_t for both the C++17 implementation of std::filesystem, and the filesystem TS implementation. The implementation mechanism used for u8path differs between the C++17 and filesystem TS implementations. The changes to the former take advantage of C++17 'if constexpr'. The changes to the latter retain C++11 compatibility and rely on tag dispatching. libstdc++-v3/ChangeLog: 2019-09-15 Tom Honermann * libstdc++-v3/include/bits/c++config: Bumped the value of the __cpp_lib_char8_t feature test macro. * libstdc++-v3/include/bits/fs_path.h (u8path): Modified u8path to accept sequences of char8_t. * libstdc++-v3/include/experimental/bits/fs_path.h (u8path): Modified u8path to accept sequences of char8_t. * libstdc++-v3/include/std/ostream: Added deleted overloads of wchar_t, char8_t, char16_t, and char32_t for ordinary and wide formatted character and string inserters. Tom. diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config index c8e099aaadd..5bcf32d95ef 100644 --- a/libstdc++-v3/include/bits/c++config +++ b/libstdc++-v3/include/bits/c++config @@ -620,7 +620,7 @@ namespace std # endif #endif #ifdef _GLIBCXX_USE_CHAR8_T -# define __cpp_lib_char8_t 201811L +# define __cpp_lib_char8_t 201907L #endif /* Define if __float128 is supported on this host. 
*/ diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h index 71354515403..f3f539412fc 100644 --- a/libstdc++-v3/include/bits/fs_path.h +++ b/libstdc++-v3/include/bits/fs_path.h @@ -153,9 +153,24 @@ namespace __detail template())), - typename _Val = typename std::iterator_traits<_Iter>::value_type> + typename _Val = typename std::iterator_traits<_Iter>::value_type, + typename _UnqualVal = std::remove_const_t<_Val>> using __value_type_is_char - = std::enable_if_t, char>>; + = std::enable_if_t, + _UnqualVal>; + + template())), + typename _Val = typename std::iterator_traits<_Iter>::value_type, + typename _UnqualVal = std::remove_const_t<_Val>> +using __value_type_is_char_or_char8_t + = std::enable_if_t<__or_v< + std::is_same<_UnqualVal, char> +#ifdef _GLIBCXX_USE_CHAR8_T + ,std::is_same<_UnqualVal, char8_t> +#endif + >, + _UnqualVal>; // @} group filesystem-detail } // namespace __detail @@ -639,29 +654,41 @@ namespace __detail /// Create a path from a UTF-8-encoded sequence of char template, - typename _Require2 = __detail::__value_type_is_char<_InputIterator>> + typename _CharT = + __detail::__value_type_is_char_or_char8_t<_InputIterator>> inline path u8path(_InputIterator __first, _InputIterator __last) { #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS - // XXX This assumes native wide encoding is UTF-16. - std::codecvt_utf8_utf16 __cvt; - path::string_type __tmp; - if constexpr (is_pointer_v<_InputIterator>) +#ifdef _GLIBCXX_USE_CHAR8_T + if constexpr (is_same_v<_CharT, char8_t>) { - if (__str_codecvt_in_all(__first, __last, __tmp, __cvt)) - return path{ __tmp }; + return path{ __first, __last }; } else { - const std::string __u8str{__first, __last}; - const char* const __ptr = __u8str.data(); - if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt)) - return path{ __tmp }; +#endif + // XXX This assumes native wide encoding is UTF-16. 
+ std::codecvt_utf8_utf16 __cvt; + path::string_type __tmp; + if constexpr (is_pointer_v<_InputIterator>) + { + if (__str_codecvt_in_all(__first, __last, __tmp, __cvt)) + return path{ __tmp }; + } + else + { + const std::string __u8str{__first, __last}; + const char* const __ptr = __u8str.data(); + if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt)) + return path{ __tmp }; + } + _GLIBCXX_THROW_OR_ABORT(filesystem_error( + "Cannot convert character sequence", + std::make_error_code(errc::illegal_byte_sequence))); +#ifdef _GLIBCXX_USE_CHAR8_T } - _GLIBCXX_THROW_OR_ABORT(filesystem_error( - "Cannot convert character sequence", - std::make_error_code(errc::illegal_byte_sequence))); +#endif #else // This assumes native normal encoding is UTF-8. return path{ __first, __last }; @@ -671,21 +698,32 @@ namespace __detail /// Create a path from a UTF-8-encoded sequence of char template, - typename _Require2 = __detail::__value_type_is_char<_Source>> + typename _CharT = __detail::__value_type_is_char_or_char8_t<_Source>> inline path u8path(const _Source& __source) { #ifdef
[PATCH 3/4]: C++ P1423R3 char8_t remediation: Updates to existing tests
This patch updates existing tests to validate the new value for the __cpp_lib_char8_t feature test macro and to exercise u8path factory function invocations with std::string, std::string_view, and iterator pair arguments. libstdc++-v3/ChangeLog: 2019-09-15 Tom Honermann * libstdc++-v3/testsuite/experimental/feat-char8_t.cc: Updated the expected __cpp_lib_char8_t feature test macro value. * libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc: Added testing of u8path invocation with std::string, std::string_view, and iterators thereof. * libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc: Added testing of u8path invocation with std::string, std::string_view, and iterators thereof. Tom. diff --git a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc index aff722b5867..fb337ce1284 100644 --- a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc +++ b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc @@ -19,6 +19,7 @@ // { dg-do run { target c++17 } } #include +#include #include namespace fs = std::filesystem; @@ -34,6 +35,22 @@ test01() p = fs::u8path("\xf0\x9d\x84\x9e"); VERIFY( p.u8string() == u8"\U0001D11E" ); + + std::string s1 = "filename2"; + p = fs::u8path(s1); + VERIFY( p.u8string() == u8"filename2" ); + + std::string s2 = "filename3"; + p = fs::u8path(s2.begin(), s2.end()); + VERIFY( p.u8string() == u8"filename3" ); + + std::string_view sv1{ s1 }; + p = fs::u8path(sv1); + VERIFY( p.u8string() == u8"filename2" ); + + std::string_view sv2{ s2 }; + p = fs::u8path(sv2.begin(), sv2.end()); + VERIFY( p.u8string() == u8"filename3" ); } void diff --git a/libstdc++-v3/testsuite/experimental/feat-char8_t.cc b/libstdc++-v3/testsuite/experimental/feat-char8_t.cc index e843604266c..c9b277a4626 100644 --- a/libstdc++-v3/testsuite/experimental/feat-char8_t.cc +++ b/libstdc++-v3/testsuite/experimental/feat-char8_t.cc @@ -12,6 +12,6 @@ #ifndef 
__cpp_lib_char8_t # error "__cpp_lib_char8_t" -#elif __cpp_lib_char8_t != 201811L -# error "__cpp_lib_char8_t != 201811L" +#elif __cpp_lib_char8_t != 201907L +# error "__cpp_lib_char8_t != 201907L" #endif diff --git a/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc b/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc index bdeb3946a15..83219b7ddda 100644 --- a/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc +++ b/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc @@ -35,6 +35,14 @@ test01() p = fs::u8path("\xf0\x9d\x84\x9e"); VERIFY( p.u8string() == u8"\U0001D11E" ); + + std::string s1 = "filename2"; + p = fs::u8path(s1); + VERIFY( p.u8string() == u8"filename2" ); + + std::string s2 = "filename3"; + p = fs::u8path(s2.begin(), s2.end()); + VERIFY( p.u8string() == u8"filename3" ); } void
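The argument forms exercised by the updated u8path tests can be compared side by side. The following is only a minimal sketch, assuming a C++17 or later toolchain with std::filesystem available; the helper name u8path_forms_agree is hypothetical and does not appear in the patch. Note that u8path is deprecated as of C++20, so newer compilers may warn here.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <string_view>

namespace fs = std::filesystem;

// Hypothetical helper: builds a path from the same UTF-8 bytes using each
// argument form the tests cover and reports whether all results agree.
bool u8path_forms_agree(const std::string& name)
{
  std::string_view view{name};

  fs::path from_string     = fs::u8path(name);                      // std::string
  fs::path from_view       = fs::u8path(view);                      // std::string_view
  fs::path from_iters      = fs::u8path(name.begin(), name.end());  // string iterators
  fs::path from_view_iters = fs::u8path(view.begin(), view.end());  // view iterators

  return from_string == from_view
      && from_view == from_iters
      && from_iters == from_view_iters;
}
```

Calling u8path_forms_agree("filename2") is expected to return true, since every form is given the same sequence of code units.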
[PATCH 4/4]: C++ P1423R3 char8_t remediation: New tests
This patch adds new tests to validate new deleted overloads of wchar_t, char8_t, char16_t, and char32_t for ordinary and wide formatted character and string ostream inserters. Additionally, new tests are added to validate invocations of u8path with sequences of char8_t for both the C++17 and filesystem TS implementations. libstdc++-v3/ChangeLog: 2019-09-15 Tom Honermann * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc: New test to validate deleted overloads of character and string inserters for narrow ostreams. * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc: New test to validate deleted overloads of character and string inserters for wide ostreams. * libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc: New test to validate u8path invocations with sequences of char8_t. * libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc New test to validate u8path invocations with sequences of char8_t. Tom. diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc new file mode 100644 index 000..87afb295086 --- /dev/null +++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc @@ -0,0 +1,43 @@ +// Copyright (C) 2019 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. 
+ +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +// 29.7.2 Header synopsys; deleted character inserters. + +// Test character inserters defined as deleted by P1423. + +// { dg-options "-std=gnu++17 -fchar8_t" } +// { dg-do compile { target c++17 } } + +#include + +void test_character_inserters(std::ostream &os) +{ + os << 'x'; // ok. + os << L'x'; // { dg-error "use of deleted function" } + os << u8'x'; // { dg-error "use of deleted function" } + os << u'x'; // { dg-error "use of deleted function" } + os << U'x'; // { dg-error "use of deleted function" } +} + +void test_string_inserters(std::ostream &os) +{ + os << "text"; // ok. + os << L"text"; // { dg-error "use of deleted function" } + os << u8"text"; // { dg-error "use of deleted function" } + os << u"text"; // { dg-error "use of deleted function" } + os << U"text"; // { dg-error "use of deleted function" } +} diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc new file mode 100644 index 000..701de16822b --- /dev/null +++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc @@ -0,0 +1,43 @@ +// Copyright (C) 2019 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. 
+ +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +// 29.7.2 Header synopsys; deleted character inserters. + +// Test wide character inserters defined as deleted by P1423. + +// { dg-options "-std=gnu++17 -fchar8_t" } +// { dg-do compile { target c++17 } } + +#include + +void test_character_inserters(std::wostream &os) +{ + os << 'x'; // ok. + os << L'x'; // ok. + os << u8'x'; // { dg-error "use of deleted function" } + os << u'x'; // { dg-error "use of d
Re: [PATCH 2/4]: C++ P1423R3 char8_t remediation: Update feature test macro, add deleted operators, update u8path
A revised patch is attached that adds proper preprocessor conditionals around the deleted ostream inserters. Apparently I had previously implemented a quick hack for testing purposes, neglected to add a FIXME comment, and then forgot about the hack. Shame on me. Tom. On 9/15/19 3:39 PM, Tom Honermann wrote: This patch increments the __cpp_lib_char8_t feature test macro, adds deleted operator<< overloads for basic_ostream, and modifies u8path to accept sequences of char8_t for both the C++17 implementation of std::filesystem, and the filesystem TS implementation. The implementation mechanism used for u8path differs between the C++17 and filesystem TS implementations. The changes to the former take advantage of C++17 'if constexpr'. The changes to the latter retain C++11 compatibility and rely on tag dispatching. libstdc++-v3/ChangeLog: 2019-09-15 Tom Honermann * libstdc++-v3/include/bits/c++config: Bumped the value of the __cpp_lib_char8_t feature test macro. * libstdc++-v3/include/bits/fs_path.h (u8path): Modified u8path to accept sequences of char8_t. * libstdc++-v3/include/experimental/bits/fs_path.h (u8path): Modified u8path to accept sequences of char8_t. * libstdc++-v3/include/std/ostream: Added deleted overloads of wchar_t, char8_t, char16_t, and char32_t for ordinary and wide formatted character and string inserters. Tom. diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config index c8e099aaadd..5bcf32d95ef 100644 --- a/libstdc++-v3/include/bits/c++config +++ b/libstdc++-v3/include/bits/c++config @@ -620,7 +620,7 @@ namespace std # endif #endif #ifdef _GLIBCXX_USE_CHAR8_T -# define __cpp_lib_char8_t 201811L +# define __cpp_lib_char8_t 201907L #endif /* Define if __float128 is supported on this host. 
*/ diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h index 71354515403..f3f539412fc 100644 --- a/libstdc++-v3/include/bits/fs_path.h +++ b/libstdc++-v3/include/bits/fs_path.h @@ -153,9 +153,24 @@ namespace __detail template())), - typename _Val = typename std::iterator_traits<_Iter>::value_type> + typename _Val = typename std::iterator_traits<_Iter>::value_type, + typename _UnqualVal = std::remove_const_t<_Val>> using __value_type_is_char - = std::enable_if_t, char>>; + = std::enable_if_t, + _UnqualVal>; + + template())), + typename _Val = typename std::iterator_traits<_Iter>::value_type, + typename _UnqualVal = std::remove_const_t<_Val>> +using __value_type_is_char_or_char8_t + = std::enable_if_t<__or_v< + std::is_same<_UnqualVal, char> +#ifdef _GLIBCXX_USE_CHAR8_T + ,std::is_same<_UnqualVal, char8_t> +#endif + >, + _UnqualVal>; // @} group filesystem-detail } // namespace __detail @@ -639,29 +654,41 @@ namespace __detail /// Create a path from a UTF-8-encoded sequence of char template, - typename _Require2 = __detail::__value_type_is_char<_InputIterator>> + typename _CharT = + __detail::__value_type_is_char_or_char8_t<_InputIterator>> inline path u8path(_InputIterator __first, _InputIterator __last) { #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS - // XXX This assumes native wide encoding is UTF-16. - std::codecvt_utf8_utf16 __cvt; - path::string_type __tmp; - if constexpr (is_pointer_v<_InputIterator>) +#ifdef _GLIBCXX_USE_CHAR8_T + if constexpr (is_same_v<_CharT, char8_t>) { - if (__str_codecvt_in_all(__first, __last, __tmp, __cvt)) - return path{ __tmp }; + return path{ __first, __last }; } else { - const std::string __u8str{__first, __last}; - const char* const __ptr = __u8str.data(); - if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt)) - return path{ __tmp }; +#endif + // XXX This assumes native wide encoding is UTF-16. 
+ std::codecvt_utf8_utf16 __cvt; + path::string_type __tmp; + if constexpr (is_pointer_v<_InputIterator>) + { + if (__str_codecvt_in_all(__first, __last, __tmp, __cvt)) + return path{ __tmp }; + } + else + { + const std::string __u8str{__first, __last}; + const char* const __ptr = __u8str.data(); + if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt)) + return path{ __tmp }; + } + _GLIBCXX_THROW_OR_ABORT(filesystem_error( + "Cannot convert character sequence", + std::make_error_code(errc::illegal_byte_sequence))); +#ifdef _GLIBCXX_USE_CHAR8_T } - _GLIBCXX_THROW_OR_ABORT(filesystem_error( - "Cannot convert character sequence", - std::make_error_code(errc::illegal_byte_sequence))); +#endif #else // This assumes native normal encoding is UTF-8. return path{ __first, __last }; @@
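The two implementation mechanisms mentioned above ('if constexpr' for the C++17 implementation, tag dispatching for the filesystem TS) can be contrasted with a small stand-alone sketch. This is not the libstdc++ code: the route enumeration and the pick_route_* functions are hypothetical names, and the choice being made (transcode char input, pass anything else through) is only an illustration of the dispatch technique.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the two code paths a u8path-like function takes.
enum class route { transcode, passthrough };

// C++17 style: select the branch at compile time with 'if constexpr'.
template<typename Iter>
route pick_route_constexpr(Iter, Iter)
{
  using value_type =
    std::remove_const_t<typename std::iterator_traits<Iter>::value_type>;
  if constexpr (std::is_same_v<value_type, char>)
    return route::transcode;
  else
    return route::passthrough;
}

// C++11 style: select the branch by overloading on a tag type.
inline route pick_route_impl(std::true_type)  { return route::transcode; }
inline route pick_route_impl(std::false_type) { return route::passthrough; }

template<typename Iter>
route pick_route_tagged(Iter, Iter)
{
  using value_type = typename std::remove_const<
    typename std::iterator_traits<Iter>::value_type>::type;
  // is_same<...> derives from true_type or false_type, which picks the overload.
  return pick_route_impl(std::is_same<value_type, char>{});
}
```

Both variants make the same decision; the tag-dispatched form simply avoids any C++17 language feature, which is the property the filesystem TS implementation needs.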
Re: [PATCH 4/4]: C++ P1423R3 char8_t remediation: New tests
A revised patch is attached that modifies the tests for deleted ostream inserters to require C++2a. This is required by the revision of patch 2/4 that adds proper preprocessor conditionals to the definitions. Tom. On 9/15/19 3:40 PM, Tom Honermann wrote: This patch adds new tests to validate new deleted overloads of wchar_t, char8_t, char16_t, and char32_t for ordinary and wide formatted character and string ostream inserters. Additionally, new tests are added to validate invocations of u8path with sequences of char8_t for both the C++17 and filesystem TS implementations. libstdc++-v3/ChangeLog: 2019-09-15 Tom Honermann * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc: New test to validate deleted overloads of character and string inserters for narrow ostreams. * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc: New test to validate deleted overloads of character and string inserters for wide ostreams. * libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc: New test to validate u8path invocations with sequences of char8_t. * libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc New test to validate u8path invocations with sequences of char8_t. Tom. commit b7eb4714cc2c999ce0491358fcbcebf4a8723185 Author: Tom Honermann Date: Sun Sep 15 22:25:28 2019 -0400 P1423R3 char8_t remediation: New tests This patch adds new tests to validate new deleted overloads of wchar_t, char8_t, char16_t, and char32_t for ordinary and wide formatted character and string ostream inserters. Additionally, new tests are added to validate invocations of u8path with sequences of char8_t for both the C++17 and filesystem TS implementations. 
diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc new file mode 100644 index 000..f2eb538f42e --- /dev/null +++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc @@ -0,0 +1,43 @@ +// Copyright (C) 2019 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +// 29.7.2 Header synopsys; deleted character inserters. + +// Test character inserters defined as deleted by P1423. + +// { dg-options "-std=gnu++2a" } +// { dg-do compile { target c++2a } } + +#include + +void test_character_inserters(std::ostream &os) +{ + os << 'x'; // ok. + os << L'x'; // { dg-error "use of deleted function" } + os << u8'x'; // { dg-error "use of deleted function" } + os << u'x'; // { dg-error "use of deleted function" } + os << U'x'; // { dg-error "use of deleted function" } +} + +void test_string_inserters(std::ostream &os) +{ + os << "text"; // ok. 
+ os << L"text"; // { dg-error "use of deleted function" } + os << u8"text"; // { dg-error "use of deleted function" } + os << u"text"; // { dg-error "use of deleted function" } + os << U"text"; // { dg-error "use of deleted function" } +} diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc new file mode 100644 index 000..1422a01aab3 --- /dev/null +++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc @@ -0,0 +1,43 @@ +// Copyright (C) 2019 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY
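Outside the DejaGnu harness, the effect of the deleted inserters can be probed with a detection trait. The sketch below assumes C++17 or later for std::void_t; the can_insert trait and the wide_char_insertable function are hypothetical names, not part of the patch. Whether wide_char_insertable() reports false depends on the language mode and on the library implementing P1423, so the sketch only asserts the cases that hold everywhere.

```cpp
#include <cassert>
#include <ostream>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when `os << value` is well-formed for std::ostream.
// Selecting a deleted overload makes the expression ill-formed, so deleted
// inserters show up here as "not insertable".
template<typename T, typename = void>
struct can_insert : std::false_type {};

template<typename T>
struct can_insert<T, std::void_t<decltype(std::declval<std::ostream&>()
                                          << std::declval<T>())>>
  : std::true_type {};

// Ordinary characters and strings stay insertable in every mode.
static_assert(can_insert<char>::value, "char is insertable");
static_assert(can_insert<const char*>::value, "const char* is insertable");

// Mode dependent: true where wchar_t still promotes to an integer inserter,
// false where the deleted overload from P1423 is in effect.
bool wide_char_insertable() { return can_insert<wchar_t>::value; }
```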
[PATCH 0/9]: C++ P0482R5 char8_t implementation
This series of patches provides an implementation of the core language and library changes for C++ proposal P0482R5 [1]. These changes are believed to be complete with the exception of the proposed mbrtoc8() and c8rtomb() functions (the expectation is that the C library will provide mbrtoc8() and c8rtomb(); future patches will address that support and integration).

These changes do not impact default gcc behavior. A new -fchar8_t option is provided to enable the P0482R5 changes, and -fno-char8_t is provided to explicitly disable them.

Patch 1: Documentation updates
Patch 2: Core language support
Patch 3: New core language tests
Patch 4: Updates to existing core language tests
Patch 5: Standard library support
Patch 6: A small correction to a common testsuite header file
Patch 7: New standard library tests
Patch 8: Updates to existing standard library tests
Patch 9: Updates to gdb pretty printing support

Tom.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0482r5.html
[PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates
This patch adds documentation for new -fchar8_t and -fno-char8_t options. gcc/ChangeLog: 2018-11-04 Tom Honermann * doc/invoke.texi (-fchar8_t): Document new option. Tom. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 57491f1033c..cd3a2a715db 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -206,7 +206,7 @@ in the following sections. @item C++ Language Options @xref{C++ Dialect Options,,Options Controlling C++ Dialect}. @gccoptlist{-fabi-version=@var{n} -fno-access-control @gol --faligned-new=@var{n} -fargs-in-order=@var{n} -fcheck-new @gol +-faligned-new=@var{n} -fargs-in-order=@var{n} -fchar8_t -fcheck-new @gol -fconstexpr-depth=@var{n} -fconstexpr-loop-limit=@var{n} @gol -fno-elide-constructors @gol -fno-enforce-eh-specs @gol @@ -2432,6 +2432,53 @@ but few users will need to override the default of This flag is enabled by default for @option{-std=c++17}. +@item -fchar8_t +@itemx -fno-char8_t +@opindex fchar8_t +@opindex fno-char8_t +Enable support for the P0482 proposal including the addition of a +new @code{char8_t} fundamental type, changes to the types of UTF-8 +string and character literals, new signatures for user defined +literals, and new specializations of standard library class templates +@code{std::numeric_limits}, @code{std::char_traits}, +and @code{std::hash}. + +This option enables functions to be overloaded for ordinary and UTF-8 +strings: + +@smallexample +int f(const char *);// #1 +int f(const char8_t *); // #2 +int v1 = f("text"); // Calls #1 +int v2 = f(u8"text"); // Calls #2 +@end smallexample + +and introduces new signatures for user defined literals: + +@smallexample +int operator""_udl1(char8_t); +int v3 = u8'x'_udl1; +int operator""_udl2(const char8_t*, std::size_t); +int v4 = u8"text"_udl2; +template int operator""_udl3(); +int v5 = u8"text"_udl3; +@end smallexample + +The change to the types of UTF-8 string and character literals introduces +incompatibilities with ISO C++11 and later standards. 
For example, the +following code is well-formed under ISO C++11, but is ill-formed when +@option{-fchar8_t} is specified. + +@smallexample +char ca[] = u8"text"; // error: char-array initialized from wide string +const char *cp = u8"text"; // error: invalid conversion from 'const char8_t*' to 'const char*' +int f(const char*); +auto v = f(u8"text"); // error: invalid conversion from 'const char8_t*' to 'const char*' +std::string s1@{u8"text"@}; // error: no matching function for call to 'std::basic_string::basic_string()' +using namespace std::literals; +std::string s2 = u8"text"s; // error: conversion from 'basic_string' to non-scalar type 'basic_string' requested +@end smallexample + @item -fcheck-new @opindex fcheck-new Check that the pointer returned by @code{operator new} is non-null
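Code that has to build both with and without char8_t support can sidestep the incompatibilities documented above by funnelling u8 literals through a small adapter. The following is a sketch under that assumption; as_char and make_utf8_string are hypothetical helper names, not something the patch or the documentation provides. It relies on the __cpp_char8_t feature test macro introduced by this series.

```cpp
#include <cassert>
#include <string>

#ifdef __cpp_char8_t
// With char8_t enabled, u8 literals are arrays of const char8_t. Objects may
// be examined through char, so viewing the code units as char is permitted.
inline const char* as_char(const char8_t* s)
{
  return reinterpret_cast<const char*>(s);
}
#endif

// Without char8_t, u8 literals are already arrays of const char.
inline const char* as_char(const char* s) { return s; }

// Compiles in both modes, unlike `std::string s{u8"text"};`.
std::string make_utf8_string()
{
  return std::string{as_char(u8"text")};
}
```

The adapter keeps the UTF-8 bytes unchanged; it only changes the static type seen by std::string.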
[PATCH 2/9]: C++ P0482R5 char8_t: Core language support
This patch adds support for the P0482R5 core language changes. This includes: - The -fchar8_t and -fno-char8_t command line options. - char8_t as a keyword. - The char8_t builtin type as a non-aliasing unsigned integral character type of size 1. - Use of char8_t as a simple type specifier. - u8 character literals with type char8_t. - u8 string literals with type array of const char8_t. - User defined literal operators that accept char8_t and char8_t pointer types. - New __cpp_char8_t predefined feature test macro. - New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macros. - Name mangling and demangling for char8_t (using Du). gcc/ChangeLog: 2018-11-04 Tom Honermann * defaults.h: Define CHAR8_TYPE. gcc/c-family/ChangeLog: 2018-11-04 Tom Honermann * c-family/c-common.c (c_common_reswords): Add char8_t. (fix_string_type): Use char8_t for the type of u8 string literals. (c_common_get_alias_set): char8_t doesn't alias. (c_common_nodes_and_builtins): Define char8_t as a builtin type in C++. (c_stddef_cpp_builtins): Add __CHAR8_TYPE__. (keyword_begins_type_specifier): Add RID_CHAR8. * gcc/c-family/c-common.h (rid): Add RID_CHAR8. (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE. Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS. Define char8_type_node and char8_array_type_node. * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine __GCC_ATOMIC_CHAR8_T_LOCK_FREE. (c_cpp_builtins): Predefine __cpp_char8_t. * c-family/c-lex.c (lex_string): Use char8_array_type_node as the type of CPP_UTF8STRING. (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR. * c-family/c.opt: Add the -fchar8_t command line option. gcc/c/ChangeLog: 2018-11-04 Tom Honermann * c/c-typeck.c (char_type_p): Add char8_type_node. (digest_init): Handle initialization by a u8 string literal of char8_t type. gcc/cp/ChangeLog: 2018-11-04 Tom Honermann * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
* cp/decl.c (grokdeclarator): Handle invalid type specifier combinations involving char8_t. * cp/lex.c (init_reswords): Add char8_t as a reserved word. * cp/mangle.c (write_builtin_type): Add name mangling for char8_t (Du). * cp/parser.c (cp_keyword_starts_decl_specifier_p, cp_parser_simple_type_specifier): Recognize char8_t as a simple type specifier. (cp_parser_string_literal): Use char8_array_type_node for the type of CPP_UTF8STRING. (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system headers. * cp/rtti.c (emit_support_tinfos): type_info support for char8_t. * cp/tree.c (char_type_p): Recognize char8_t as a character type. * cp/typeck.c (string_conv_p): Handle conversions of u8 string literals of char8_t type. (check_literal_operator_args): Handle UDLs with u8 string literals of char8_t type. * cp/typeck2.c (digest_init_r): Disallow initializing a char array with a u8 string literal. libiberty/ChangeLog: 2018-10-31 Tom Honermann * cp-demangle.c (cplus_demangle_builtin_types, cplus_demangle_type): Add name demangling for char8_t (Du). * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the new char8_t type. Tom. diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index f10cf89c3a7..c7d88eb9a22 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -79,6 +79,7 @@ machine_mode c_default_pointer_mode = VOIDmode; tree signed_char_type_node; tree wchar_type_node; + tree char8_type_node; tree char16_type_node; tree char32_type_node; @@ -128,6 +129,11 @@ machine_mode c_default_pointer_mode = VOIDmode; tree wchar_array_type_node; + Type `char8_t[SOMENUMBER]' or something like it. + Used when a UTF-8 string literal is created. + + tree char8_array_type_node; + Type `char16_t[SOMENUMBER]' or something like it. Used when a UTF-16 string literal is created. 
@@ -450,6 +456,7 @@ const struct c_common_resword c_common_reswords[] = { "case", RID_CASE, 0 }, { "catch", RID_CATCH, D_CXX_OBJC | D_CXXWARN }, { "char", RID_CHAR, 0 }, + { "char8_t", RID_CHAR8, D_CXX_CHAR8_T_FLAGS | D_CXXWARN }, { "char16_t", RID_CHAR16, D_CXXONLY | D_CXX11 | D_CXXWARN }, { "char32_t", RID_CHAR32, D_CXXONLY | D_CXX11 | D_CXXWARN }, { "class", RID_CLASS, D_CXX_OBJC | D_CXXWARN }, @@ -746,6 +753,11 @@ fix_string_type (tree value) nchars = length; e_type = char_type_node; } + else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node) +{ + nchars = length / (TYPE_PRECISION (char8_type_node) / BITS_PER_UNIT); + e_type = char8_type_node; +} else if (TREE_TYPE (value) == char16_array_type_node) { nchars = length / (TYPE_P
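The core language properties listed in this patch description (a distinct, unsigned, one-byte character type, and u8 literals typed accordingly) can be checked from user code. The sketch below assumes C++17 or later, since it uses u8 character literals, and guards the char8_t-specific checks with the __cpp_char8_t macro; the function name u8_char_literal_is_char8_t is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

#ifdef __cpp_char8_t
// These checks only apply when char8_t support is enabled.
static_assert(sizeof(char8_t) == 1, "char8_t occupies one byte");
static_assert(std::is_unsigned<char8_t>::value, "char8_t is unsigned");
static_assert(!std::is_same<char8_t, unsigned char>::value,
              "char8_t is a distinct type, not an alias");
static_assert(std::is_same<decltype(u8"x"), const char8_t (&)[2]>::value,
              "u8 string literals are arrays of const char8_t");
#endif

// Reports whether u8 character literals currently have type char8_t.
constexpr bool u8_char_literal_is_char8_t()
{
#ifdef __cpp_char8_t
  return std::is_same<decltype(u8'x'), char8_t>::value;
#else
  return false;  // without char8_t support the literal has type char
#endif
}
```

In either mode a u8 character literal has size 1, so only its type changes between the two configurations.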
[PATCH 3/9]: C++ P0482R5 char8_t: New core language tests
This patch adds new tests to exercise new behavior for when support for char8_t is enabled as well as protect against unintended behavioral impact when support for char8_t is not enabled. In some cases, existing tests suffice to exercise existing behavior and such tests have been cloned to validate behavior when char8_t is enabled. In other cases, tests are added to validate behavior both when char8_t support is and is not enabled. gcc/testsuite/ChangeLog: 2018-11-04 Tom Honermann * g++.dg/cpp0x/udlit-implicit-conv-neg-char8_t.C: New test cloned from udlit-implicit-conv-neg.C. Validates handling of ill-formed uses of char8_t based user defined literals. * g++.dg/cpp0x/udlit-resolve-char8_t.C: New test cloned from udlit-resolve.C. Validates handling of well-formed uses of char8_t based user defined literals. * g++.dg/ext/char8_t-aliasing-1.C: New test; validates warnings for type punning with char8_t types. Illustrates that char8_t does not alias. * g++.dg/ext/char8_t-char-literal-1.C: New test; validates u8 character literals have type char if char8_t support is not enabled. * g++.dg/ext/char8_t-char-literal-2.C: New test; validates u8 character literals have type char8_t if char8_t support is enabled. * g++.dg/ext/char8_t-deduction-1.C: New test; validates char is deduced for u8 character and string literals if char8_t support is not enabled. * g++.dg/ext/char8_t-deduction-2.C: New test; validates char8_t is deduced for u8 character and string literals if char8_t support is enabled. * g++.dg/ext/char8_t-feature-test-macro-1.C: New test; validates that the __cpp_char8_t feature test macro is not defined if char8_t support is not enabled. * g++.dg/ext/char8_t-feature-test-macro-2.C: New test; validates that the __cpp_char8_t feature test macro is defined with the correct value if char8_t support is enabled. * g++.dg/ext/char8_t-init-1.C: New test; validates initialization by u8 character and string literals when support for char8_t is not enabled. 
* g++.dg/ext/char8_t-init-2.C: New test; validates initialization by u8 character and string literals when support for char8_t is enabled. * g++.dg/ext/char8_t-keyword-1.C: New test; validates that char8_t is not a keyword if support for char8_t is not enabled. * g++.dg/ext/char8_t-keyword-2.C: New test; validates that char8_t is a keyword if support for char8_t is enabled. * g++.dg/ext/char8_t-limits-1.C: New test; validates that char8_t is unsigned and sufficiently large to store the required range of char8_t values. * g++.dg/ext/char8_t-overload-1.C: New test; validates overload resolution for u8 character and string literal arguments when support for char8_t is not enabled. * g++.dg/ext/char8_t-overload-2.C: New test; validates overload resolution for u8 character and string literal arguments when support for char8_t is enabled. * g++.dg/ext/char8_t-predefined-macros-1.C: New test; validates that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macros are not defined when support for char8_t is not enabled. * g++.dg/ext/char8_t-predefined-macros-2.C: New test; validates that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macros are defined when support for char8_t is enabled. * g++.dg/ext/char8_t-sizeof-1.C: New test; validates that the size of char8_t and u8 character literals is 1. * g++.dg/ext/char8_t-specialization-1.C: New test; validate template specialization for u8 character literal template arguments when support for char8_t is not enabled. * g++.dg/ext/char8_t-specialization-2.C: New test; validate template specialization for char8_t and u8 character literal template arguments when support for char8_t is enabled. * g++.dg/ext/char8_t-string-literal-1.C: New test; validate the type of u8 string literals when support for char8_t is not enabled. * g++.dg/ext/char8_t-string-literal-2.C: New test; validate the type of u8 string literals when support for char8_t is enabled. 
* g++.dg/ext/char8_t-type-specifier-1.C: New test; validate that char8_t is not recognized as a type specifier when support for char8_t is not enabled. * g++.dg/ext/char8_t-type-specifier-2.C: New test; validate that char8_t is recognized as a type specifier when support for char8_t is enabled. * g++.dg/ext/char8_t-typedef-1.C: New test; validate declarations of char8_t as a typedef are accepted when support for char8_t is not enabled. * g++.dg/ext/char8_t-typedef-2.C: New test; validate declarations of char8_t as a typedef are not accepted when support for char8_t is enabled. * g++.dg/ext/char8_t-udl-1.C: New test; validates overloading for u8
[PATCH 6/9]: C++ P0482R5 char8_t: A small correction to a common testsuite header file
This patch corrects ambiguous partial specializations of typelist::detail::append_. Previously, neither append_, Typelist_Chain> nor append_ was a better match for append_, null_type>. libstdc++-v3/ChangeLog: 2018-11-04 Tom Honermann * include/ext/typelist.h: Constrained a partial specialization of typelist::detail::append_ to only match chain. Tom. diff --git a/libstdc++-v3/include/ext/typelist.h b/libstdc++-v3/include/ext/typelist.h index b21f01ffb43..2cdbc3efafa 100644 --- a/libstdc++-v3/include/ext/typelist.h +++ b/libstdc++-v3/include/ext/typelist.h @@ -215,10 +215,10 @@ namespace detail typedef Typelist_Chain type; }; - template -struct append_ + template +struct append_, null_type> { - typedef Typelist_Chain type; + typedef chain type; }; template<>
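The kind of ambiguity being corrected can be reproduced with a reduced typelist. This is a simplified sketch, not the contents of include/ext/typelist.h: the template parameter names and the append_works constant are hypothetical, and only the overall shape of the specializations follows the description above. The key point is that the specialization for a null_type second argument is constrained to chain first arguments, so it is strictly more specialized than the general chain case.

```cpp
#include <cassert>
#include <type_traits>

struct null_type {};

template<typename Hd, typename Tl>
struct chain { using head = Hd; using tail = Tl; };

template<typename First, typename Second>
struct append_;

// General case: walk the first chain and append the second at its end.
template<typename Hd, typename Tl, typename Second>
struct append_<chain<Hd, Tl>, Second>
{ using type = chain<Hd, typename append_<Tl, Second>::type>; };

// Empty first list: the result is the second list.
template<typename Second>
struct append_<null_type, Second>
{ using type = Second; };

// Constrained to chain<Hd, Tl>. Written as append_<First, null_type> it would
// be neither more nor less specialized than the general case above.
template<typename Hd, typename Tl>
struct append_<chain<Hd, Tl>, null_type>
{ using type = chain<Hd, Tl>; };

template<>
struct append_<null_type, null_type>
{ using type = null_type; };

using one = chain<int, null_type>;
using two = chain<char, null_type>;

// Exercises both the previously ambiguous case and the general case.
constexpr bool append_works =
  std::is_same<append_<one, null_type>::type, one>::value
  && std::is_same<append_<one, two>::type,
                  chain<int, chain<char, null_type>>>::value;
```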
[PATCH 5/9]: C++ P0482R5 char8_t: Standard library support
This patch adds support to libstdc++ for the P0482R5 standard library changes. This includes: - New char8_t based specializations: - std::numeric_limits - std::char_traits - std::hash - std::hash - std::hash - std::codecvt - std::codecvt - std::codecvt_byname - std::codecvt_byname - New char8_t overloads: - u8string operator "" s(const char8_t* str, size_t len); - u8string_view operator""sv(const char8_t* str, size_t len); - New type aliases: - std::u8string - std::u8string_view - std::atomic_char8_t - Changed function signatures: - filesystem::path::u8string() returns u8string. - filesystem::path::generic_u8string() returns u8string. - typeinfo for char8_t. - New macros: - __cpp_lib_char8_t - ATOMIC_CHAR8_T_LOCK_FREE For types and templates that existed in an experimental form prior to standardization, both the experimental and standardized variants have been updated. The updates to the experimental versions are optional. I'm not very familiar with how ABI versioning is done and I'm not confident that the changes in the .ver files are correct. In particular, I'm unsure as to whether a CXXABI_3.0 section may be needed in gnu-versioned-namespace.ver and whether I'm correct in adding a new CXXABI_1.3.12 section in gnu.ver. If I'm not mistaken, CXXABI has not already been bumped for gcc 9, so needs to be, but GLIBCXX has already been bumped and therefore does not need to be. gcc/cp/ChangeLog: 2018-11-04 Tom Honermann * name-lookup.c (get_std_name_hint): Added u8string as a name hint. libstdc++-v3/ChangeLog: 2018-11-04 Tom Honermann * config/abi/pre/gnu-versioned-namespace.ver (CXXABI_2.0): Add typeinfo symbols for char8_t. * config/abi/pre/gnu.ver: Add CXXABI_1.3.12. (GLIBCXX_3.4.26): Add symbols for specializations of numeric_limits and codecvt that involve char8_t. (CXXABI_1.3.12): Add typeinfo symbols for char8_t. * include/bits/atomic_base.h: Add atomic_char8_t. * include/bits/basic_string.h: Add std::hash and operator""s(const char8_t*, size_t). 
* include/bits/c++config: Define _GLIBCXX_USE_CHAR8_T and __cpp_lib_char8_t. * include/bits/char_traits.h: Add char_traits. * include/bits/codecvt.h: Add codecvt, codecvt, codecvt_byname, and codecvt_byname. * include/bits/cpp_type_traits.h: Add __is_integer to recognize char8_t as an integral type. * include/bits/fs_path.h: (path::__is_encoded_char): Recognize char8_t. (path::u8string): Return std::u8string when char8_t support is enabled. (path::generic_u8string): Likewise. (path::_S_convert): Handle conversion from char8_t input. (path::_S_str_convert): Likewise. * include/bits/functional_hash.h: Add hash. * include/bits/locale_conv.h (__str_codecvt_out): Add overloads for char8_t. * include/bits/locale_facets.h (_GLIBCXX_NUM_UNICODE_FACETS): Bump for new char8_t specializations. * include/bits/localefwd.h: Add missing declarations of codecvt and codecvt. Add char8_t declarations codecvt and codecvt. * include/bits/postypes.h: Add u8streampos * include/bits/stringfwd.h: Add declarations of char_traits and u8string. * include/c_global/cstddef: Add __byte_operand. * include/experimental/bits/fs_path.h (path::__is_encoded_char): Recognize char8_t. (path::u8string): Return std::u8string when char8_t support is enabled. (path::generic_u8string): Likewise. (path::_S_convert): Handle conversion from char8_t input. (path::_S_str_convert): Likewise. * include/experimental/string: Add u8string. * include/experimental/string_view: Add u8string_view, hash, and operator""sv(const char8_t*, size_t). * include/std/atomic: Add atomic and atomic_char8_t. * include/std/charconv (__is_int_to_chars_type): Recognize char8_t as a character type. * include/std/limits: Add numeric_limits. * include/std/string_view: Add u8string_view, hash, and operator""sv(const char8_t*, size_t). * include/std/type_traits: Add __is_integral_helper, __make_unsigned, and __make_signed. * libsupc++/atomic_lockfree_defines.h: Define ATOMIC_CHAR8_T_LOCK_FREE. 
* src/c++11/Makefile.am: Compile with -fchar8_t when compiling codecvt.cc and limits.cc so that char8_t specializations of numeric_limits and codecvt are emitted.
* src/c++11/Makefile.in: Likewise.
* src/c++11/codecvt.cc: Define members of codecvt, codecvt, codecvt_byname, and codecvt_byname.
* src/c++11/limits.cc: Define members of numeric_limits.
* src/c++98/Makefile.am: Compile with -fchar8_t when compiling locale_init.cc and localename.cc.
* src/c++98/Makefile.in: Likewise.
* src/c++98/locale_init.cc: Add initialization f
[PATCH 4/9]: C++ P0482R5 char8_t: Updates to existing core language tests
This patch updates existing testing gaps related to support for u8 character and string literals. None of these changes exercise new char8_t functionality; they are intended to guard against regressions in behavior of u8 literals when support for char8_t is not enabled. gcc/testsuite/ChangeLog: 2018-11-04 Tom Honermann * c-c++-common/raw-string-13.c: Added test cases for u8 raw string literals. * c-c++-common/raw-string-15.c: Likewise. * g++.dg/cpp0x/constexpr-wstring2.C: Added test cases for u8 literals. * g++.dg/ext/utf-array-short-wchar.C: Likewise. * g++.dg/ext/utf-array.C: Likewise. * g++.dg/ext/utf-cxx98.C: Likewise. * g++.dg/ext/utf-dflt.C: Likewise. * g++.dg/ext/utf-gnuxx98.C: Likewise. * gcc.dg/utf-array-short-wchar.c: Likewise. * gcc.dg/utf-array.c: Likewise. Tom. diff --git a/gcc/testsuite/c-c++-common/raw-string-13.c b/gcc/testsuite/c-c++-common/raw-string-13.c index 1b37405cee9..fa11edaa7aa 100644 --- a/gcc/testsuite/c-c++-common/raw-string-13.c +++ b/gcc/testsuite/c-c++-common/raw-string-13.c @@ -62,6 +62,47 @@ const char s16[] = R"??(??)??"; const char s17[] = R"?(?)??)?"; const char s18[] = R"??(??)??)??)??"; +const char u800[] = u8R"??=??()??'??!??-\ +(a)#[{}]^|~"; +)??=??"; +const char u801[] = u8R"a( +)\ +a" +)a"; +const char u802[] = u8R"a( +)a\ +" +)a"; +const char u803[] = u8R"ab( +)a\ +b" +)ab"; +const char u804[] = u8R"a??/(x)a??/"; +const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??"; +const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/"; +const char u807[] = u8R"abc(??)\ +abc";)abc"; +const char u808[] = u8R"def(de)\ +def";)def"; +const char u809[] = u8R"a(??)\ +a" +)a"; +const char u810[] = u8R"a(??)a\ +" +)a"; +const char u811[] = u8R"ab(??)a\ +b" +)ab"; +const char u812[] = u8R"a#(a#)a??=)a#"; +const char u813[] = u8R"a#(??)a??=??)a#"; +const char u814[] = u8R"??/(x)??/ +";)??/"; +const char u815[] = u8R"??/(??)??/ +";)??/"; +const char u816[] = u8R"??(??)??"; +const char u817[] = u8R"?(?)??)?"; +const char u818[] = 
u8R"??(??)??)??)??"; + const char16_t u00[] = uR"??=??()??'??!??-\ (a)#[{}]^|~"; )??=??"; @@ -211,6 +252,25 @@ main (void) TEST (s16, "??"); TEST (s17, "?)??"); TEST (s18, "??"")??"")??"); + TEST (u800, u8"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n"); + TEST (u801, u8"\n)\\\na\"\n"); + TEST (u802, u8"\n)a\\\n\"\n"); + TEST (u803, u8"\n)a\\\nb\"\n"); + TEST (u804, u8"x"); + TEST (u805, u8"abc"); + TEST (u806, u8"abc"); + TEST (u807, u8"??"")\\\nabc\";"); + TEST (u808, u8"de)\\\ndef\";"); + TEST (u809, u8"??"")\\\na\"\n"); + TEST (u810, u8"??"")a\\\n\"\n"); + TEST (u811, u8"??"")a\\\nb\"\n"); + TEST (u812, u8"a#)a??""="); + TEST (u813, u8"??"")a??""=??"); + TEST (u814, u8"x)??""/\n\";"); + TEST (u815, u8"??"")??""/\n\";"); + TEST (u816, u8"??"); + TEST (u817, u8"?)??"); + TEST (u818, u8"??"")??"")??"); TEST (u00, u"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n"); TEST (u01, u"\n)\\\na\"\n"); TEST (u02, u"\n)a\\\n\"\n"); diff --git a/gcc/testsuite/c-c++-common/raw-string-15.c b/gcc/testsuite/c-c++-common/raw-string-15.c index 9dfdaabd87d..1d101dc8393 100644 --- a/gcc/testsuite/c-c++-common/raw-string-15.c +++ b/gcc/testsuite/c-c++-common/raw-string-15.c @@ -62,6 +62,47 @@ const char s16[] = R"??(??)??"; const char s17[] = R"?(?)??)?"; const char s18[] = R"??(??)??)??)??"; +const char u800[] = u8R"??=??()??'??!??-\ +(a)#[{}]^|~"; +)??=??"; +const char u801[] = u8R"a( +)\ +a" +)a"; +const char u802[] = u8R"a( +)a\ +" +)a"; +const char u803[] = u8R"ab( +)a\ +b" +)ab"; +const char u804[] = u8R"a??/(x)a??/"; +const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??"; +const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/"; +const char u807[] = u8R"abc(??)\ +abc";)abc"; +const char u808[] = u8R"def(de)\ +def";)def"; +const char u809[] = u8R"a(??)\ +a" +)a"; +const char u810[] = u8R&quo
[PATCH 8/9]: C++ P0482R5 char8_t: Updates to existing standard library tests
This patch augments existing tests to validate behavior for char8_t. In all cases, added test cases are cloned from existing tests for wchar_t or char16_t. A few tests required updates to line numbers for diagnostic messages. libstdc++-v3/ChangeLog: 2018-11-04 Tom Honermann * testsuite/18_support/byte/ops.cc: Validate std::to_integer, std::to_integer, and std::to_integer. * testsuite/18_support/numeric_limits/dr559.cc: Validate std::numeric_limits. * testsuite/18_support/numeric_limits/lowest.cc: Validate std::numeric_limits::lowest(). * testsuite/18_support/numeric_limits/max_digits10.cc: Validate std::numeric_limits::max_digits10. * testsuite/18_support/type_info/fundamental.cc: Validate typeinfo for char8_t. * testsuite/20_util/from_chars/1_neg.cc: Validate std::from_chars with char8_t. * testsuite/20_util/hash/requirements/explicit_instantiation.cc: Validate explicit instantiation of std::hash. * testsuite/20_util/is_integral/value.cc: Validate std::is_integral. * testsuite/20_util/make_signed/requirements/typedefs-4.cc: Validate std::make_signed. * testsuite/21_strings/basic_string/cons/char/deduction.cc: Validate u8string construction from char8_t sources. * testsuite/21_strings/basic_string_view/operations/compare/ char/70483.cc: Validate substr operations on u8string_view. * testsuite/21_strings/basic_string_view/typedefs.cc: Validate that the u8string_view typedef is defined. * testsuite/21_strings/char_traits/requirements/ constexpr_functions.cc: Validate char_traits constexpr member functions. * testsuite/21_strings/char_traits/requirements/ constexpr_functions_c++17.cc: Validate char_traits C++17 constexpr member functions. * testsuite/21_strings/headers/string/types_std_c++0x.cc: Validate that the u8string typedef is defined. * testsuite/22_locale/locale/cons/unicode.cc: Validate the presence of the std::codecvt and std::codecvt facets. * testsuite/29_atomics/atomic/cons/assign_neg.cc: Update line numbers. 
* testsuite/29_atomics/atomic/cons/copy_neg.cc: Likewise. * testsuite/29_atomics/atomic_integral/cons/assign_neg.cc: Likewise. * testsuite/29_atomics/atomic_integral/cons/copy_neg.cc: Likewise. * testsuite/29_atomics/atomic_integral/is_always_lock_free.cc: Validate std::atomic::is_always_lock_free * testsuite/29_atomics/atomic_integral/operators/bitwise_neg.cc: Update line numbers. * testsuite/29_atomics/atomic_integral/operators/decrement_neg.cc: Likewise. * testsuite/29_atomics/atomic_integral/operators/increment_neg.cc: Likewise. * testsuite/29_atomics/headers/atomic/macros.cc: Validate ATOMIC_CHAR8_T_LOCK_FREE and added a missing error message for ATOMIC_CHAR16_T_LOCK_FREE. * testsuite/29_atomics/headers/atomic/types_std_c++0x.cc: Validate std::atomic_char8_t. * testsuite/29_atomics/headers/atomic/types_std_c++0x_neg.cc: Validate atomic_char8_t. * testsuite/experimental/string_view/typedefs.cc: Validate that the u8string_view typedef is defined. * testsuite/util/testsuite_common_types.h (integral_types, integral_types_gnu, atomic_integrals_no_bool, atomic_integrals): Add char8_t to the typelist chains of integral types. Tom. diff --git a/libstdc++-v3/testsuite/18_support/byte/ops.cc b/libstdc++-v3/testsuite/18_support/byte/ops.cc index 6f2755eb0a5..dfbaa8b2efa 100644 --- a/libstdc++-v3/testsuite/18_support/byte/ops.cc +++ b/libstdc++-v3/testsuite/18_support/byte/ops.cc @@ -15,7 +15,7 @@ // with this library; see the file COPYING3. If not see // <http://www.gnu.org/licenses/>. 
-// { dg-options "-std=gnu++17" } +// { dg-options "-std=gnu++17 -fchar8_t" } // { dg-do compile { target c++17 } } #include @@ -218,7 +218,13 @@ constexpr bool test_to_integer(unsigned char c) static_assert( test_to_integer(0) ); static_assert( test_to_integer(255) ); +static_assert( test_to_integer(0) ); static_assert( test_to_integer(255) ); static_assert( test_to_integer(0) ); static_assert( test_to_integer(255) ); - +static_assert( test_to_integer(0) ); +static_assert( test_to_integer(255) ); +static_assert( test_to_integer(0) ); +static_assert( test_to_integer(255) ); +static_assert( test_to_integer(0) ); +static_assert( test_to_integer(255) ); diff --git a/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc b/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc index 150db958807..f72b265dc77 100644 --- a/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc +++ b/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc @@ -1,4 +1,5 @@ // { dg-do run { target c++11 } } +// { dg-options "-fchar8_t" } // 2010-02-17 Paolo Carlini // @@ -84,6 +85,9 @@ int main() do_test(); do_test(); do_test(); +#ifdef _GLIBC
[PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests
This patch adds new tests for char8_t standard library features. Most of these tests were cloned from existing tests that exercise char16_t and adapted for char8_t. Only testsuite/experimental/feat-char8_t.cc and testsuite/ext/char8_t/atomic-1.cc are net new tests. libstdc++-v3/ChangeLog: 2018-11-04 Tom Honermann * testsuite/18_support/numeric_limits/char8_t.cc: New test cloned from char16_32_t.cc; validates numeric_limits. * testsuite/21_strings/basic_string/literals/types-char8_t.cc: New test cloned from types.cc; validates operator""s for char8_t returns u8string. * testsuite/21_strings/basic_string/literals/values-char8_t.cc: New test cloned from values.cc; validates construction and comparison of u8string values. * testsuite/21_strings/basic_string/requirements/ /explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of basic_string. * testsuite/21_strings/basic_string_view/literals/types-char8_t.cc: New test cloned from types.cc; validates operator""sv for char8_t returns u8string_view. * testsuite/21_strings/basic_string_view/literals/ values-char8_t.cc: New test cloned from values.cc; validates construction and comparison of u8string_view values. * testsuite/21_strings/basic_string_view/requirements/ explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of basic_string_view. * testsuite/21_strings/char_traits/requirements/char8_t/65049.cc: New test cloned from char16_t/65049.cc; validates that char_traits is not vulnerable to the concerns in PR65049. * testsuite/21_strings/char_traits/requirements/char8_t/ typedefs.cc: New test cloned from char16_t/typedefs.cc; validates that char_traits member typedefs are present and correct. * testsuite/21_strings/char_traits/requirements/ explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of char_traits. 
* testsuite/22_locale/codecvt/char16_t-char8_t.cc: New test cloned from char16_t.cc: validates codecvt. * testsuite/22_locale/codecvt/char32_t-char8_t.cc: New test cloned from char32_t.cc: validates codecvt. * testsuite/22_locale/codecvt/utf8-char8_t.cc: New test cloned from utf8.cc; validates codecvt and codecvt. * testsuite/27_io/filesystem/path/native/string-char8_t.cc: New test cloned from string.cc; validates filesystem::path construction from char8_t input. * testsuite/experimental/feat-char8_t.cc: New test; validates that the __cpp_lib_char8_t feature test macro is defined with the correct value. * testsuite/experimental/filesystem/path/native/string-char8_t.cc: New test cloned from string.cc; validates filesystem::path construction from char8_t input. * testsuite/experimental/string_view/literals/types-char8_t.cc: New test cloned from types.cc; validates operator""sv for char8_t returns u8string_view. * testsuite/experimental/string_view/literals/values-char8_t.cc: New test cloned from values.cc; validates construction and comparison of u8string_view values. * testsuite/experimental/string_view/requirements/ explicit_instantiation/char8_t/1.cc: New test cloned from char16_t/1.cc; validates explicit instantiation of basic_string_view. * testsuite/ext/char8_t/atomic-1.cc: New test; validates that ATOMIC_CHAR8_T_LOCK_FREE is not defined if char8_t support is not enabled. Tom. diff --git a/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc new file mode 100644 index 000..346463d7244 --- /dev/null +++ b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc @@ -0,0 +1,71 @@ +// { dg-do run { target c++11 } } +// { dg-require-cstdint "" } +// { dg-options "-fchar8_t" } + +// Copyright (C) 2017 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. 
This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +#include +#include +#include + +// Test specializations for char8_t. +template + void + do_test() + { +
[PATCH 9/9]: C++ P0482R5 char8_t: Updates to gdb pretty printing support
This patch adds recognition of the u8string and u8string_view type aliases to the gdb pretty printer extension. libstdc++-v3/ChangeLog: 2018-11-04 Tom Honermann * python/libstdcxx/v6/printers.py (register_type_printers): Add type printers for u8string and u8string_view. * testsuite/libstdc++-prettyprinters/whatis.cc: Validate recognition of u8string. Tom. diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py index 827c87b70ea..f9e638e210d 100644 --- a/libstdc++-v3/python/libstdcxx/v6/printers.py +++ b/libstdc++-v3/python/libstdcxx/v6/printers.py @@ -1554,7 +1554,7 @@ def register_type_printers(obj): return # Add type printers for typedefs std::string, std::wstring etc. -for ch in ('', 'w', 'u16', 'u32'): +for ch in ('', 'w', 'u8', 'u16', 'u32'): add_one_type_printer(obj, 'basic_string', ch + 'string') add_one_type_printer(obj, '__cxx11::basic_string', ch + 'string') # Typedefs for __cxx11::basic_string used to be in namespace __cxx11: @@ -1604,7 +1604,7 @@ def register_type_printers(obj): # Add type printers for experimental::basic_string_view typedefs. ns = 'experimental::fundamentals_v1::' -for ch in ('', 'w', 'u16', 'u32'): +for ch in ('', 'w', 'u8', 'u16', 'u32'): add_one_type_printer(obj, ns + 'basic_string_view', ns + ch + 'string_view') diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc index 90f3994314b..d74bf7c5e9b 100644 --- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc @@ -1,5 +1,5 @@ // { dg-do run { target c++11 } } -// { dg-options "-g -O0" } +// { dg-options "-g -O0 -fchar8_t" } // { dg-skip-if "" { *-*-* } { "-D_GLIBCXX_PROFILE" } } // Copyright (C) 2011-2018 Free Software Foundation, Inc. 
@@ -130,6 +130,9 @@ holder cregex_token_iterator_holder; std::sregex_token_iterator *sregex_token_iterator_ptr; holder sregex_token_iterator_holder; // { dg-final { whatis-test sregex_token_iterator_holder "holder" } } +std::u8string *u8string_ptr; +holder u8string_holder; +// { dg-final { whatis-test u8string_holder "holder" } } std::u16string *u16string_ptr; holder u16string_holder; // { dg-final { whatis-test u16string_holder "holder" } } @@ -240,6 +243,8 @@ main() placeholder(&cregex_token_iterator_holder); placeholder(&sregex_token_iterator_ptr); placeholder(&sregex_token_iterator_holder); + placeholder(&u8string_ptr); + placeholder(&u8string_holder); placeholder(&u16string_ptr); placeholder(&u16string_holder); placeholder(&u32string_ptr);
Re: [REVISED PATCH 2/9]: C++ P0482R5 char8_t: Core language support
On 1/14/19 2:58 PM, Jason Merrill wrote:
> On 12/23/18 9:27 PM, Tom Honermann wrote:
>> Attached is a revised patch that addresses changes in P0482R6 as well as feedback provided by Jason. Changes from the prior patch include:
>> - Updated the value of the __cpp_char8_t feature test macro to 201811 per P0482R6.
>> - Enable char8_t support with -std=c++2a per adoption of P0482R6 in San Diego.
>> - Reverted the unnecessary changes to gcc/gcc/c/c-typeck.c as requested by Jason.
>> - Removed unnecessary checks of 'flag_char8_t' within the C++ front end as requested by Jason.
>> - Corrected the regression spotted by Jason regarding initialization of signed char and unsigned char arrays with string literals.
>> - Made minor changes to the error message emitted for ill-formed initialization of char arrays with UTF-8 string literals. These changes do not yet implement Jason's suggestion; I'll follow up with a separate patch for that due to additional test impact.
>>
>> Tested on x86_64-linux.
>
> I just applied the compiler changes with small modifications, as follows; thank you very much for the patches. Jonathan should check in the library portion before long.
>
> Jason

Excellent, thank you, Jason!

Tom.
Re: PATCH: Updated error messages for ill-formed cases of array initialization by string literal
On 1/4/19 7:25 PM, Martin Sebor wrote:
> On 12/27/18 1:49 PM, Tom Honermann wrote:
>> As requested by Jason in the review of the P0482 (char8_t) core language changes, this patch includes updates to the error messages emitted for ill-formed cases of array initialization with a string literal. With these changes, error messages that previously looked something like these:
>> - "char-array initialized from wide string"
>> - "wide character array initialized from non-wide string"
>> - "wide character array initialized from incompatible wide string"
>> now look like:
>> - "cannot initialize array of type 'char' from a string literal with type array of 'short unsigned int'"
>
> The first word "type" doesn't quite work here. The type of every array is "array of T" where T is the type of the element, so for instance, "array of char." Saying "array of type X" makes it sound like X is the type of the whole array, which is of course not the case when X is char. I think you want to use the same wording as for the second type:
>
>   "cannot initialize array of 'char' from a string literal with type array of 'short unsigned int'"
>
> or perhaps even better
>
>   "cannot initialize array of 'char' from a string literal with type 'char16_t[N]'"
>
> (i.e., show the actual type of the string, including its bound).
>
> Martin

Thank you for the feedback, Martin; sorry for the delayed response. I'll follow up with a revised patch within the next week or two.

Tom.
Re: [REVISED PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates
On 1/4/19 7:40 PM, Martin Sebor wrote:
> On 12/23/18 7:27 PM, Tom Honermann wrote:
>> Attached is a revised patch that addresses feedback provided by Jason and Sandra. Changes from the prior patch include:
>> - Updates to the -fchar8_t option documentation as requested by Jason.
>> - Corrections for indentation, spacing, hyphenation, and wrapping as requested by Sandra.
>
> Just a minor nit that backticks in code examples should be avoided (per the TexInfo manual, they can cause trouble when copying code from PDF readers):
>
> +@smallexample
> +char ca[] = u8"xx"; // error: char-array initialized from wide
> + // string
> +const char *cp = u8"xx";// error: invalid conversion from
> + // `const char8_t*' to `const char*'
>
> Martin

Thanks for catching that, Martin. Patch relative to trunk (r267930) attached to correct this (Jason already committed the original change).

Tom.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 267930)
+++ gcc/doc/invoke.texi (working copy)
@@ -2468,16 +2468,16 @@
 char ca[] = u8"xx"; // error: char-array initialized from wide
 //string
 const char *cp = u8"xx";// error: invalid conversion from
-//`const char8_t*' to `const char*'
+//'const char8_t*' to 'const char*'
 int f(const char*);
 auto v = f(u8"xx"); // error: invalid conversion from
-//`const char8_t*' to `const char*'
+//'const char8_t*' to 'const char*'
 std::string s@{u8"xx"@}; // error: no matching function for call to
-//`std::basic_string::basic_string()'
+//'std::basic_string::basic_string()'
 using namespace std::literals;
 s = u8"xx"s;// error: conversion from
-//`basic_string' to non-scalar
-//type `basic_string' requested
+//'basic_string' to non-scalar
+//type 'basic_string' requested
 @end smallexample
 @item -fcheck-new
Re: [REVISED PATCH 2/9]: C++ P0482R5 char8_t: Core language support
On 1/15/19 1:51 AM, Christophe Lyon wrote:
> On Mon, 14 Jan 2019 at 20:59, Jason Merrill wrote:
>> On 12/23/18 9:27 PM, Tom Honermann wrote:
>>> Attached is a revised patch that addresses changes in P0482R6 as well as feedback provided by Jason. Changes from the prior patch include:
>>> - Updated the value of the __cpp_char8_t feature test macro to 201811 per P0482R6.
>>> - Enable char8_t support with -std=c++2a per adoption of P0482R6 in San Diego.
>>> - Reverted the unnecessary changes to gcc/gcc/c/c-typeck.c as requested by Jason.
>>> - Removed unnecessary checks of 'flag_char8_t' within the C++ front end as requested by Jason.
>>> - Corrected the regression spotted by Jason regarding initialization of signed char and unsigned char arrays with string literals.
>>> - Made minor changes to the error message emitted for ill-formed initialization of char arrays with UTF-8 string literals. These changes do not yet implement Jason's suggestion; I'll follow up with a separate patch for that due to additional test impact.
>>>
>>> Tested on x86_64-linux.
>>
>> I just applied the compiler changes with small modifications, as follows; thank you very much for the patches. Jonathan should check in the library portion before long.
>>
>> Jason
>
> Hi,
>
> The new testcase g++.dg/ext/utf-cvt-char8_t.C fails at least on arm and aarch64:
> g++.dg/ext/utf-cvt-char8_t.C -std=gnu++14 (test for warnings, line 24)
> g++.dg/ext/utf-cvt-char8_t.C -std=gnu++17 (test for warnings, line 24)
>
> Christophe

Arm and aarch64 have unsigned char by default, so the warning ("conversion to 'char' from 'char8_t' may change the sign of the result") isn't emitted on those platforms. I presume adding '-fsigned-char' to the options for the test would be a sufficient fix? If so, a patch is attached.

Tom.

Index: gcc/testsuite/g++.dg/ext/utf-cvt-char8_t.C
===
--- gcc/testsuite/g++.dg/ext/utf-cvt-char8_t.C (revision 267930)
+++ gcc/testsuite/g++.dg/ext/utf-cvt-char8_t.C (working copy)
@@ -1,7 +1,7 @@
 /* Contributed by Kris Van Hees */
 /* Test the char8_t promotion rules. */
 /* { dg-do compile { target c++11 } } */
-/* { dg-options "-fchar8_t -Wall -Wconversion -Wsign-conversion -Wsign-promo" } */
+/* { dg-options "-fchar8_t -fsigned-char -Wall -Wconversion -Wsign-conversion -Wsign-promo" } */
 extern void f_c (char);
 extern void fsc (signed char);
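
The dependence on the signedness of plain char can be illustrated with a small self-contained sketch. The helper names below (plain_char_is_signed, becomes_negative_as_char) are made up for illustration and are not part of the patch or the test; the sketch only shows that an 8-bit code unit with its high bit set turns negative when converted to plain char exactly when plain char is signed, which is the situation the expected warning relies on.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Illustrative helper (not from the patch): true when plain char is a
// signed type for the current target and options (e.g. -fsigned-char).
constexpr bool plain_char_is_signed()
{
  return std::numeric_limits<char>::is_signed;
}

// Illustrative helper: converts an unsigned 8-bit code unit to plain char
// and reports whether the result is negative.
inline bool becomes_negative_as_char(unsigned char code_unit)
{
  return static_cast<char>(code_unit) < 0;
}

// Self-check: the sign can only change where plain char is signed.
inline void check_char_signedness()
{
  assert(plain_char_is_signed() == (CHAR_MIN < 0));
  assert(becomes_negative_as_char(0xC3) == plain_char_is_signed());
  assert(!becomes_negative_as_char(0x41));
}
```

Building this with and without -fsigned-char on a target where plain char defaults to unsigned should flip the result of plain_char_is_signed(), while the self-check holds either way.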
Re: [REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support
On 2/7/19 4:44 AM, Jonathan Wakely wrote:
> On 23/12/18 21:27 -0500, Tom Honermann wrote:
>> Attached is a revised patch that addresses changes in P0482R6. Changes from the prior patch include:
>> - Updated the value of the __cpp_char8_t feature test macro to 201811.
>>
>> Tested on x86_64-linux.
>
> Thanks, Tom, this is great work!
>
> The front-end changes for char8_t went in recently, and I'm finally ready to commit the library parts.

Great!

> There's one big problem I found in this patch, which is that the new numeric_limits specialization uses constexpr unconditionally. That fails if <limits> is compiled using options like -std=c++98 -fno-char8_t because the specialization will be used, but the constexpr keyword isn't allowed. That's easily fixed by replacing the keyword with _GLIBCXX_CONSTEXPR.

Hmm, the code for the char8_t specialization was copied from the char16_t specialization which also uses constexpr unconditionally (but is guarded by a C++11+ requirement). The char8_t specialization must be elided when the compiler is invoked with -std=c++98 -fno-char8_t (since the char8_t type doesn't exist then). The _GLIBCXX_USE_CHAR8_T guard doesn't suffice for this? _GLIBCXX_USE_CHAR8_T should only be defined if __cpp_char8_t is defined; and that should only be defined if -fchar8_t or -std=c++2a is specified.

Or perhaps you intended -std=c++98 -fchar8_t? I agree in that case that use of _GLIBCXX_CONSTEXPR is necessary.

> The other way to solve that problem would be for the compiler to give an error if -fchar8_t is used with C++98, but I see no fundamental reason that combination of options shouldn't be allowed. We can support it in the library by using the macro.

Agreed.

> As discussed in San Diego, the other change needed is to add the abi_tag attribute to the new versions of path::u8string and path::generic_u8string, so that the mangling is different when its return type is different:
>
>   #ifdef _GLIBCXX_USE_CHAR8_T
>     __attribute__((__abi_tag__("__u8")))
>     std::u8string u8string() const;
>   #else
>     std::string u8string() const;
>   #endif // _GLIBCXX_USE_CHAR8_T
>
> Otherwise we get ODR violations when linking objects compiled with -fchar8_t enabled to objects with it disabled (e.g. linking -std=c++17 objects to -std=c++2a objects, which needs to work).

Are ODR violations bad? :)

> I suggest "__u8" as the name of the ABI tag, but I'm open to other suggestions. "__char8_t" is a bit long and verbose. "__cxx20" would be consistent with "__cxx11" used for the new ABI introduced in GCC 5 but it regularly confuses people who think it is coupled to the -std=c++11 option (and so don't understand why they still see it for -std=c++14).

I have no preference or alternative suggestions here. Had I recognized the issue, I would have asked you what to do about it :)

> Also, I see that you've made changes to <experimental/string_view> (to add the experimental::u8string_view typedef) and to std::experimental::path (to change the return type of u8string and generic_u8string).
>
> The former change is fairly harmless; it only adds a typedef, albeit one which is not a reserved name in C++14/C++17 and so should be available for users to define as a macro. Maybe prior to C++2a we should only define it when GNU extensions are enabled (i.e. when using -std=gnu++14 not -std=c++14):
>
>   #if defined _GLIBCXX_USE_CHAR8_T \
>     && (__cplusplus > 201703L || !defined __STRICT_ANSI__)
>     using u8string_view = basic_string_view<char8_t>;
>   #endif

That makes sense.

> Changing the return type of experimental::path members concerns me more. That's a published TS which is not going to be revised, and it's not obvious to me that users would want the change in semantics. If somebody is still using the Filesystem TS in C++2a code, they're probably not expecting it to change. If they need to update their code for C++2a they might as well just use std::filesystem, and so having char8_t support in std::experimental::filesystem isn't clearly useful.

I agree. I added the support to the experimental implementations more out of a desire to be complete and to remove any potential barriers to use of -fchar8_t than because I felt the changes were really necessary. I would be perfectly fine with skipping the updates to the experimental libraries completely.

Tom.
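
The abi_tag suggestion in the review can be tried outside libstdc++ with a minimal sketch. Everything named demo_path, u8_length and check_u8_length below is hypothetical and only mirrors the shape of the declaration quoted above. The sketch keys on the compiler's __cpp_char8_t macro rather than the libstdc++-internal _GLIBCXX_USE_CHAR8_T, and it assumes a compiler that accepts the GNU __attribute__ syntax (GCC or Clang).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// demo_path is a hypothetical stand-in for filesystem::path; only the
// form of the u8string() declaration follows the review comment.
struct demo_path
{
  std::string bytes;  // assumed to already hold UTF-8 encoded text

#ifdef __cpp_char8_t
  // With char8_t enabled the return type differs, so the member carries
  // an ABI tag and its mangled name differs from the untagged variant.
  __attribute__((__abi_tag__("__u8")))
  std::basic_string<char8_t> u8string() const
  { return std::basic_string<char8_t>(bytes.begin(), bytes.end()); }
#else
  std::string u8string() const
  { return bytes; }
#endif
};

// Works with either return type because both provide size().
inline std::size_t u8_length(const demo_path& p)
{
  return p.u8string().size();
}

// Self-check usable in both char8_t and non-char8_t builds.
inline void check_u8_length()
{
  demo_path p{"abc"};
  assert(u8_length(p) == 3);
}
```

Compiling this once with -std=c++17 and once with -std=c++2a and comparing the symbol names (for example with nm -C) should show the tagged and untagged variants as distinct symbols, which is what lets objects from both modes link together without clashing.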
Re: [REVISED PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests
On 2/7/19 4:54 AM, Jonathan Wakely wrote:
> On 23/12/18 21:27 -0500, Tom Honermann wrote:
>> Attached is a revised patch that addresses changes in P0482R6. Changes from the prior patch include:
>> - Updated the value of the __cpp_char8_t feature test macro to 201811.
>>
>> Tested on x86_64-linux.
>
> There are quite a few additional changes needed to make the testsuite pass cleanly with non-default options, e.g. when running it with RUNTESTFLAGS=--target_board=unix/-fchar8_t/-fno-inline I see these failures:

I remember thinking that I had to deal with this at one point. It seems I then forgot about it.

> FAIL: 21_strings/basic_string/literals/types.cc (test for excess errors)
> FAIL: 21_strings/basic_string/literals/values.cc (test for excess errors)
> UNRESOLVED: 21_strings/basic_string/literals/values.cc compilation failed to produce executable
> FAIL: 21_strings/basic_string_view/literals/types.cc (test for excess errors)
> FAIL: 21_strings/basic_string_view/literals/values.cc (test for excess errors)
> UNRESOLVED: 21_strings/basic_string_view/literals/values.cc compilation failed to produce executable
> FAIL: 22_locale/codecvt/char16_t.cc (test for excess errors)
> UNRESOLVED: 22_locale/codecvt/char16_t.cc compilation failed to produce executable
> FAIL: 22_locale/codecvt/char32_t.cc (test for excess errors)
> UNRESOLVED: 22_locale/codecvt/char32_t.cc compilation failed to produce executable
> FAIL: 22_locale/codecvt/codecvt_utf8/79980.cc (test for excess errors)
> UNRESOLVED: 22_locale/codecvt/codecvt_utf8/79980.cc compilation failed to produce executable
> FAIL: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc (test for excess errors)
> UNRESOLVED: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc compilation failed to produce executable
> FAIL: 22_locale/codecvt/utf8.cc (test for excess errors)
> UNRESOLVED: 22_locale/codecvt/utf8.cc compilation failed to produce executable
> FAIL: 22_locale/conversions/string/2.cc (test for excess errors)
> UNRESOLVED: 22_locale/conversions/string/2.cc compilation failed to produce executable
> FAIL: 22_locale/conversions/string/3.cc (test for excess errors)
> UNRESOLVED: 22_locale/conversions/string/3.cc compilation failed to produce executable
> FAIL: experimental/string_view/literals/types.cc (test for excess errors)
> FAIL: experimental/string_view/literals/values.cc (test for excess errors)
> UNRESOLVED: experimental/string_view/literals/values.cc compilation failed to produce executable
>
> There would be similar errors running all the tests with -std=c++2a, which is definitely something I do often and so want the tests to be clean.

Absolutely, agreed.

> We can either disable those tests when char8_t is enabled (because we already have alternative tests checking the char8_t versions of string_view etc.) or make them work either way, which the attached patch begins doing (more changes are needed).

Since most of these tests exercise functionality that is not u8/char8_t specific, I think we should make them work.

> I expect a different set of failures for -fno-char8_t (which is probably a less important case to support than enabling char8_t in older standards, but maybe still worth testing now and then).

I'm not sure it is less important. -fno-char8_t may be an important tool for some code bases during their initial testing of, and migration to, C++20.

Tom.
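
One way a test can be made to "work either way" is to let the string type follow the element type of a u8 literal. The sketch below is only an illustration of that idea under stated assumptions: u8_char, u8_string, make_greeting and check_greeting are hypothetical names, it is not taken from the patch Jonathan attached, and it keys on __cpp_char8_t rather than on any libstdc++-internal macro.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// u8_char is an illustrative alias (not a libstdc++ name): the element
// type of a u8 string literal under the current compiler options.
#ifdef __cpp_char8_t
using u8_char = char8_t;
#else
using u8_char = char;
#endif

using u8_string = std::basic_string<u8_char>;

// Compiles with and without char8_t support because the string type
// follows the type of the literal.
inline u8_string make_greeting()
{
  return u8_string(u8"hello");
}

// Self-check that does not depend on which element type is in effect.
inline void check_greeting()
{
  const u8_string s = make_greeting();
  assert(s.size() == 5);
  assert(s[0] == u8"h"[0]);
}
```

The same source should then build with -fchar8_t, -fno-char8_t, or -std=c++2a; a test that instead hard-codes std::string for a u8 literal is the kind that fails once the literal's type becomes const char8_t[N].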
[PATCH 0/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics
This change addresses the following issue raised on the libc-alpha mailing list: https://sourceware.org/pipermail/libc-alpha/2022-July/140825.html Glibc 2.36 adds a char8_t typedef in C++ modes that do not enable the char8_t builtin type (C++17 and earlier by default; subject to _GNU_SOURCE and use of the -f[no-]char8_t option). When -Wc++20-compat diagnostics are enabled, the following warning is issued from the glibc uchar.h header. warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat] Such diagnostics are not desired from system headers, so glibc would like to suppress the diagnostic using '#pragma GCC diagnostic ignored "-Wc++20-compat"', but attempting to do so currently fails. This patch corrects that. Tom Honermann (1): c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics. gcc/c-family/c-opts.cc | 7 +++ gcc/c-family/c.opt | 2 +- gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 + libcpp/include/cpplib.h| 4 libcpp/init.cc | 1 + 6 files changed, 42 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C -- 2.32.0
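
The kind of header code this fix is meant to support can be sketched as follows. This is an assumption-laden illustration, not the actual glibc uchar.h contents: the demo namespace, the utf8_unit alias and check_utf8_unit are hypothetical, and the typedef is skipped when the compiler already provides char8_t as a builtin type (signalled by __cpp_char8_t).

```cpp
#include <cassert>

namespace demo {

// Stand-in for a system header that wants to declare char8_t in modes
// where it is not a keyword, without triggering -Wc++20-compat.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
#ifndef __cpp_char8_t
typedef unsigned char char8_t;  // identifier becomes a keyword in C++20
#endif
typedef char8_t utf8_unit;      // alias usable outside the pragma region
#pragma GCC diagnostic pop

} // namespace demo

// Self-check: the code unit type is one byte wide and unsigned in both
// the typedef case and the builtin case.
inline void check_utf8_unit()
{
  static_assert(sizeof(demo::utf8_unit) == 1, "UTF-8 code unit must be 1 byte");
  assert(static_cast<demo::utf8_unit>(0xFF) == 0xFF);
}
```

With a compiler that lacks the fix, building this in C++17 mode with -Wc++20-compat would still be expected to report the keyword warning on the typedef despite the pragma; with the fix applied the region should be silent.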
[PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.
Gcc's '#pragma GCC diagnostic' directives are processed in "early mode" (see handle_pragma_diagnostic_early) for the C++ frontend and, as such, require that the target diagnostic option be enabled for the preprocessor (see c_option_is_from_cpp_diagnostics). This change modifies the -Wc++20-compat option definition to register it as a preprocessor option so that its associated diagnostics can be suppressed. The changes also implicitly disable the option in C++20 and later modes. These changes are consistent with the definition of the -Wc++11-compat option. This support is motivated by the need to suppress the following diagnostic otherwise issued in C++17 and earlier modes due to the char8_t typedef present in the uchar.h header file in glibc 2.36. warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat] Tests are added to validate suppression of both -Wc++11-compat and -Wc++20-compat related diagnostics (fixes were only needed for the C++20 case). Fixes https://gcc.gnu.org/PR106423. gcc/c-family/ChangeLog: * c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics in C++20 and later. * c.opt (Wc++20-compat): Enable hooks for the preprocessor. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/keywords2.C: New test. * g++.dg/cpp2a/keywords2.C: New test. libcpp/ChangeLog: * include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT. * init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat. 
--- gcc/c-family/c-opts.cc | 7 +++ gcc/c-family/c.opt | 2 +- gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 + libcpp/include/cpplib.h| 4 libcpp/init.cc | 1 + 6 files changed, 42 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index b9f01a65ed7..1ea37ba9742 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename) else if (warn_narrowing == -1) warn_narrowing = 0; + if (cxx_dialect >= cxx20) +{ + /* Don't warn about C++20 compatibility changes in C++20 or later. */ + warn_cxx20_compat = 0; + cpp_opts->cpp_warn_cxx20_compat = 0; +} + /* C++17 has stricter evaluation order requirements; let's use some of them for earlier C++ as well, so chaining works as expected. */ if (c_dialect_cxx () diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 44e1a60ce24..dfdebd596ef 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -455,7 +455,7 @@ Wc++2a-compat C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented Wc++20-compat -C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) +C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT) Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020. Wc++11-extensions diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C new file mode 100644 index 000..d67d01e31ed --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C @@ -0,0 +1,16 @@ +// { dg-do compile { target c++98_only } } +// { dg-options "-Wc++11-compat" } + +// Validate suppression of -Wc++11-compat diagnostics. 
+#pragma GCC diagnostic ignored "-Wc++11-compat" +int alignof; +int alignas; +int constexpr; +int decltype; +int noexcept; +int nullptr; +int static_assert; +int thread_local; +int _Alignas; +int _Alignof; +int _Thread_local; diff --git a/gcc/testsuite/g++.dg/cpp2a/keywords2.C b/gcc/testsuite/g++.dg/cpp2a/keywords2.C new file mode 100644 index 000..8714a7b26b7 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/keywords2.C @@ -0,0 +1,13 @@ +// { dg-do compile { target c++17_down } } +// { dg-options "-Wc++20-compat" } + +// Validate suppression of -Wc++20-compat diagnostics. +#pragma GCC diagnostic ignored "-Wc++20-compat" +int constinit; +int consteval; +int requires; +int concept; +int co_await; +int co_yield; +int co_return; +int char8_t; diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h index 3eba6f74b57..9d90c18e4f2 100644 --- a/libcpp/include/cpplib.h +++ b/libcpp/include/cpplib.h @@ -547,6 +547,9 @@ struct cpp_options /* True if warn about differences between C++98 and C++11. */ bool cpp_warn_cxx11_compat; + /* True if warn about differences between C++17 and C++20. */ + bool cpp_warn_cxx20_compat; + /* Nonzero if bidirectional control characters checking is on. See enum cpp_bidirectional_level. */ unsigned char cpp_warn_bidirectional; @@ -655,6 +658,7 @@ enum cpp_warning_reason { CPP_W_C90_C99_COMPAT, CPP_W_C11_C2X_COMPAT, CPP_W_CXX11_COMPAT, +
[PATCH 0/3] Implement C2X N2653 (char8_t) and correct UTF-8 character literal type in preprocessor directives for C++
This patch series provides an implementation and tests for the WG14 N2653 paper as adopted for C2X. Additionally, a fix is included for the C++ preprocessor to treat UTF-8 character literals in preprocessor directives as an unsigned type in char8_t enabled modes (in C++17 and earlier with -fchar8_t or in C++20 or later without -fno-char8_t). Tom Honermann (3): C: Implement C2X N2653 char8_t and UTF-8 string literal changes testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes c++/106426: Treat u8 character literals as unsigned in char8_t modes. gcc/c-family/c-lex.cc | 13 -- gcc/c-family/c-opts.cc| 5 ++- gcc/c/c-parser.cc | 16 ++- gcc/c/c-typeck.cc | 2 +- gcc/ginclude/stdatomic.h | 8 .../g++.dg/ext/char8_t-char-literal-1.C | 6 ++- .../g++.dg/ext/char8_t-char-literal-2.C | 4 ++ .../atomic/c2x-stdatomic-lockfree-char8_t.c | 42 +++ .../atomic/gnu2x-stdatomic-lockfree-char8_t.c | 5 +++ gcc/testsuite/gcc.dg/c2x-predefined-macros.c | 11 + gcc/testsuite/gcc.dg/c2x-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++ .../gcc.dg/gnu2x-predefined-macros.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str.c | 34 +++ libcpp/charset.cc | 4 +- libcpp/include/cpplib.h | 4 +- libcpp/init.cc| 1 + 18 files changed, 191 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c -- 2.32.0
[PATCH 3/3] c++/106426: Treat u8 character literals as unsigned in char8_t modes.
This patch corrects handling of UTF-8 character literals in preprocessing directives so that they are treated as unsigned types in char8_t enabled C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously, UTF-8 character literals were always treated as having the same type as ordinary character literals (signed or unsigned dependent on target or use of the -fsigned-char or -funsigned-char options). Fixes https://gcc.gnu.org/PR106426. gcc/c-family/ChangeLog: * c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char subject to -fchar8_t, -fsigned-char, and/or -funsigned-char. gcc/testsuite/ChangeLog: * g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals. * g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals. libcpp/ChangeLog: * charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR literals based on unsigned_utf8char. * include/cpplib.h (cpp_options): Add unsigned_utf8char. * init.cc (cpp_create_reader): Initialize unsigned_utf8char. --- gcc/c-family/c-opts.cc| 1 + gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +- gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 libcpp/charset.cc | 4 ++-- libcpp/include/cpplib.h | 4 ++-- libcpp/init.cc| 1 + 6 files changed, 15 insertions(+), 5 deletions(-) diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index 108adc5caf8..02ce1e86cdb 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename) /* char8_t support is implicitly enabled in C++20 and C2X. */ if (flag_char8_t == -1) flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x; + cpp_opts->unsigned_utf8char = flag_char8_t ? 
1 : cpp_opts->unsigned_char; if (flag_extern_tls_init) { diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C index 8ed85ccfdcd..2994dd38516 100644 --- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C +++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C @@ -1,6 +1,6 @@ // Test that UTF-8 character literals have type char if -fchar8_t is not enabled. // { dg-do compile } -// { dg-options "-std=c++17 -fno-char8_t" } +// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" } template struct is_same @@ -10,3 +10,7 @@ template { static const bool value = true; }; static_assert(is_same::value, "Error"); + +#if u8'\0' - 1 > 0 +#error "UTF-8 character literals not signed in preprocessor" +#endif diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C index 7861736689c..db4fe70046d 100644 --- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C +++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C @@ -10,3 +10,7 @@ template { static const bool value = true; }; static_assert(is_same::value, "Error"); + +#if u8'\0' - 1 < 0 +#error "UTF-8 character literals not unsigned in preprocessor" +#endif diff --git a/libcpp/charset.cc b/libcpp/charset.cc index ca8b7cf7aa5..12e31632228 100644 --- a/libcpp/charset.cc +++ b/libcpp/charset.cc @@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str, /* Multichar constants are of type int and therefore signed. 
*/ if (i > 1) unsigned_p = 0; - else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus)) -unsigned_p = 1; + else if (type == CPP_UTF8CHAR) +unsigned_p = CPP_OPTION (pfile, unsigned_utf8char); else unsigned_p = CPP_OPTION (pfile, unsigned_char); diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h index 3eba6f74b57..f9c042db034 100644 --- a/libcpp/include/cpplib.h +++ b/libcpp/include/cpplib.h @@ -581,8 +581,8 @@ struct cpp_options ints and target wide characters, respectively. */ size_t precision, char_precision, int_precision, wchar_precision; - /* True means chars (wide chars) are unsigned. */ - bool unsigned_char, unsigned_wchar; + /* True means chars (wide chars, UTF-8 chars) are unsigned. */ + bool unsigned_char, unsigned_wchar, unsigned_utf8char; /* True if the most significant byte in a word has the lowest address in memory. */ diff --git a/libcpp/init.cc b/libcpp/init.cc index f4ab83d2145..0242da5f55c 100644 --- a/libcpp/init.cc +++ b/libcpp/init.cc @@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table, CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int); CPP_OPTION (pfile, unsigned_char) = 0; CPP_OPTION (pfile, unsigned_wchar) = 1; + CPP_OPTION (pfile, unsigned_utf8char) = 1; CPP_OPTION (pfile, bytes_big_endian) = 1; /* does not matter */ /* Default to no charset conversion. */ -- 2.32.0
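Editor's sketch of the signedness decision this patch changes, written as a small standalone model rather than as GCC code. The struct and function names are hypothetical; the fields mirror the cpp_options members named in the diff, and the multi-character constant case (always signed) is left out for brevity.

```cpp
// Simplified model of how libcpp picks the signedness of a single-character
// literal in a preprocessing directive. Not GCC source.
enum class CharLiteralKind { Ordinary, Utf8 };

struct PreprocessorOptions {
    bool cplusplus;          // preprocessing C++ rather than C
    bool unsigned_char;      // plain char is unsigned (-funsigned-char or target default)
    bool unsigned_utf8char;  // field added by this patch
};

// Mirrors the assignment added to c_common_post_options:
// unsigned_utf8char = flag_char8_t ? 1 : unsigned_char.
PreprocessorOptions make_options(bool cplusplus, bool flag_char8_t,
                                 bool unsigned_char) {
    return {cplusplus, unsigned_char, flag_char8_t ? true : unsigned_char};
}

// Behavior before the patch: u8 literals are unsigned only when not C++.
bool is_unsigned_before(CharLiteralKind kind, const PreprocessorOptions& o) {
    if (kind == CharLiteralKind::Utf8 && !o.cplusplus) return true;
    return o.unsigned_char;
}

// Behavior after the patch: u8 literals follow unsigned_utf8char.
bool is_unsigned_after(CharLiteralKind kind, const PreprocessorOptions& o) {
    if (kind == CharLiteralKind::Utf8) return o.unsigned_utf8char;
    return o.unsigned_char;
}
```

For C++ with char8_t enabled and a signed plain char, the "before" function reports a signed u8 literal while the "after" function reports an unsigned one, which is the mismatch PR106426 describes.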
[PATCH 1/3] C: Implement C2X N2653 char8_t and UTF-8 string literal changes
This patch implements the core language and compiler dependent library changes adopted for C2X via WG14 N2653. The changes include: - Change of type for UTF-8 string literals from array of const char to array of const char8_t (unsigned char). - A new atomic_char8_t typedef. - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro. gcc/ChangeLog: * ginclude/stdatomic.h (atomic_char8_t, ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro. gcc/c/ChangeLog: * c-parser.c (c_parser_string_literal): Use char8_t as the type of CPP_UTF8STRING when char8_t support is enabled. * c-typeck.c (digest_init): Allow initialization of an array of character type by a string literal with type array of char8_t. gcc/c-family/ChangeLog: * c-lex.c (lex_string, lex_charconst): Use char8_t as the type of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is enabled. * c-opts.c (c_common_post_options): Set flag_char8_t if targeting C2x. --- gcc/c-family/c-lex.cc| 13 + gcc/c-family/c-opts.cc | 4 ++-- gcc/c/c-parser.cc| 16 ++-- gcc/c/c-typeck.cc| 2 +- gcc/ginclude/stdatomic.h | 8 5 files changed, 34 insertions(+), 9 deletions(-) diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc index 8bfa4f4024f..0b6f94e18a8 100644 --- a/gcc/c-family/c-lex.cc +++ b/gcc/c-family/c-lex.cc @@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); + if (type == CPP_UTF8STRING && flag_char8_t) + { + value = build_string (TYPE_PRECISION (char8_type_node) + / TYPE_PRECISION (char_type_node), + ""); /* char8_t is 8 bits */ + } + else + value = build_string (1, ""); break; case CPP_STRING16: value = build_string (TYPE_PRECISION (char16_type_node) @@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token) type = char16_type_node; else if (token->type == CPP_UTF8CHAR) { - if (!c_dialect_cxx ()) - type = unsigned_char_type_node; 
- else if (flag_char8_t) + if (flag_char8_t) type = char8_type_node; else type = char_type_node; diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index b9f01a65ed7..108adc5caf8 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename) if (flag_sized_deallocation == -1) flag_sized_deallocation = (cxx_dialect >= cxx14); - /* char8_t support is new in C++20. */ + /* char8_t support is implicitly enabled in C++20 and C2X. */ if (flag_char8_t == -1) -flag_char8_t = (cxx_dialect >= cxx20); +flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x; if (flag_extern_tls_init) { diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 92049d1a101..fa9395986de 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); + if (type == CPP_UTF8STRING && flag_char8_t) + { + value = build_string (TYPE_PRECISION (char8_type_node) + / TYPE_PRECISION (char_type_node), + ""); /* char8_t is 8 bits */ + } + else + value = build_string (1, ""); break; case CPP_STRING16: value = build_string (TYPE_PRECISION (char16_type_node) @@ -7472,9 +7479,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) { default: case CPP_STRING: -case CPP_UTF8STRING: TREE_TYPE (value) = char_array_type_node; break; +case CPP_UTF8STRING: + if (flag_char8_t) + TREE_TYPE (value) = char8_array_type_node; + else + TREE_TYPE (value) = char_array_type_node; + break; case CPP_STRING16: TREE_TYPE (value) = char16_array_type_node; break; diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index fd0a7f81a7a..231f4e980b6 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -8045,7 +8045,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype, if (char_array) { - if (typ2 != char_type_node) + if (typ2 != char_type_node && typ2 
!= char8_type_node) incompat_string_cst = true; } else if (!comptypes (typ1, typ2)) diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h index bfcfdf664c7..75ed7965689 100644
[PATCH 2/3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes
This change provides new tests for the core language and compiler dependent library changes adopted for C2X via WG14 N2653. gcc/testsuite/ChangeLog: * gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/c2x-predefined-macros.c: New test. * gcc.dg/c2x-utf8str-type.c: New test. * gcc.dg/c2x-utf8str.c: New test. * gcc.dg/gnu2x-predefined-macros.c: New test. * gcc.dg/gnu2x-utf8str-type.c: New test. * gcc.dg/gnu2x-utf8str.c: New test. --- .../atomic/c2x-stdatomic-lockfree-char8_t.c | 42 +++ .../atomic/gnu2x-stdatomic-lockfree-char8_t.c | 5 +++ gcc/testsuite/gcc.dg/c2x-predefined-macros.c | 11 + gcc/testsuite/gcc.dg/c2x-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++ .../gcc.dg/gnu2x-predefined-macros.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str.c | 34 +++ 8 files changed, 142 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..37ea4c8926c --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,42 @@ +/* Test atomic_is_lock_free for char8_t. 
*/ +/* { dg-do run } */ +/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */ + +#include +#include + +extern void abort (void); + +_Atomic __CHAR8_TYPE__ ac8a; +atomic_char8_t ac8t; + +#define CHECK_TYPE(MACRO, V1, V2) \ + do \ +{ \ + int r1 = MACRO; \ + int r2 = atomic_is_lock_free (&V1); \ + int r3 = atomic_is_lock_free (&V2); \ + if (r1 != 0 && r1 != 1 && r1 != 2) \ + abort (); \ + if (r2 != 0 && r2 != 1) \ + abort (); \ + if (r3 != 0 && r3 != 1) \ + abort (); \ + if (r1 == 2 && r2 != 1) \ + abort (); \ + if (r1 == 2 && r3 != 1) \ + abort (); \ + if (r1 == 0 && r2 != 0) \ + abort (); \ + if (r1 == 0 && r3 != 0) \ + abort (); \ +} \ + while (0) + +int +main () +{ + CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..a017b134817 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,5 @@ +/* Test atomic_is_lock_free for char8_t with -std=gnu2x. */ +/* { dg-do run } */ +/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */ + +#include "c2x-stdatomic-lockfree-char8_t.c" diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c new file mode 100644 index 000..3456105563a --- /dev/null +++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c @@ -0,0 +1,11 @@ +/* Test C2X predefined macros. */ +/* { dg-do compile } */ +/* { dg-options "-std=c2x" } */ + +#if !defined(__CHAR8_TYPE__) +# error __CHAR8_TYPE__ is not defined! +#endif + +#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE) +# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined! 
+#endif diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str-type.c b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c new file mode 100644 index 000..1ae86955516 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c @@ -0,0 +1,6 @@ +/* Test C2X UTF-8 string literal type. */ +/* { dg-do compile } */ +/* { dg-options "-std=c2x" } */ + +_Static_assert (_Generic (u8"text", char*: 1, unsigned char*: 2) == 2, "UTF-8 string literals have an unexpected type"); +_Static_assert (_Generic (u8"x"[0], char: 1, unsigned char: 2) == 2, "UTF-8 string literal elements have an unexpected type"); diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str.c b/gcc/testsuite/gcc.dg/c2x-utf
Re: [PATCH 3/3 v2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.
This patch corrects handling of UTF-8 character literals in preprocessing directives so that they are treated as unsigned types in char8_t enabled C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously, UTF-8 character literals were always treated as having the same type as ordinary character literals (signed or unsigned dependent on target or use of the -fsigned-char or -funsigned-char options). PR preprocessor/106426 gcc/c-family/ChangeLog: * c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char subject to -fchar8_t, -fsigned-char, and/or -funsigned-char. gcc/testsuite/ChangeLog: * g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals. * g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals. libcpp/ChangeLog: * charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR literals based on unsigned_utf8char. * include/cpplib.h (cpp_options): Add unsigned_utf8char. * init.cc (cpp_create_reader): Initialize unsigned_utf8char. --- gcc/c-family/c-opts.cc| 1 + gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +- gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 libcpp/charset.cc | 4 ++-- libcpp/include/cpplib.h | 4 ++-- libcpp/init.cc| 1 + 6 files changed, 15 insertions(+), 5 deletions(-) diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index 108adc5caf8..02ce1e86cdb 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename) /* char8_t support is implicitly enabled in C++20 and C2X. */ if (flag_char8_t == -1) flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x; + cpp_opts->unsigned_utf8char = flag_char8_t ? 
1 : cpp_opts->unsigned_char; if (flag_extern_tls_init) { diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C index 8ed85ccfdcd..2994dd38516 100644 --- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C +++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C @@ -1,6 +1,6 @@ // Test that UTF-8 character literals have type char if -fchar8_t is not enabled. // { dg-do compile } -// { dg-options "-std=c++17 -fno-char8_t" } +// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" } template struct is_same @@ -10,3 +10,7 @@ template { static const bool value = true; }; static_assert(is_same::value, "Error"); + +#if u8'\0' - 1 > 0 +#error "UTF-8 character literals not signed in preprocessor" +#endif diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C index 7861736689c..db4fe70046d 100644 --- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C +++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C @@ -10,3 +10,7 @@ template { static const bool value = true; }; static_assert(is_same::value, "Error"); + +#if u8'\0' - 1 < 0 +#error "UTF-8 character literals not unsigned in preprocessor" +#endif diff --git a/libcpp/charset.cc b/libcpp/charset.cc index ca8b7cf7aa5..12e31632228 100644 --- a/libcpp/charset.cc +++ b/libcpp/charset.cc @@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str, /* Multichar constants are of type int and therefore signed. 
*/ if (i > 1) unsigned_p = 0; - else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus)) -unsigned_p = 1; + else if (type == CPP_UTF8CHAR) +unsigned_p = CPP_OPTION (pfile, unsigned_utf8char); else unsigned_p = CPP_OPTION (pfile, unsigned_char); diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h index 3eba6f74b57..f9c042db034 100644 --- a/libcpp/include/cpplib.h +++ b/libcpp/include/cpplib.h @@ -581,8 +581,8 @@ struct cpp_options ints and target wide characters, respectively. */ size_t precision, char_precision, int_precision, wchar_precision; - /* True means chars (wide chars) are unsigned. */ - bool unsigned_char, unsigned_wchar; + /* True means chars (wide chars, UTF-8 chars) are unsigned. */ + bool unsigned_char, unsigned_wchar, unsigned_utf8char; /* True if the most significant byte in a word has the lowest address in memory. */ diff --git a/libcpp/init.cc b/libcpp/init.cc index f4ab83d2145..0242da5f55c 100644 --- a/libcpp/init.cc +++ b/libcpp/init.cc @@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table, CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int); CPP_OPTION (pfile, unsigned_char) = 0; CPP_OPTION (pfile, unsigned_wchar) = 1; + CPP_OPTION (pfile, unsigned_utf8char) = 1; CPP_OPTION (pfile, bytes_big_endian) = 1; /* does not matter */ /* Default to no charset conversion. */ -- 2.32.0
Re: [PATCH 3/3] c++/106426: Treat u8 character literals as unsigned in char8_t modes.
On 7/25/22 2:05 PM, Andrew Pinski wrote: On Mon, Jul 25, 2022 at 11:01 AM Tom Honermann via Gcc-patches wrote: This patch corrects handling of UTF-8 character literals in preprocessing directives so that they are treated as unsigned types in char8_t enabled C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously, UTF-8 character literals were always treated as having the same type as ordinary character literals (signed or unsigned dependent on target or use of the -fsigned-char or -funsigned-char options). Fixes https://gcc.gnu.org/PR106426. The above mention of the PR # should just be: preprocessor/106426 And then when this patch gets committed, it will be recorded in bugzilla also. Thank you. I re-sent the patch with a revised subject line and commit message to reflect the component change in Bugzilla. Tom. Thanks, Andrew Pinski
Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.
On 7/27/22 7:09 PM, Joseph Myers wrote: On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote: Gcc's '#pragma GCC diagnostic' directives are processed in "early mode" (see handle_pragma_diagnostic_early) for the C++ frontend and, as such, require that the target diagnostic option be enabled for the preprocessor (see c_option_is_from_cpp_diagnostics). This change modifies the -Wc++20-compat option definition to register it as a preprocessor option so that its associated diagnostics can be suppressed. The changes also There are lots of C++ warning options, all of which should support pragma suppression regardless of whether they are relevant to the preprocessor or not. Do they all need this kind of handling, or is it only -Wc++20-compat that has some kind of problem? I had only checked -Wc++20-compat when working on the patch. I did some spot checking now and confirmed that suppression works as expected for C++ for at least the following warnings: -Wuninitialized -Warray-compare -Wbool-compare -Wtautological-compare -Wterminate I don't know the diagnostic framework well. As best I can tell, this issue is specific to the -Wc++20-compat option and when the particular diagnostic is issued (e.g., during lexing as opposed to during parsing). The following call chains appear to be relevant. cp_lexer_new_main -> cp_lexer_handle_early_pragma -> c_invoke_early_pragma_handler cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler (where * might be "declaration", "toplevel_declaration", "class_head", "objc_interstitial_code", ...) The -Wc++20-compat enabled warning regarding new keywords in C++20 is issued from cp_lexer_get_preprocessor_token. Tom.
Re: [PATCH 1/3] C: Implement C2X N2653 char8_t and UTF-8 string literal changes
On 7/27/22 7:20 PM, Joseph Myers wrote: On Mon, 25 Jul 2022, Tom Honermann via Gcc-patches wrote: diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h index bfcfdf664c7..75ed7965689 100644 --- a/gcc/ginclude/stdatomic.h +++ b/gcc/ginclude/stdatomic.h @@ -49,6 +49,10 @@ typedef _Atomic long atomic_long; typedef _Atomic unsigned long atomic_ulong; typedef _Atomic long long atomic_llong; typedef _Atomic unsigned long long atomic_ullong; +#if (defined(__CHAR8_TYPE__) \ + && (defined(_GNU_SOURCE) || defined(_ISOC2X_SOURCE))) +typedef _Atomic __CHAR8_TYPE__ atomic_char8_t; +#endif typedef _Atomic __CHAR16_TYPE__ atomic_char16_t; typedef _Atomic __CHAR32_TYPE__ atomic_char32_t; typedef _Atomic __WCHAR_TYPE__ atomic_wchar_t; GCC headers don't test glibc feature test macros such as _GNU_SOURCE and _ISOC2X_SOURCE; they base things only on the standard version (whether directly, or indirectly as via __CHAR8_TYPE__) and standard-defined feature test macros. Ok, thank you, that makes sense. I'll follow up with a revised patch that removes the additional conditions. Tom. (There's one exception in glimits.h - testing __USE_GNU, the macro defined internally by glibc's headers - but I don't think that's something we want to emulate in new code.)
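Editor's sketch of the point made in this review: the declaration is keyed only on compiler-predefined macros, with no check of _GNU_SOURCE or _ISOC2X_SOURCE. It is written in C++ with std::atomic as an analogue of the C stdatomic.h typedef, so it is not the revised patch. The names char8_underlying, atomic_char8_sketch, and kChar8LockFree are hypothetical, and the fallbacks for compilers that lack the predefined macros are assumptions.

```cpp
#include <atomic>

// Depend only on what the compiler predefines, not on libc feature macros.
#if defined(__CHAR8_TYPE__)
using char8_underlying = __CHAR8_TYPE__;
#else
using char8_underlying = unsigned char;  // assumed fallback
#endif

using atomic_char8_sketch = std::atomic<char8_underlying>;

#if defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
constexpr int kChar8LockFree = __GCC_ATOMIC_CHAR8_T_LOCK_FREE;
#else
constexpr int kChar8LockFree = ATOMIC_CHAR_LOCK_FREE;  // assumed fallback
#endif

// Same consistency rule that the CHECK_TYPE macro in the new test applies:
// 2 means always lock-free, 0 means never, 1 means sometimes.
bool lock_free_macro_is_consistent() {
    atomic_char8_sketch probe(0);
    const bool runtime = probe.is_lock_free();
    if (kChar8LockFree == 2) return runtime;
    if (kChar8LockFree == 0) return !runtime;
    return kChar8LockFree == 1;
}
```

Because the guard uses only predefined macros, a translation unit gets the same declarations regardless of which libc feature macros it defines.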
Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.
On 7/31/22 11:05 AM, Lewis Hyatt wrote: On Sat, Jul 30, 2022 at 7:06 PM Tom Honermann via Gcc-patches wrote: On 7/27/22 7:09 PM, Joseph Myers wrote: On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote: Gcc's '#pragma GCC diagnostic' directives are processed in "early mode" (see handle_pragma_diagnostic_early) for the C++ frontend and, as such, require that the target diagnostic option be enabled for the preprocessor (see c_option_is_from_cpp_diagnostics). This change modifies the -Wc++20-compat option definition to register it as a preprocessor option so that its associated diagnostics can be suppressed. The changes also There are lots of C++ warning options, all of which should support pragma suppression regardless of whether they are relevant to the preprocessor or not. Do they all need this kind of handling, or is it only -Wc++20-compat that has some kind of problem? I had only checked -Wc++20-compat when working on the patch. I did some spot checking now and confirmed that suppression works as expected for C++ for at least the following warnings: -Wuninitialized -Warray-compare -Wbool-compare -Wtautological-compare -Wterminate I don't know the diagnostic framework well. As best I can tell, this issue is specific to the -Wc++20-compat option and when the particular diagnostic is issued (e.g., during lexing as opposed to during parsing). The following call chains appear to be relevant. cp_lexer_new_main -> cp_lexer_handle_early_pragma -> c_invoke_early_pragma_handler cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler (where * might be "declaration", "toplevel_declaration", "class_head", "objc_interstitial_code", ...) The -Wc++20-compat enabled warning regarding new keywords in C++20 is issued from cp_lexer_get_preprocessor_token. Tom. I have been working on improving the handling of "#pragma GCC diagnostic" lately. The behavior for C++ changed since r13-1544 (https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e46f4d7430c5210465791603735ab219ef263c51). 
I have some more comments about the patch's approach on the PR (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c44). "#pragma GCC diagnostic" formerly did not work in C++ at all, for diagnostics generated by libcpp, because C++ obtains all the tokens from libcpp first (including deferred pragmas), and then processes them afterward, too late to take effect for diagnostics that libcpp has already emitted. r13-1544 fixed this up by adding an early pragma handler, which runs as soon as a deferred pragma token is seen and handles diagnostic pragmas if they pertain to libcpp-controlled diagnostics. Non-libcpp diagnostics still need to be handled later, during parsing, or else they get processed too early and it leads to other problems. Basically, now each diagnostic pragma is handled as close in time as possible to the time the associated diagnostics might be generated. The early pragma handler determines that an option comes from libcpp, and so should be subject to early processing, if it was marked as such in the options definition file. Tom's patch points out that -Wc++20-compat needs to be handled early, and so marking it as a libcpp diagnostic in c-family/c.opt arranges for that to work as intended. Now one potential objection here is that -Wc++20-compat warnings are not technically generated by libcpp. They are generated by the C++ frontend immediately after lexing an identifier token from libcpp (cp_lexer_get_preprocessor_token()). But the distinction between these two steps is rather blurry and it seems logical to me, to denote this as a libcpp-related option. Also, the same is already done for -Wc++11-compat. Otherwise, we would need to add some new option property to indicate which ones need to be handled for pragmas at lexing time rather than parsing time. At the moment I don't see any other diagnostics issued from cp_lexer_get_preprocessor_token() that would need similar adjustments. 
Assuming the approach is OK, it might be nice to add a comment to that function, indicating that any diagnostics emitted there should be annotated as libcpp options in the .opt file?

Thank you for those details; I wasn't aware of that history. If I'm interpreting your response correctly, it sounds like you agree with the direction of the patch. If you like, I can add a comment as you suggested and re-post the patch. Perhaps:

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
 /* Store the next token from the preprocessor in *TOKEN. Return true
    if we reach EOF. If LEXER is NULL, assume we are handling an
    initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings. */
+   processed strin
Re: [PATCH 2/3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes
On 7/27/22 7:23 PM, Joseph Myers wrote:
On Mon, 25 Jul 2022, Tom Honermann via Gcc-patches wrote:

This change provides new tests for the core language and compiler dependent library changes adopted for C2X via WG14 N2653.

I'd expect this patch also to add tests verifying that u8"" strings have the old type for C11 (unless there are existing such tests, but I don't see them).

Agreed, good catch. Thank you.

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t. */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */

I don't think _ISOC2X_SOURCE belongs in any GCC tests.

That was necessary because the first patch in this series omitted the atomic_char8_t and ATOMIC_CHAR8_T_LOCK_FREE definitions unless one of _GNU_SOURCE or _ISOC2X_SOURCE was defined. Per review of that first patch, those conditions will be removed, so there will be no need to define them here.

diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x. */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */

Nor does _GNU_SOURCE (unless the test depends on glibc functionality that's only available with _GNU_SOURCE, but in that case you also need some effective-target conditionals to restrict it to appropriate glibc targets).

Ditto. I'll post new patches shortly.

Tom.
[PATCH 1/3 v2] C: Implement C2X N2653 char8_t and UTF-8 string literal changes
This patch implements the core language and compiler dependent library changes adopted for C2X via WG14 N2653. The changes include: - Change of type for UTF-8 string literals from array of const char to array of const char8_t (unsigned char). - A new atomic_char8_t typedef. - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro. gcc/ChangeLog: * ginclude/stdatomic.h (atomic_char8_t, ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro. gcc/c/ChangeLog: * c-parser.c (c_parser_string_literal): Use char8_t as the type of CPP_UTF8STRING when char8_t support is enabled. * c-typeck.c (digest_init): Allow initialization of an array of character type by a string literal with type array of char8_t. gcc/c-family/ChangeLog: * c-lex.c (lex_string, lex_charconst): Use char8_t as the type of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is enabled. * c-opts.c (c_common_post_options): Set flag_char8_t if targeting C2x. --- gcc/c-family/c-lex.cc| 13 + gcc/c-family/c-opts.cc | 4 ++-- gcc/c/c-parser.cc| 16 ++-- gcc/c/c-typeck.cc| 2 +- gcc/ginclude/stdatomic.h | 6 ++ 5 files changed, 32 insertions(+), 9 deletions(-) diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc index 8bfa4f4024f..0b6f94e18a8 100644 --- a/gcc/c-family/c-lex.cc +++ b/gcc/c-family/c-lex.cc @@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); + if (type == CPP_UTF8STRING && flag_char8_t) + { + value = build_string (TYPE_PRECISION (char8_type_node) + / TYPE_PRECISION (char_type_node), + ""); /* char8_t is 8 bits */ + } + else + value = build_string (1, ""); break; case CPP_STRING16: value = build_string (TYPE_PRECISION (char16_type_node) @@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token) type = char16_type_node; else if (token->type == CPP_UTF8CHAR) { - if (!c_dialect_cxx ()) - type = 
unsigned_char_type_node; - else if (flag_char8_t) + if (flag_char8_t) type = char8_type_node; else type = char_type_node; diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index b9f01a65ed7..108adc5caf8 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename) if (flag_sized_deallocation == -1) flag_sized_deallocation = (cxx_dialect >= cxx14); - /* char8_t support is new in C++20. */ + /* char8_t support is implicitly enabled in C++20 and C2X. */ if (flag_char8_t == -1) -flag_char8_t = (cxx_dialect >= cxx20); +flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x; if (flag_extern_tls_init) { diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 92049d1a101..fa9395986de 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); + if (type == CPP_UTF8STRING && flag_char8_t) + { + value = build_string (TYPE_PRECISION (char8_type_node) + / TYPE_PRECISION (char_type_node), + ""); /* char8_t is 8 bits */ + } + else + value = build_string (1, ""); break; case CPP_STRING16: value = build_string (TYPE_PRECISION (char16_type_node) @@ -7472,9 +7479,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) { default: case CPP_STRING: -case CPP_UTF8STRING: TREE_TYPE (value) = char_array_type_node; break; +case CPP_UTF8STRING: + if (flag_char8_t) + TREE_TYPE (value) = char8_array_type_node; + else + TREE_TYPE (value) = char_array_type_node; + break; case CPP_STRING16: TREE_TYPE (value) = char16_array_type_node; break; diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index fd0a7f81a7a..231f4e980b6 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -8045,7 +8045,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype, if (char_array) { - if (typ2 != char_type_node) + if (typ2 
!= char_type_node && typ2 != char8_type_node) incompat_string_cst = true; } else if (!comptypes (typ1, typ2)) diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h index bfcfdf664c7..9f2475b739d 100644 --
[PATCH 2/3 v2] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes
This change provides new tests for the core language and compiler dependent library changes adopted for C2X via WG14 N2653. gcc/testsuite/ChangeLog: * gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/c2x-predefined-macros.c: New test. * gcc.dg/c2x-utf8str-type.c: New test. * gcc.dg/c2x-utf8str.c: New test. * gcc.dg/gnu2x-predefined-macros.c: New test. * gcc.dg/gnu2x-utf8str-type.c: New test. * gcc.dg/gnu2x-utf8str.c: New test. --- .../atomic/c2x-stdatomic-lockfree-char8_t.c | 42 +++ .../atomic/gnu2x-stdatomic-lockfree-char8_t.c | 5 +++ gcc/testsuite/gcc.dg/c11-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c17-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-predefined-macros.c | 11 + gcc/testsuite/gcc.dg/c2x-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++ .../gcc.dg/gnu2x-predefined-macros.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str.c | 34 +++ 10 files changed, 154 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..1b692f55ed0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,42 @@ +/* Test atomic_is_lock_free for char8_t. 
*/ +/* { dg-do run } */ +/* { dg-options "-std=c2x -pedantic-errors" } */ + +#include +#include + +extern void abort (void); + +_Atomic __CHAR8_TYPE__ ac8a; +atomic_char8_t ac8t; + +#define CHECK_TYPE(MACRO, V1, V2) \ + do \ +{ \ + int r1 = MACRO; \ + int r2 = atomic_is_lock_free (&V1); \ + int r3 = atomic_is_lock_free (&V2); \ + if (r1 != 0 && r1 != 1 && r1 != 2) \ + abort (); \ + if (r2 != 0 && r2 != 1) \ + abort (); \ + if (r3 != 0 && r3 != 1) \ + abort (); \ + if (r1 == 2 && r2 != 1) \ + abort (); \ + if (r1 == 2 && r3 != 1) \ + abort (); \ + if (r1 == 0 && r2 != 0) \ + abort (); \ + if (r1 == 0 && r3 != 0) \ + abort (); \ +} \ + while (0) + +int +main () +{ + CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..27a3cfe3552 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,5 @@ +/* Test atomic_is_lock_free for char8_t with -std=gnu2x. */ +/* { dg-do run } */ +/* { dg-options "-std=gnu2x -pedantic-errors" } */ + +#include "c2x-stdatomic-lockfree-char8_t.c" diff --git a/gcc/testsuite/gcc.dg/c11-utf8str-type.c b/gcc/testsuite/gcc.dg/c11-utf8str-type.c new file mode 100644 index 000..8be9abb9686 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c11-utf8str-type.c @@ -0,0 +1,6 @@ +/* Test C11 UTF-8 string literal type. 
*/ +/* { dg-do compile } */ +/* { dg-options "-std=c11" } */ + +_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals have an unexpected type"); +_Static_assert (_Generic (u8"x"[0], char: 1, default: 2) == 1, "UTF-8 string literal elements have an unexpected type"); diff --git a/gcc/testsuite/gcc.dg/c17-utf8str-type.c b/gcc/testsuite/gcc.dg/c17-utf8str-type.c new file mode 100644 index 000..515c6db3970 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c17-utf8str-type.c @@ -0,0 +1,6 @@ +/* Test C17 UTF-8 string literal type. */ +/* { dg-do compile } */ +/* { dg-options "-std=c17" } */ + +_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals
[PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.
Gcc's '#pragma GCC diagnostic' directives are processed in "early mode" (see handle_pragma_diagnostic_early) for the C++ frontend and, as such, require that the target diagnostic option be enabled for the preprocessor (see c_option_is_from_cpp_diagnostics). This change modifies the -Wc++20-compat option definition to register it as a preprocessor option so that its associated diagnostics can be suppressed. The changes also implicitly disable the option in C++20 and later modes. These changes are consistent with the definition of the -Wc++11-compat option. This support is motivated by the need to suppress the following diagnostic otherwise issued in C++17 and earlier modes due to the char8_t typedef present in the uchar.h header file in glibc 2.36. warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat] Tests are added to validate suppression of both -Wc++11-compat and -Wc++20-compat related diagnostics (fixes were only needed for the C++20 case). Fixes https://gcc.gnu.org/PR106423. gcc/c-family/ChangeLog: * c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics in C++20 and later. * c.opt (Wc++20-compat): Enable hooks for the preprocessor. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/keywords2.C: New test. * g++.dg/cpp2a/keywords2.C: New test. libcpp/ChangeLog: * include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT. * init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat. 
--- gcc/c-family/c-opts.cc | 7 +++ gcc/c-family/c.opt | 2 +- gcc/cp/parser.cc | 5 - gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 + libcpp/include/cpplib.h| 4 libcpp/init.cc | 1 + 7 files changed, 46 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index b9f01a65ed7..1ea37ba9742 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename) else if (warn_narrowing == -1) warn_narrowing = 0; + if (cxx_dialect >= cxx20) +{ + /* Don't warn about C++20 compatibility changes in C++20 or later. */ + warn_cxx20_compat = 0; + cpp_opts->cpp_warn_cxx20_compat = 0; +} + /* C++17 has stricter evaluation order requirements; let's use some of them for earlier C++ as well, so chaining works as expected. */ if (c_dialect_cxx () diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 44e1a60ce24..dfdebd596ef 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -455,7 +455,7 @@ Wc++2a-compat C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented Wc++20-compat -C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) +C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT) Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020. Wc++11-extensions diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 4f67441eeb1..c3584446827 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer) /* Store the next token from the preprocessor in *TOKEN. Return true if we reach EOF. If LEXER is NULL, assume we are handling an initial #pragma pch_preprocess, and thus want the lexer to return - processed strings. */ + processed strings. 
+ + Diagnostics issued from this function must have their controlling option (if + any) in c.opt annotated as a libcpp option via the CppReason property. */ static void cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token) diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C new file mode 100644 index 000..d67d01e31ed --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C @@ -0,0 +1,16 @@ +// { dg-do compile { target c++98_only } } +// { dg-options "-Wc++11-compat" } + +// Validate suppression of -Wc++11-compat diagnostics. +#pragma GCC diagnostic ignored "-Wc++11-compat" +int alignof; +int alignas; +int constexpr; +int decltype; +int noexcept; +int nullptr; +int static_assert; +int thread_local; +int _Alignas; +int _Alignof; +int _Thread_local; diff --git a/gcc/testsuite/g++.dg/cpp2a/keywords2.C b/gcc/testsuite/g++.dg/cpp2a/keywords2.C new file mode 100644 index 000..8714a7b26b7 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/keywords2.C @@ -0,0 +1,13 @@ +// { dg-do compile { target c++17_down } } +// { dg-options "-Wc++20-compat" } + +// Validate suppression of -Wc++20-compat diagnostics. +#pragma GCC diagnostic ignored "-Wc++20-compat" +int constinit; +int consteval; +int re
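For readers who want to try the suppression that these tests validate, the following is a minimal standalone sketch and not part of the patch. It assumes a compiler that understands '#pragma GCC diagnostic' and the -Wc++20-compat option name. legacy_value is a hypothetical helper, and the standard __cpp_constinit feature-test macro is used here only to detect whether constinit is already a keyword in the current mode.

```cpp
// Sketch only: suppress -Wc++20-compat around code that uses a C++20
// keyword as an ordinary identifier. Not taken from the patch.

#ifndef __cpp_constinit
// 'constinit' is still an ordinary identifier in this mode. The warning is
// tied to each occurrence of the identifier as it is lexed, so every use
// has to sit between the push and the pop.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
static int constinit = 20;

// Hypothetical helper that reads the oddly named variable.
int legacy_value() { return constinit; }
#pragma GCC diagnostic pop
#else
// 'constinit' is a keyword here, so the declaration above would be
// ill-formed; fall back to a plain constant.
int legacy_value() { return 20; }
#endif
```

Building this with -std=c++17 -Wc++20-compat is the interesting case: with the fix applied, no "is a keyword in C++20" warning should be reported for the region between the push and the pop.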
Re: [PATCH 2/3 v2] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes
On 8/1/22 3:13 PM, Joseph Myers wrote: On Mon, 1 Aug 2022, Tom Honermann via Gcc-patches wrote: diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c new file mode 100644 index 000..3456105563a --- /dev/null +++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c @@ -0,0 +1,11 @@ +/* Test C2X predefined macros. */ +/* { dg-do compile } */ +/* { dg-options "-std=c2x" } */ + +#if !defined(__CHAR8_TYPE__) +# error __CHAR8_TYPE__ is not defined! +#endif + +#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE) +# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined! +#endif These aren't macros defined by C2X. You could argue that they are part of the stable interface provided by GCC for e.g. libc implementations to use, and so should be tested as such, but any such test shouldn't suggest it's testing a standard feature (and should have a better name to describe what it's actually testing rather than suggesting it's about predefined macros in general). Fair point. This test is redundant anyway; these macros are directly or indirectly exercised by the other tests. I'll just remove it. Tom.
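The distinction drawn above, between macros specified by C2X and macros that GCC predefines as an implementation interface, can be illustrated with a short sketch. It is written in C++ and is not part of any patch in this thread. The names my_char8, MY_CHAR8_LOCK_FREE, my_char8_lock_free, and my_char8_size are hypothetical, and the fallback branches are assumptions for compilers or modes that do not predefine the GCC macros.

```cpp
// Sketch only: consume GCC's predefined macros defensively, since neither
// of them is specified by the C or C++ standards.

#ifdef __CHAR8_TYPE__
typedef __CHAR8_TYPE__ my_char8;   // underlying type chosen by the compiler
#else
typedef unsigned char my_char8;    // assumption: N2653 uses unsigned char
#endif

#ifdef __GCC_ATOMIC_CHAR8_T_LOCK_FREE
#define MY_CHAR8_LOCK_FREE __GCC_ATOMIC_CHAR8_T_LOCK_FREE
#else
#define MY_CHAR8_LOCK_FREE 0       // conservative fallback: never lock-free
#endif

// Returns 0 (never), 1 (sometimes), or 2 (always lock-free), matching the
// value range that the lock-free test in this series checks for.
int my_char8_lock_free() { return MY_CHAR8_LOCK_FREE; }

// Size in bytes of the code unit type; expected to be 1.
unsigned long my_char8_size() { return sizeof(my_char8); }
```

A libc-style header would typically wrap such macros in this way instead of relying on them unconditionally, which is why a test for them is better described as a test of a GCC interface than of a standard feature.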
[PATCH 2/3 v3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes
This change provides new tests for the core language and compiler dependent library changes adopted for C2X via WG14 N2653. gcc/testsuite/ChangeLog: * gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/c2x-predefined-macros.c: New test. * gcc.dg/c2x-utf8str-type.c: New test. * gcc.dg/c2x-utf8str.c: New test. * gcc.dg/gnu2x-predefined-macros.c: New test. * gcc.dg/gnu2x-utf8str-type.c: New test. * gcc.dg/gnu2x-utf8str.c: New test. --- .../atomic/c2x-stdatomic-lockfree-char8_t.c | 42 +++ .../atomic/gnu2x-stdatomic-lockfree-char8_t.c | 5 +++ gcc/testsuite/gcc.dg/c11-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c17-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++ .../gcc.dg/gnu2x-predefined-macros.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str.c | 34 +++ 9 files changed, 143 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..1b692f55ed0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,42 @@ +/* Test atomic_is_lock_free for char8_t. 
*/ +/* { dg-do run } */ +/* { dg-options "-std=c2x -pedantic-errors" } */ + +#include +#include + +extern void abort (void); + +_Atomic __CHAR8_TYPE__ ac8a; +atomic_char8_t ac8t; + +#define CHECK_TYPE(MACRO, V1, V2) \ + do \ +{ \ + int r1 = MACRO; \ + int r2 = atomic_is_lock_free (&V1); \ + int r3 = atomic_is_lock_free (&V2); \ + if (r1 != 0 && r1 != 1 && r1 != 2) \ + abort (); \ + if (r2 != 0 && r2 != 1) \ + abort (); \ + if (r3 != 0 && r3 != 1) \ + abort (); \ + if (r1 == 2 && r2 != 1) \ + abort (); \ + if (r1 == 2 && r3 != 1) \ + abort (); \ + if (r1 == 0 && r2 != 0) \ + abort (); \ + if (r1 == 0 && r3 != 0) \ + abort (); \ +} \ + while (0) + +int +main () +{ + CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..27a3cfe3552 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,5 @@ +/* Test atomic_is_lock_free for char8_t with -std=gnu2x. */ +/* { dg-do run } */ +/* { dg-options "-std=gnu2x -pedantic-errors" } */ + +#include "c2x-stdatomic-lockfree-char8_t.c" diff --git a/gcc/testsuite/gcc.dg/c11-utf8str-type.c b/gcc/testsuite/gcc.dg/c11-utf8str-type.c new file mode 100644 index 000..8be9abb9686 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c11-utf8str-type.c @@ -0,0 +1,6 @@ +/* Test C11 UTF-8 string literal type. 
*/ +/* { dg-do compile } */ +/* { dg-options "-std=c11" } */ + +_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals have an unexpected type"); +_Static_assert (_Generic (u8"x"[0], char: 1, default: 2) == 1, "UTF-8 string literal elements have an unexpected type"); diff --git a/gcc/testsuite/gcc.dg/c17-utf8str-type.c b/gcc/testsuite/gcc.dg/c17-utf8str-type.c new file mode 100644 index 000..515c6db3970 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c17-utf8str-type.c @@ -0,0 +1,6 @@ +/* Test C17 UTF-8 string literal type. */ +/* { dg-do compile } */ +/* { dg-options "-std=c17" } */ + +_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals have an unexpected type"); +_Static_assert (_Generic (u8"x"[0], char: 1, default: 2) == 1, "UTF-8 string literal elements
Re: [PATCH 2/3 v3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes
On 8/2/22 12:53 PM, Joseph Myers wrote: On Mon, 1 Aug 2022, Tom Honermann via Gcc-patches wrote: This change provides new tests for the core language and compiler dependent library changes adopted for C2X via WG14 N2653. Could you please send a complete patch series? I'm not sure what the matching patches 1 and 3 are. Also, I don't generally find it helpful for tests to be separated from the patch making the changes they test, since tests are necessary to review of that code. Absolutely. I'll merge the implementation and test commits, so the next series (v4) will have just two commits; one for the C2X N2653 implementation and the other for the C++ u8 preprocessor string type fix. Coming right up. Tom.
[PATCH v4 0/2] Implement C2X N2653 (char8_t) and correct UTF-8 character literal type in preprocessor directives for C++
This patch series provides an implementation and tests for the WG14 N2653 paper as adopted for C2X. Additionally, a fix is included for the C++ preprocessor to treat UTF-8 character literals in preprocessor directives as an unsigned type in char8_t enabled modes (in C++17 and earlier with -fchar8_t or in C++20 or later without -fno-char8_t). Tom Honermann (2): C: Implement C2X N2653 char8_t and UTF-8 string literal changes preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes. gcc/c-family/c-lex.cc | 13 -- gcc/c-family/c-opts.cc| 5 ++- gcc/c/c-parser.cc | 16 ++- gcc/c/c-typeck.cc | 2 +- gcc/ginclude/stdatomic.h | 6 +++ .../g++.dg/ext/char8_t-char-literal-1.C | 6 ++- .../g++.dg/ext/char8_t-char-literal-2.C | 4 ++ .../atomic/c2x-stdatomic-lockfree-char8_t.c | 42 +++ .../atomic/gnu2x-stdatomic-lockfree-char8_t.c | 5 +++ gcc/testsuite/gcc.dg/c11-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c17-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str.c | 34 +++ libcpp/charset.cc | 4 +- libcpp/include/cpplib.h | 4 +- libcpp/init.cc| 1 + 18 files changed, 185 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c -- 2.32.0
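As a rough illustration of what "char8_t enabled mode" means on the C++ side, the sketch below checks the type of a u8 string literal at compile time. It is not part of the series. It relies on the __cpp_char8_t feature-test macro introduced by the earlier P0482R5 work; Utf8Unit, kChar8Enabled, and char8_enabled are hypothetical names chosen for this example.

```cpp
// Sketch only: observe the u8 string literal type in the current C++ mode.
#include <type_traits>

#ifdef __cpp_char8_t
// char8_t-enabled mode: C++20 or later, or an earlier mode with -fchar8_t.
typedef char8_t Utf8Unit;
constexpr bool kChar8Enabled = true;
#else
// char8_t disabled: u8 string literals keep their pre-C++20 element type.
typedef char Utf8Unit;
constexpr bool kChar8Enabled = false;
#endif

// "text" is four code units plus the terminating null, and a string
// literal is an lvalue, so decltype yields a reference to an array of 5.
static_assert(std::is_same<decltype(u8"text"), const Utf8Unit (&)[5]>::value,
              "unexpected u8 string literal type");

// An element of the literal is a const Utf8Unit lvalue.
static_assert(std::is_same<decltype(u8"x"[0]), const Utf8Unit &>::value,
              "unexpected u8 string literal element type");

// Reports whether this translation unit was built in a char8_t mode.
bool char8_enabled() { return kChar8Enabled; }
```

The C tests in this series make the analogous check with _Generic, since C has no decltype or type traits.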
[PATCH v4 1/2] C: Implement C2X N2653 char8_t and UTF-8 string literal changes
This patch implements the core language and compiler dependent library changes adopted for C2X via WG14 N2653. The changes include: - Change of type for UTF-8 string literals from array of const char to array of const char8_t (unsigned char). - A new atomic_char8_t typedef. - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro. gcc/ChangeLog: * ginclude/stdatomic.h (atomic_char8_t, ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro. gcc/c/ChangeLog: * c-parser.c (c_parser_string_literal): Use char8_t as the type of CPP_UTF8STRING when char8_t support is enabled. * c-typeck.c (digest_init): Allow initialization of an array of character type by a string literal with type array of char8_t. gcc/c-family/ChangeLog: * c-lex.c (lex_string, lex_charconst): Use char8_t as the type of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is enabled. * c-opts.c (c_common_post_options): Set flag_char8_t if targeting C2x. gcc/testsuite/ChangeLog: * gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/c11-utf8str-type.c: New test. * gcc.dg/c17-utf8str-type.c: New test. * gcc.dg/c2x-utf8str-type.c: New test. * gcc.dg/c2x-utf8str.c: New test. * gcc.dg/gnu2x-utf8str-type.c: New test. * gcc.dg/gnu2x-utf8str.c: New test. 
--- gcc/c-family/c-lex.cc | 13 -- gcc/c-family/c-opts.cc| 4 +- gcc/c/c-parser.cc | 16 ++- gcc/c/c-typeck.cc | 2 +- gcc/ginclude/stdatomic.h | 6 +++ .../atomic/c2x-stdatomic-lockfree-char8_t.c | 42 +++ .../atomic/gnu2x-stdatomic-lockfree-char8_t.c | 5 +++ gcc/testsuite/gcc.dg/c11-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c17-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str-type.c | 6 +++ gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c | 5 +++ gcc/testsuite/gcc.dg/gnu2x-utf8str.c | 34 +++ 13 files changed, 170 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc index 8bfa4f4024f..0b6f94e18a8 100644 --- a/gcc/c-family/c-lex.cc +++ b/gcc/c-family/c-lex.cc @@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); + if (type == CPP_UTF8STRING && flag_char8_t) + { + value = build_string (TYPE_PRECISION (char8_type_node) + / TYPE_PRECISION (char_type_node), + ""); /* char8_t is 8 bits */ + } + else + value = build_string (1, ""); break; case CPP_STRING16: value = build_string (TYPE_PRECISION (char16_type_node) @@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token) type = char16_type_node; else if (token->type == CPP_UTF8CHAR) { - if (!c_dialect_cxx ()) - type = unsigned_char_type_node; - else if (flag_char8_t) + if (flag_char8_t) type = char8_type_node; else type 
= char_type_node; diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index b9f01a65ed7..108adc5caf8 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename) if (flag_sized_deallocation == -1) flag_sized_deallocation = (cxx_dialect >= cxx14); - /* char8_t support is new in C++20. */ + /* char8_t support is implicitly enabled in C++20 and C2X. */ if (flag_char8_t == -1) -flag_char8_t = (cxx_dialect >= cxx20); +flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x; if (flag_extern_tls_init) { diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 92049d1a101..fa9395986de 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); +
[PATCH v4 2/2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.
This patch corrects handling of UTF-8 character literals in preprocessing directives so that they are treated as unsigned types in char8_t enabled C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously, UTF-8 character literals were always treated as having the same type as ordinary character literals (signed or unsigned dependent on target or use of the -fsigned-char or -funsigned-char options). PR preprocessor/106426 gcc/c-family/ChangeLog: * c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char subject to -fchar8_t, -fsigned-char, and/or -funsigned-char. gcc/testsuite/ChangeLog: * g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals. * g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals. libcpp/ChangeLog: * charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR literals based on unsigned_utf8char. * include/cpplib.h (cpp_options): Add unsigned_utf8char. * init.cc (cpp_create_reader): Initialize unsigned_utf8char. --- gcc/c-family/c-opts.cc| 1 + gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +- gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 libcpp/charset.cc | 4 ++-- libcpp/include/cpplib.h | 4 ++-- libcpp/init.cc| 1 + 6 files changed, 15 insertions(+), 5 deletions(-) diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index 108adc5caf8..02ce1e86cdb 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename) /* char8_t support is implicitly enabled in C++20 and C2X. */ if (flag_char8_t == -1) flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x; + cpp_opts->unsigned_utf8char = flag_char8_t ?
1 : cpp_opts->unsigned_char; if (flag_extern_tls_init) { diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C index 8ed85ccfdcd..2994dd38516 100644 --- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C +++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C @@ -1,6 +1,6 @@ // Test that UTF-8 character literals have type char if -fchar8_t is not enabled. // { dg-do compile } -// { dg-options "-std=c++17 -fno-char8_t" } +// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" } template struct is_same @@ -10,3 +10,7 @@ template { static const bool value = true; }; static_assert(is_same::value, "Error"); + +#if u8'\0' - 1 > 0 +#error "UTF-8 character literals not signed in preprocessor" +#endif diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C index 7861736689c..db4fe70046d 100644 --- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C +++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C @@ -10,3 +10,7 @@ template { static const bool value = true; }; static_assert(is_same::value, "Error"); + +#if u8'\0' - 1 < 0 +#error "UTF-8 character literals not unsigned in preprocessor" +#endif diff --git a/libcpp/charset.cc b/libcpp/charset.cc index ca8b7cf7aa5..12e31632228 100644 --- a/libcpp/charset.cc +++ b/libcpp/charset.cc @@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str, /* Multichar constants are of type int and therefore signed. 
*/ if (i > 1) unsigned_p = 0; - else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus)) -unsigned_p = 1; + else if (type == CPP_UTF8CHAR) +unsigned_p = CPP_OPTION (pfile, unsigned_utf8char); else unsigned_p = CPP_OPTION (pfile, unsigned_char); diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h index 3eba6f74b57..f9c042db034 100644 --- a/libcpp/include/cpplib.h +++ b/libcpp/include/cpplib.h @@ -581,8 +581,8 @@ struct cpp_options ints and target wide characters, respectively. */ size_t precision, char_precision, int_precision, wchar_precision; - /* True means chars (wide chars) are unsigned. */ - bool unsigned_char, unsigned_wchar; + /* True means chars (wide chars, UTF-8 chars) are unsigned. */ + bool unsigned_char, unsigned_wchar, unsigned_utf8char; /* True if the most significant byte in a word has the lowest address in memory. */ diff --git a/libcpp/init.cc b/libcpp/init.cc index f4ab83d2145..0242da5f55c 100644 --- a/libcpp/init.cc +++ b/libcpp/init.cc @@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table, CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int); CPP_OPTION (pfile, unsigned_char) = 0; CPP_OPTION (pfile, unsigned_wchar) = 1; + CPP_OPTION (pfile, unsigned_utf8char) = 1; CPP_OPTION (pfile, bytes_big_endian) = 1; /* does not matter */ /* Default to no charset conversion. */ -- 2.32.0
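The checks added to the two char8_t-char-literal tests can be combined into one small sketch that compares the preprocessor's view of a u8 character literal with the type system's view. This is an illustration under stated assumptions, not code from the patch: U8_SIGNED_IN_PP, kU8SignedInLanguage, and u8_signedness_agrees are hypothetical names, and the fallback branch for modes without u8 character literals is an assumption added so that the sketch compiles before C++17.

```cpp
// Sketch only: compare how the preprocessor and the language proper treat
// the signedness of a UTF-8 character literal.
#include <type_traits>

#if defined(__cpp_unicode_characters) && __cpp_unicode_characters >= 201411L
// u8 character literals are available (C++17 and later).
# if u8'\0' - 1 < 0
#  define U8_SIGNED_IN_PP 1   // preprocessor treated the literal as signed
# else
#  define U8_SIGNED_IN_PP 0   // preprocessor treated the literal as unsigned
# endif
constexpr bool kU8SignedInLanguage =
    std::is_signed<decltype(u8'\0')>::value;
#else
// Assumption for older modes without u8 character literals: fall back to
// an ordinary character literal so that the sketch still compiles.
# if '\0' - 1 < 0
#  define U8_SIGNED_IN_PP 1
# else
#  define U8_SIGNED_IN_PP 0
# endif
constexpr bool kU8SignedInLanguage = std::is_signed<char>::value;
#endif

// True when the preprocessor and the type system agree on signedness.
bool u8_signedness_agrees() {
  return (U8_SIGNED_IN_PP != 0) == kU8SignedInLanguage;
}
```

With this patch applied, u8_signedness_agrees() should return true in every mode. Without it, a char8_t-enabled mode on a target where plain char is signed would be expected to report a mismatch, which is the situation described in PR preprocessor/106426.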
Re: [PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.
Are there any further concerns with this patch? If not, I extend my gratitude to anyone so kind as to commit this for me as I don't have commit access.

I just noticed that I neglected to add a ChangeLog entry for the comment addition to gcc/cp/parser.cc. Noted inline below. I can re-send the patch with that update if desired.

Tom.

On 8/1/22 2:49 PM, Tom Honermann wrote:

Gcc's '#pragma GCC diagnostic' directives are processed in "early mode" (see handle_pragma_diagnostic_early) for the C++ frontend and, as such, require that the target diagnostic option be enabled for the preprocessor (see c_option_is_from_cpp_diagnostics). This change modifies the -Wc++20-compat option definition to register it as a preprocessor option so that its associated diagnostics can be suppressed. The changes also implicitly disable the option in C++20 and later modes. These changes are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic otherwise issued in C++17 and earlier modes due to the char8_t typedef present in the uchar.h header file in glibc 2.36.

  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and -Wc++20-compat related diagnostics (fixes were only needed for the C++20 case).

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
	* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics in C++20 and later.
	* c.opt (Wc++20-compat): Enable hooks for the preprocessor.

gcc/cp/ChangeLog:
	* parser.cc (cp_lexer_saving_tokens): Add comment regarding diagnostic requirements.

gcc/testsuite/ChangeLog:
	* g++.dg/cpp0x/keywords2.C: New test.
	* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
	* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
	* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
 gcc/c-family/c-opts.cc | 7 +++
 gcc/c-family/c.opt | 2 +-
 gcc/cp/parser.cc | 5 -
 gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16
 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
 libcpp/include/cpplib.h | 4
 libcpp/init.cc | 1 +
 7 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
     warn_narrowing = 0;

+  if (cxx_dialect >= cxx20)
+    {
+      /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+      warn_cxx20_compat = 0;
+      cpp_opts->cpp_warn_cxx20_compat = 0;
+    }
+
   /* C++17 has stricter evaluation order requirements; let's use some of them
      for earlier C++ as well, so chaining works as expected.  */
   if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
 C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented

 Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.

 Wc++11-extensions
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
 /* Store the next token from the preprocessor in *TOKEN.  Return true
    if we reach EOF.  If LEXER is NULL, assume we are handling an
    initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option via the CppReason property.  */

 static void
 cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+// Validate suppression of -Wc++11-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++11-compat"
+int alignof;
+int alignas
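The situation the patch addresses can be sketched from the user's side as follows. This is only an illustration, not the glibc header or the new test from the patch: the typedef stands in for the one the message says uchar.h provides, and `utf8_unit` is a hypothetical name. The guard on `__cpp_char8_t` is an assumption made here so that the snippet also compiles in modes where char8_t is already a keyword.

```cpp
#include <cassert>

#if !defined(__cpp_char8_t)
// char8_t is an ordinary identifier in this mode, so declaring it is valid,
// but it draws the -Wc++20-compat diagnostic when that warning is enabled.
// With the option registered for the preprocessor, the pragma can silence it.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
typedef unsigned char char8_t;
#pragma GCC diagnostic pop
#endif

// Usable in either mode: the typedef above or the C++20 keyword type.
// (utf8_unit is a hypothetical name for this sketch.)
typedef char8_t utf8_unit;
```

Compilers that do not recognise the pragma or the warning name may emit their own warning about it; that does not affect the declarations.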
Re: [PATCH v4 2/2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.
On 8/2/22 6:14 PM, Joseph Myers wrote:
> On Tue, 2 Aug 2022, Tom Honermann via Gcc-patches wrote:
>
>> This patch corrects handling of UTF-8 character literals in preprocessing directives so that they are treated as unsigned types in char8_t enabled C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously, UTF-8 character literals were always treated as having the same type as ordinary character literals (signed or unsigned dependent on target or use of the -fsigned-char or -funsigned-char options).
>
> OK in the absence of C++ maintainer objections within 72 hours. (This is the case where, when I added support for such literals for C (commit 7c5890cc0a0ecea0e88cc39e9fba6385fb579e61), I raised the question of whether they should be unsigned in the preprocessor for C++ as well.)

Joseph, would you be so kind as to commit this patch series for me? I don't have commit access.

Thank you in advance!

Tom.
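The mismatch described above can be observed from user code with a sketch along these lines. It assumes a C++17 or later mode (u8 character literals are not available earlier), and the constant names are hypothetical. The `#if` probe is the same expression the patch's tests use; the comparison against the language-level type is an addition made here for illustration.

```cpp
#include <cassert>

// What the preprocessor believes: in unsigned arithmetic 0 - 1 wraps to a
// large positive value, so the test below is true only for a signed literal.
#if u8'\0' - 1 < 0
constexpr bool kPreprocessorSeesSigned = true;
#else
constexpr bool kPreprocessorSeesSigned = false;
#endif

// Whether char8_t support is active in this translation unit.
#ifdef __cpp_char8_t
constexpr bool kChar8Enabled = true;
#else
constexpr bool kChar8Enabled = false;
#endif

// What the language proper believes about the literal's type
// (char without char8_t support, char8_t with it).
constexpr bool kLanguageSeesSigned = static_cast<decltype(u8'\0')>(-1) < 0;
```

With the fix applied, `kPreprocessorSeesSigned` and `kLanguageSeesSigned` are expected to agree; on compilers without the fix they may differ in char8_t enabled modes, which is the inconsistency the patch removes.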
Re: [PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.
If there are no further concerns, could a C++ or libcpp maintainer please commit this for me? Thank you!

Tom.

On 8/4/22 12:42 PM, Tom Honermann via Gcc-patches wrote:
> Are there any further concerns with this patch? If not, I extend my gratitude to anyone so kind as to commit this for me as I don't have commit access.
>
> I just noticed that I neglected to add a ChangeLog entry for the comment addition to gcc/cp/parser.cc. Noted inline below. I can re-send the patch with that update if desired.
>
> Tom.
[PATCH 0/3]: C N2653 char8_t implementation
This series of patches implements the core language features for the WG14 N2653 [1] proposal to provide char8_t support in C. These changes are intended to align char8_t support in C with the support provided in C++20 via WG21 P0482R6 [2].

These changes do not impact default gcc behavior. The existing -fchar8_t option is extended to C compilation to enable the N2653 changes, and -fno-char8_t is extended to explicitly disable them. N2653 has not yet been accepted by WG14, so no changes are made to handling of the C2X language dialect.

Patch 1: Language support
Patch 2: New tests
Patch 3: Documentation updates

Tom.

[1]: WG14 N2653
     "char8_t: A type for UTF-8 characters and strings (Revision 1)"
     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: WG21 P0482R6
     "char8_t: A type for UTF-8 characters and strings (Revision 6)"
     https://wg21.link/p0482r6
[PATCH 1/3]: C N2653 char8_t: Language support
This patch implements the core language and compiler dependent library changes proposed in WG14 N2653 [1] for C. The changes include:

- Use of the existing -fchar8_t and -fno-char8_t options to opt-in to (or opt-out of) the following changes when compiling C code.
- Change of type for UTF-8 string literals from array of char to array of char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of a new predefined __GCC_ATOMIC_CHAR8_T_LOCK_FREE macro.

When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro is predefined. This is the mechanism proposed to glibc to opt-in to declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed in N2653. See [2].

Tested on Linux x86_64.

gcc/ChangeLog:

2021-05-31  Tom Honermann

	* ginclude/stdatomic.h (atomic_char8_t, ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

2021-05-31  Tom Honermann

	* c-parser.c (c_parser_string_literal): Use char8_t as the type of CPP_UTF8STRING when char8_t support is enabled.
	* c-typeck.c (digest_init): Handle initialization of an array of character type by a string literal with type array of unsigned char.

gcc/c-family/ChangeLog:

2021-05-31  Tom Honermann

	* c-cppbuiltin.c (c_cpp_builtins): Define _CHAR8_T_SOURCE if char8_t support is enabled in non-C++ language modes.
	* c-lex.c (lex_string): Use char8_t as the type of CPP_UTF8STRING when char8_t support is enabled.
	* c-opts.c (c_common_handle_option): Inform the preprocessor if char8_t support is enabled.
	* c.opt (fchar8_t): Enable for C language modes.

libcpp/ChangeLog:

2021-05-31  Tom Honermann

	* include/cpplib.h (cpp_options): Add char8.

Tom.

[1]: WG14 N2653
     "char8_t: A type for UTF-8 characters and strings (Revision 1)"
     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and c8rtomb().
     [Patch 0]: https://sourceware.org/pipermail/libc-alpha/2021-June/127230.html
     [Patch 1]: https://sourceware.org/pipermail/libc-alpha/2021-June/127231.html
     [Patch 2]: https://sourceware.org/pipermail/libc-alpha/2021-June/127232.html
     [Patch 3]: https://sourceware.org/pipermail/libc-alpha/2021-June/127233.html

commit c4260c7c49822522945377cc2fb93ee9830cefc8
Author: Tom Honermann
Date: Sat Feb 13 09:02:34 2021 -0500

    N2653 char8_t for C: Language support

    This patch implements the core language and compiler dependent library changes proposed in WG14 N2653 for C. The changes include:
    - Use of the existing -fchar8_t and -fno-char8_t options to opt-in to (or opt-out of) the following changes when compiling C code.
    - Change of type for UTF-8 string literals from array of const char to array of const char8_t (unsigned char).
    - A new atomic_char8_t typedef.
    - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of a new predefined __GCC_ATOMIC_CHAR8_T_LOCK_FREE macro.

    When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro is predefined. This is the mechanism proposed to glibc to opt-in to declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed in N2653.

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 42b7604c9ac..3e944ec2b86 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1467,6 +1467,11 @@ c_cpp_builtins (cpp_reader *pfile)
   if (flag_iso)
     cpp_define (pfile, "__STRICT_ANSI__");

+  /* Express intent for char8_t support in C (not C++) to the C library if
+     requested.  */
+  if (!c_dialect_cxx () && flag_char8_t)
+    cpp_define (pfile, "_CHAR8_T_SOURCE");
+
   if (!flag_signed_char)
     cpp_define (pfile, "__CHAR_UNSIGNED__");

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index c44e7a13489..e30e44e9f5c 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1335,7 +1335,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	    {
+	      value = build_string (TYPE_PRECISION (char8_type_node)
+				    / TYPE_PRECISION (char_type_node),
+				    "");  /* char8_t is 8 bits */
+	    }
+	  else
+	    value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 60b5802722c..eefc607dac6 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -718,6 +718,10 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT
[PATCH 2/3]: C N2653 char8_t: New tests
This patch provides new tests for the core language and compiler dependent library changes proposed in WG14 N2653 [1] for C. Most of the tests are provided in both a positive (-fchar8_t) and negative (-fno-char8_t) form to ensure behaviors are appropriately present or absent in each mode.

Tested on Linux x86_64.

gcc/testsuite/ChangeLog:

2021-05-31  Tom Honermann

	* gcc.dg/atomic/stdatomic-lockfree-char8_t.c: New test.
	* gcc.dg/char8_t-init-string-literal-1.c: New test.
	* gcc.dg/char8_t-predefined-macros-1.c: New test.
	* gcc.dg/char8_t-predefined-macros-2.c: New test.
	* gcc.dg/char8_t-string-literal-1.c: New test.
	* gcc.dg/char8_t-string-literal-2.c: New test.

Tom.

[1]: WG14 N2653
     "char8_t: A type for UTF-8 characters and strings (Revision 1)"
     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

commit 900aa3507defd80339828e5791c215a28efd9fea
Author: Tom Honermann
Date: Sat Feb 13 10:02:41 2021 -0500

    N2653 char8_t for C: New tests

    This change provides new tests for the core language and compiler dependent library changes proposed in WG14 N2653 for C. Some of the tests are provided in both a positive (-fchar8_t) and negative (-fno-char8_t) form to ensure behaviors are appropriately present or absent in each mode.

diff --git a/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..bb9eae84e83
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c11 -fchar8_t -pedantic-errors" } */
+
+#include
+#include
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)          \
+  do                                       \
+    {                                      \
+      int r1 = MACRO;                      \
+      int r2 = atomic_is_lock_free (&V1);  \
+      int r3 = atomic_is_lock_free (&V2);  \
+      if (r1 != 0 && r1 != 1 && r1 != 2)   \
+        abort ();                          \
+      if (r2 != 0 && r2 != 1)              \
+        abort ();                          \
+      if (r3 != 0 && r3 != 1)              \
+        abort ();                          \
+      if (r1 == 2 && r2 != 1)              \
+        abort ();                          \
+      if (r1 == 2 && r3 != 1)              \
+        abort ();                          \
+      if (r1 == 0 && r2 != 0)              \
+        abort ();                          \
+      if (r1 == 0 && r3 != 0)              \
+        abort ();                          \
+    }                                      \
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c b/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c
new file mode 100644
index 000..4d587e90a26
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c
@@ -0,0 +1,13 @@
+/* Test that char, signed char, and unsigned char arrays can still be
+   initialized by UTF-8 string literals if -fchar8_t is enabled.  */
+/* { dg-do compile } */
+/* { dg-options "-fchar8_t" } */
+
+char cbuf1[] = u8"text";
+char cbuf2[] = { u8"text" };
+
+signed char scbuf1[] = u8"text";
+signed char scbuf2[] = { u8"text" };
+
+unsigned char ucbuf1[] = u8"text";
+unsigned char ucbuf2[] = { u8"text" };
diff --git a/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c
new file mode 100644
index 000..884c634990d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c
@@ -0,0 +1,16 @@
+// Test that char8_t related predefined macros are not present when -fchar8_t is
+// not enabled.
+// { dg-do compile }
+// { dg-options "-fno-char8_t" }
+
+#if defined(_CHAR8_T_SOURCE)
+# error _CHAR8_T_SOURCE is defined!
+#endif
+
+#if defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is defined!
+#endif
+
+#if defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c
new file mode 100644
index 000..7f425357f57
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c
@@ -0,0 +1,16 @@
+// Test that char8_t related predefined macros are present when -fchar8_t is
+// enabled.
+// { dg-do compile }
+// { dg-options "-fchar8_t" }
+
+#if !defined(_CHAR8_T_SOURCE)
+# error _CHAR8_T_SOURCE is not defined!
+#endif
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c b/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c
new file mode 100644
index 000..df94582ac1d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c
@@ -0,0 +1,6 @@
+// Test tha
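For readers more familiar with the C++ side, the consistency rules that CHECK_TYPE enforces in the lock-free test can be sketched as below. This is a C++ analogue, not part of the patch: it uses `std::atomic<unsigned char>` and `ATOMIC_CHAR_LOCK_FREE` from `<atomic>` rather than the C `atomic_char8_t` and `ATOMIC_CHAR8_T_LOCK_FREE` names, and the function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Mirrors the checks in CHECK_TYPE: the macro must be 0 (never lock-free),
// 1 (sometimes) or 2 (always), and the runtime answer must not contradict it.
inline bool lock_free_consistent(int macro_value, bool runtime_lock_free) {
  if (macro_value != 0 && macro_value != 1 && macro_value != 2)
    return false;
  if (macro_value == 2 && !runtime_lock_free)
    return false;  // "always" promised, but the object is not lock-free
  if (macro_value == 0 && runtime_lock_free)
    return false;  // "never" promised, but the object is lock-free
  return true;
}

// Applies the check to an atomic object of a one-byte unsigned type.
inline bool check_unsigned_char_atomic() {
  std::atomic<unsigned char> value{0};
  return lock_free_consistent(ATOMIC_CHAR_LOCK_FREE, value.is_lock_free());
}
```

A macro value of 1 places no constraint on the runtime answer, which is why the C test has no check for that case either.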
[PATCH 3/3]: C N2653 char8_t: Documentation updates
This patch updates documentation for the -fchar8_t and -fno-char8_t options to describe their effect on C code as proposed in WG14 N2653 [1].

Tested on Linux x86_64.

2021-05-31  Tom Honermann

	* doc/invoke.texi (-fchar8_t): update for char8_t support for C.

Tom.

[1]: WG14 N2653
     "char8_t: A type for UTF-8 characters and strings (Revision 1)"
     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

commit d3cb3c6648cc15fe1beea6c9799e044cb722148a
Author: Tom Honermann
Date: Sun May 30 16:57:09 2021 -0400

    N2653 char8_t for C: Documentation updates

    This change updates documentation for the -fchar8_t option to describe its effect on C code as proposed in WG14 N2653 for C.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5cd4e2d993c..ba4c60a6179 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2884,14 +2884,27 @@ This flag is enabled by default for @option{-std=c++17}.
 @itemx -fno-char8_t
 @opindex fchar8_t
 @opindex fno-char8_t
-Enable support for @code{char8_t} as adopted for C++20. This includes
-the addition of a new @code{char8_t} fundamental type, changes to the
-types of UTF-8 string and character literals, new signatures for
-user-defined literals, associated standard library updates, and new
-@code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature test macros.
+Enable support for @code{char8_t} for C as proposed in N2653, and for
+C++ as adopted for C++20.
+
+For C, this changes the type of UTF-8 string literals from array of
+@code{char} to array of @code{unsigned char} and defines the
+@code{_CHAR8_T_SOURCE} macro to inform the C standard library that the
+@code{char8_t} typedef name and the @code{mbrtoc8} and @code{c8rtomb}
+functions should be declared by @code{}, and that the
+@code{atomic_char8_t} typedef name and the @code{ATOMIC_CHAR8_T_LOCK_FREE}
+macro should be defined by @code{}.
+
+For C++, this enables the @code{char8_t} fundamental type, changes the
+type of UTF-8 string literals from array of @code{char} to array of
+@code{char8_t}, changes the type of character literals from @code{char}
+to @code{char8_t}, adds additional @code{char8_t}-based signatures for
+user-defined literals, enables associated standard library updates, and
+defines the @code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature
+test macros.

 This option enables functions to be overloaded for ordinary and UTF-8
-strings:
+strings in C++:

 @smallexample
 int f(const char *);// #1
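The overloading behaviour the documentation refers to can be sketched as follows. This is an independent illustration rather than the (truncated) example from invoke.texi; `classify` and `kExpectedForU8` are hypothetical names, and the `__cpp_char8_t` guard is added here so the snippet also compiles when char8_t support is disabled.

```cpp
#include <cassert>

// Selected for ordinary string literals in every mode.
inline int classify(const char*) { return 1; }

#ifdef __cpp_char8_t
// Selected for u8 string literals only when char8_t support is enabled,
// because u8"..." then has type array of const char8_t.
inline int classify(const char8_t*) { return 2; }
constexpr int kExpectedForU8 = 2;
#else
// Without char8_t support, u8"..." is an array of const char and therefore
// reaches the ordinary overload.
constexpr int kExpectedForU8 = 1;
#endif
```

In a char8_t enabled mode `classify(u8"text")` picks the second overload; with -fno-char8_t (or before C++20 without -fchar8_t) both calls resolve to the first.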
Re: [PATCH 0/3]: C N2653 char8_t implementation
On 6/7/21 5:03 PM, Joseph Myers wrote:
> On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:
>
>> These changes do not impact default gcc behavior. The existing -fchar8_t option is extended to C compilation to enable the N2653 changes, and -fno-char8_t is extended to explicitly disable them. N2653 has not yet been accepted by WG14, so no changes are made to handling of the C2X language dialect.
>
> Why is that option needed? Normally I'd expect features to be enabled or disabled based on the selected language version, rather than having separate options to adjust the configuration for one very specific feature in a language version. Adding extra language dialects not corresponding to any standard version but to some peculiar mix of versions (such as C17 with a changed type for u8"", or C2X with a changed type for u8'') needs a strong reason for those language dialects to be useful (for example, the -fgnu89-inline option was justified by widespread use of GNU-style extern inline in headers).

The option is needed because it impacts core language backward compatibility (for both C and C++, the type of u8 string literals; for C++, the type of u8 character literals and the new char8_t fundamental type).

The ability to opt-in or opt-out of the feature eases migration by enabling source code compatibility. C and C++ standards are not published at the same cadence. A project that targets C++20 and C17 may therefore have a need to either opt-out of char8_t support on the C++ side (already possible via -fno-char8_t), or to opt-in to char8_t support on the C side until such time as the targets change to C++20(+) and C23(+); assuming WG14 approval at some point.

> I think the whole patch series would best wait until after the proposal has been considered by a WG14 meeting, in addition to not increasing the number of language dialects supported.

As an opt-in feature, this is useful to gain implementation and deployment experience for WG14.

It would be appropriate to document this as an experimental feature pending WG14 approval. If WG14 declines it or approves it with different behavior, the feature can then be removed or changed.

The option could also be introduced as -fexperimental-char8_t if that eases concerns, though I do not favor that approach due to misalignment with the existing option for C++.

Tom.
Re: [PATCH 1/3]: C N2653 char8_t: Language support
On 6/7/21 5:11 PM, Joseph Myers wrote:
> On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:
>
>> When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro is predefined. This is the mechanism proposed to glibc to opt-in to declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed in N2653. See [2].
>
> I don't think glibc should have such a feature test macro, and I don't think GCC should define such feature test macros either - _*_SOURCE macros are generally for the *user* to define to decide what namespace they want visible, not for the compiler to define. Without proliferating new language dialects, __STDC_VERSION__ ought to be sufficient to communicate from the compiler to the library (including to GCC's own headers such as stdatomic.h).

In general I agree, but I think an exception is warranted in this case for a few reasons:

1. The feature includes both core language changes (the change of type for u8 string literals) and library changes. The library changes are not actually dependent on the core language change, but they are intended to be used together.
2. Existing use of the char8_t identifier can be found in existing open source projects and likely exists in some closed source projects as well. An opt-in approach avoids conflict and the need to conditionalize code based on gcc version.
3. An opt-in approach enables evaluation of the feature prior to any WG14 approval.

Tom.
Re: [PATCH 1/3]: C N2653 char8_t: Language support
On 6/7/21 5:12 PM, Joseph Myers wrote:
> Also, it seems odd to add a new field to cpp_options without any code in libcpp that uses the value of that field.

Ah, thank you. That appears to be leftover code from prior experimentation and I failed to identify it as such when preparing the patch. I'll provide a revised patch.

Tom.
Re: [PATCH 1/3]: C N2653 char8_t: Language support
On 6/11/21 12:01 PM, Jakub Jelinek wrote:
> On Fri, Jun 11, 2021 at 11:52:41AM -0400, Tom Honermann via Gcc-patches wrote:
>> On 6/7/21 5:11 PM, Joseph Myers wrote:
>>> On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:
>>>
>>>> When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro is predefined. This is the mechanism proposed to glibc to opt-in to declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed in N2653. See [2].
>>>
>>> I don't think glibc should have such a feature test macro, and I don't think GCC should define such feature test macros either - _*_SOURCE macros are generally for the *user* to define to decide what namespace they want visible, not for the compiler to define. Without proliferating new language dialects, __STDC_VERSION__ ought to be sufficient to communicate from the compiler to the library (including to GCC's own headers such as stdatomic.h).
>>
>> In general I agree, but I think an exception is warranted in this case for a few reasons:
>>
>> 1. The feature includes both core language changes (the change of type for u8 string literals) and library changes. The library changes are not actually dependent on the core language change, but they are intended to be used together.
>> 2. Existing use of the char8_t identifier can be found in existing open source projects and likely exists in some closed source projects as well. An opt-in approach avoids conflict and the need to conditionalize code based on gcc version.
>> 3. An opt-in approach enables evaluation of the feature prior to any WG14 approval.
>
> But calling it _CHAR8_T_SOURCE is weird and inconsistent with everything else. In C++, there is __cpp_char8_t 201811L predefined macro for char8_t. Using that in C is not right, sure.
>
> Often we use __SIZEOF_type__ macros not just for sizeof(), but also for presence check of the types, like
>
>   #ifdef __SIZEOF_INT128__
>   __int128 i;
>   #else
>   long long i;
>   #endif
>
> etc., while char8_t has sizeof (char8_t) == 1, perhaps predefining __SIZEOF_CHAR8_T__ 1 instead of _CHAR8_T_SOURCE would be better?

I'm open to whatever signaling mechanism would be preferred. It took me a while to settle on _CHAR8_T_SOURCE as the mechanism to propose as I didn't find much for other precedents.

I agree that having _CHAR8_T_SOURCE be implied by the -fchar8_t option is unusual with respect to other feature test macros. Is that what you find to be weird and inconsistent?

Predefining __SIZEOF_CHAR8_T__ would be consistent with __SIZEOF_WCHAR_T__, but kind of strange too since the size is always 1.

Perhaps a better approach would be to follow the __CHAR16_TYPE__ and __CHAR32_TYPE__ precedent and define __CHAR8_TYPE__ to unsigned char. That is likewise a bit strange since the type would always be unsigned char, but it does provide a bit more symmetry. That could potentially have some use as well; for C++, it could be defined as char8_t and thereby reflect the difference between the two languages. Perhaps it could be useful in the future as well if WG14 were to add distinct char8_t, char16_t, and char32_t types as C++ did (I'm not offering any prediction regarding the likelihood of that happening).

Tom.

> 	Jakub
Re: [PATCH 1/3]: C N2653 char8_t: Language support
On 6/11/21 12:53 PM, Jakub Jelinek wrote: On Fri, Jun 11, 2021 at 12:20:48PM -0400, Tom Honermann wrote: I'm open to whatever signaling mechanism would be preferred. It took me a while to settle on _CHAR8_T_SOURCE as the mechanism to propose as I didn't find much for other precedents. I agree that having _CHAR8_T_SOURCE be implied by the -fchar8_t option is unusual with respect to other feature test macros. Is that what you find to be weird and inconsistent? Predefining __SIZEOF_CHAR8_T__ would be consistent with __SIZEOF_WCHAR_T__, but kind of strange too since the size is always 1. Perhaps a better approach would be to follow the __CHAR16_TYPE__ and __CHAR32_TYPE__ precedent and define __CHAR8_TYPE__ to unsigned char. That is likewise a bit strange since the type would always be unsigned char, but it does provide a bit more symmetry. That could potentially have some use as well; for C++, it could be defined as char8_t and thereby reflect the difference between the two languages. Perhaps it could be useful in the future as well if WG14 were to add distinct char8_t, char16_t, and char32_t types as C++ did (I'm not offering any prediction regarding the likelihood of that happening). C++ already predefines #define __CHAR8_TYPE__ unsigned char #define __CHAR16_TYPE__ short unsigned int #define __CHAR32_TYPE__ unsigned int for -std={c,gnu}++2{0,a,3,b} or -fchar8_t (unless -fno-char8_t), so I agree just making sure __CHAR8_TYPE__ is defined to unsigned char even for C is best. And you probably don't need to do anything in the C patch for it, void c_stddef_cpp_builtins(void) { builtin_define_with_value ("__SIZE_TYPE__", SIZE_TYPE, 0); ... if (flag_char8_t) builtin_define_with_value ("__CHAR8_TYPE__", CHAR8_TYPE, 0); builtin_define_with_value ("__CHAR16_TYPE__", CHAR16_TYPE, 0); builtin_define_with_value ("__CHAR32_TYPE__", CHAR32_TYPE, 0); will do that. Thank you; I had forgotten that I had already done that work. 
I confirmed that the proposed changes result in __CHAR8_TYPE__ being defined (the tests included with the patch already enforced it). Tom. Jakub
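As a rough illustration of the presence-check idea discussed above, user code could key off __CHAR8_TYPE__ much as it would off a __SIZEOF_type__ macro. This is only a sketch: the names utf8_unit and utf8_unit_length are hypothetical and do not come from the patch, and the fallback to unsigned char is an assumption for compilers that do not predefine the macro.

```c
#include <assert.h>
#include <stddef.h>

/* utf8_unit is a hypothetical name, chosen so that it cannot clash with a
   char8_t that <uchar.h> may declare. */
#ifdef __CHAR8_TYPE__
typedef __CHAR8_TYPE__ utf8_unit;   /* compiler advertises its char8_t type */
#else
typedef unsigned char utf8_unit;    /* assumed fallback for older compilers */
#endif

/* Either way the code unit type is expected to be one unsigned byte. */
_Static_assert(sizeof(utf8_unit) == 1, "a UTF-8 code unit is one byte");
_Static_assert((utf8_unit)-1 > 0, "utf8_unit is expected to be unsigned");

/* Count the code units that precede the terminating zero. */
static size_t utf8_unit_length(const utf8_unit *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        ++n;
    return n;
}
```

Because the typedef resolves to a one-byte unsigned type in both branches, code written against utf8_unit should not need to know whether the macro was predefined.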
Re: [PATCH 0/3]: C N2653 char8_t implementation
On 6/11/21 1:27 PM, Joseph Myers wrote: On Fri, 11 Jun 2021, Tom Honermann via Gcc-patches wrote: The option is needed because it impacts core language backward compatibility (for both C and C++, the type of u8 string literals; for C++, the type of u8 character literals and the new char8_t fundamental type). Lots of new features in new standard versions can affect backward compatibility. We generally bundle all of those up into a single -std option rather than having an explosion of different language variants with different features enabled or disabled. I don't think this feature, for C, reaches the threshold that would justify having a separate option to control it, especially given that people can use -Wno-pointer-sign or pointer casts or their own local char8_t typedef as an intermediate step if they want code using u8"" strings to work for both old and new standard versions. Ok, I'm happy to defer to your experience. My perspective is likely biased by the C++20 changes being more disruptive for that language. I don't think u8"" strings are widely used in C library headers in a way where the choice of type matters. (Use of a feature in library headers is a key thing that can justify options such as -fgnu89-inline, because it means the choice of language version is no longer fully under control of a single project.) That aligns with my expectations. The only feature proposed for C2x that I think is likely to have significant compatibility implications in practice for a lot of code is making bool, true and false into keywords. I still don't think a separate option makes sense there. (If that feature is accepted for C2x, what would be useful is for people to do distribution rebuilds with -std=gnu2x as the default to find and fix code that breaks, in advance of the default actually changing in GCC. But the workaround for not-yet-fixed code would be -std=gnu11, not a separate option for that one feature.) Ok, that comparison is helpful. 
I think the whole patch series would best wait until after the proposal has been considered by a WG14 meeting, in addition to not increasing the number of language dialects supported. As an opt-in feature, this is useful to gain implementation and deployment experience for WG14. I think this feature is one of the cases where experience in C++ is sufficiently relevant for C (although there are certainly cases of other language features where the languages are sufficiently different that using C++ experience like that can be problematic). E.g. we didn't need -fdigit-separators for C before digit separators were added to C2x, and we don't need -fno-digit-separators now they are in C2x (the feature is just enabled or disabled based on the language version), although that's one of many features that do affect compatibility in corner cases.

Got it, thanks again, that comparison is helpful. Per this and prior messages, I'll revise the gcc patch series as follows (I'll likewise revise the glibc changes, but will detail that in the corresponding glibc mailing list thread).

1. Remove the proposed use of -fchar8_t and -fno-char8_t for C code.
2. Remove the updated documentation for the -fchar8_t option since it won't be applicable to C code.
3. Remove the _CHAR8_T_SOURCE macro.
4. Enable the change of u8 string literal type based on -std=[gnu|c]2x (by setting flag_char8_t if flag_isoc2x is set).
5. Condition the declarations of atomic_char8_t and __GCC_ATOMIC_CHAR8_T_LOCK_FREE on _GNU_SOURCE or _ISOC2X_SOURCE.
6. Remove the char8 data member from cpp_options that I had added and forgot to remove.
7. Revise the tests and rename them for consistency with other C2x tests.

If I've forgotten anything, please let me know. Thank you for the thorough review!

Tom.
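For illustration, the intermediate step mentioned in the review (a local char8_t typedef combined with a pointer cast) might look like the sketch below. The names local_char8_t, U8, and u8_length are hypothetical and are not part of any patch in this thread; the intent is only to show code that should compile whether u8"" literals have element type char or unsigned char.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical local typedef; deliberately not named char8_t so that it
   cannot conflict with a declaration from <uchar.h>. */
typedef unsigned char local_char8_t;

/* Hypothetical helper macro. The cast changes nothing where u8"" is already
   an array of unsigned char, and performs an explicit pointer conversion
   where it is still an array of char. */
#define U8(lit) ((const local_char8_t *)(u8"" lit))

/* Length in code units; the cast back to char is only for strlen. */
static size_t u8_length(const local_char8_t *s)
{
    return strlen((const char *)s);
}
```

With this arrangement no -Wpointer-sign diagnostic is expected under either language version, because the conversion is always spelled out explicitly.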
[PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library
This patch completes implementation of the C++20 proposal P0482R6 [1] by adding declarations of std::c8rtomb() and std::mbrtoc8() in <cuchar> if provided by the C library in <uchar.h>.

This patch addresses feedback provided in response to a previous patch submission [2].

Autoconf changes determine if the C library declares c8rtomb and mbrtoc8 at global scope when uchar.h is included and compiled with either -fchar8_t or -std=c++20. New _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros reflect the probe results. The <cuchar> header declares these functions in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T configuration macro is defined (by default it is defined if the C++20 __cpp_char8_t feature test macro is defined).

Patches to glibc to implement c8rtomb and mbrtoc8 have been submitted [3].

New tests validate the presence of these declarations. The tests pass trivially if the C library does not provide these functions. Otherwise they ensure that the functions are declared when <cuchar> is included and either -fchar8_t or -std=c++20 is enabled.

Tested on Linux x86_64.

libstdc++-v3/ChangeLog:

2022-01-07 Tom Honermann

* acinclude.m4: Define config macros if uchar.h provides c8rtomb() and mbrtoc8().
* config.h.in: Re-generate.
* configure: Re-generate.
* include/c_compatibility/uchar.h: Declare ::c8rtomb and ::mbrtoc8.
* include/c_global/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc: New test.
* testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc: New test.

Tom.
[1]: WG21 P0482R6 "char8_t: A type for UTF-8 characters and strings (Revision 6)" https://wg21.link/p0482r6
[2]: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library https://gcc.gnu.org/pipermail/libstdc++/2021-June/052685.html
[3]: "C++20 P0482R6 and C2X N2653"
[Patch 0/3]: https://sourceware.org/pipermail/libc-alpha/2022-January/135061.html
[Patch 1/3]: https://sourceware.org/pipermail/libc-alpha/2022-January/135062.html
[Patch 2/3]: https://sourceware.org/pipermail/libc-alpha/2022-January/135063.html
[Patch 3/3]: https://sourceware.org/pipermail/libc-alpha/2022-January/135064.html

Tom.

commit 3d40bc9bf5c79343ea5a6cc355539542f4b56c9b
Author: Tom Honermann
Date: Sat Jan 1 17:26:31 2022 -0500

P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library.

This change completes implementation of the C++20 proposal P0482R6 by adding declarations of std::c8rtomb() and std::mbrtoc8() if provided by the C library.

Autoconf changes determine if the C library declares c8rtomb and mbrtoc8 at global scope when uchar.h is included and compiled with either -fchar8_t or -std=c++20 enabled; new _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros are defined accordingly. The <cuchar> header declares these functions in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T configuration macro is defined (by default it is defined if the C++20 __cpp_char8_t feature test macro is defined).

New tests validate the presence of these declarations. The tests pass trivially if the C library does not provide these functions. Otherwise they ensure that the functions are declared when <cuchar> is included and either -fchar8_t or -std=c++20 is enabled.
diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 index 635168d7e25..85235005c7e 100644 --- a/libstdc++-v3/acinclude.m4 +++ b/libstdc++-v3/acinclude.m4 @@ -2039,6 +2039,50 @@ AC_DEFUN([GLIBCXX_CHECK_UCHAR_H], [ namespace std in .]) fi + CXXFLAGS="$CXXFLAGS -fchar8_t" + if test x"$ac_has_uchar_h" = x"yes"; then +AC_MSG_CHECKING([for c8rtomb and mbrtoc8 in with -fchar8_t]) +AC_TRY_COMPILE([#include + namespace test + { + using ::c8rtomb; + using ::mbrtoc8; + } + ], + [], [ac_uchar_c8rtomb_mbrtoc8_fchar8_t=yes], + [ac_uchar_c8rtomb_mbrtoc8_fchar8_t=no]) + else +ac_uchar_c8rtomb_mbrtoc8_fchar8_t=no + fi + AC_MSG_RESULT($ac_uchar_c8rtomb_mbrtoc8_fchar8_t) + if test x"$ac_uchar_c8rtomb_mbrtoc8_fchar8_t" = x"yes"; then +AC_DEFINE(_GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T, 1, + [Define if c8rtomb and mbrtoc8 functions in should be + imported into namespace std in for -fchar8_t.]) + fi + + CXXFLAGS="$CXXFLAGS -std=c++20" + if test x"$ac_has_uchar_h" = x"yes"; then +AC_MSG_CHECKING([for c8rtomb and mbrtoc8 in with -std=c++20]) +AC_TRY_
[PATCH 0/2]: C N2653 char8_t implementation
This series of patches implements the core language features for the WG14 N2653 [1] proposal to provide char8_t support in C. These changes are intended to align char8_t support in C with the support provided in C++20 via WG21 P0482R6 [2].

These patches address feedback provided in response to a previous submission [3][4].

These changes do not impact default gcc behavior. Per prior feedback by Joseph Myers, the existing -fchar8_t and -fno-char8_t options used to opt in to or opt out of char8_t support in C++ are NOT reused for C. Instead, the C related core language changes are enabled when targeting C2x. Note that N2653 has not yet been accepted by WG14 for C2x, but the patches enable these changes for C2x in order to avoid an additional language dialect flag (e.g., -fchar8_t).

Patch 1: Language support
Patch 2: New tests

Tom.

[1]: WG14 N2653 "char8_t: A type for UTF-8 characters and strings (Revision 1)" http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm
[2]: WG21 P0482R6 "char8_t: A type for UTF-8 characters and strings (Revision 6)" https://wg21.link/p0482r6
[3]: [PATCH 0/3]: C N2653 char8_t implementation https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572022.html
[4]: [PATCH 1/3]: C N2653 char8_t: Language support https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572023.html
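To make the dialect dependence concrete, here is a small sketch of how user code could observe which element type u8 string literals have in the current translation unit. It relies on the same _Generic selection that the tests in this series use; the names U8_ELEMENT_KIND and u8_literals_are_unsigned are hypothetical, and the default branch is an assumption added so that the expression stays valid if neither type matches.

```c
#include <assert.h>

/* Hypothetical macro: evaluates to 1 when the elements of a u8 string
   literal have type char, to 2 when they have type unsigned char (the
   char8_t representation proposed by N2653), and to 0 otherwise. */
#define U8_ELEMENT_KIND \
    _Generic(u8"x"[0], char: 1, unsigned char: 2, default: 0)

/* Nonzero when this translation unit sees the N2653 behavior. */
static int u8_literals_are_unsigned(void)
{
    return U8_ELEMENT_KIND == 2;
}
```

Which value is produced depends on the compiler and the selected -std option, so the sketch makes no claim about the result for any particular configuration.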
[PATCH 1/2]: C N2653 char8_t: Language support
This patch implements the core language and compiler dependent library changes proposed in WG14 N2653 [1] for C2x. The changes include: - Change of type for UTF-8 string literals from array of char to array of char8_t (unsigned char) when targeting C2x. - A new atomic_char8_t typedef. - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro. Tested on Linux x86_64. gcc/ChangeLog: 2022-01-07 Tom Honermann * ginclude/stdatomic.h (atomic_char8_t, ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro. gcc/c/ChangeLog: 2022-01-07 Tom Honermann * c-parser.c (c_parser_string_literal): Use char8_t as the type of CPP_UTF8STRING when char8_t support is enabled. * c-typeck.c (digest_init): Allow initialization of an array of character type by a string literal with type array of char8_t. gcc/c-family/ChangeLog: 2022-01-07 Tom Honermann * c-lex.c (lex_string, lex_charconst): Use char8_t as the type of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is enabled. * c-opts.c (c_common_post_options): Set flag_char8_t if targeting C2x. Tom. [1]: WG14 N2653 "char8_t: A type for UTF-8 characters and strings (Revision 1)" http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm commit c041cce5d262908349be3f1f2e361c824db15845 Author: Tom Honermann Date: Sat Jan 1 18:10:41 2022 -0500 N2653 char8_t for C: Language support This patch implements the core language and compiler dependent library changes proposed in WG14 N2653 for C2X. The changes include: - Change of type for UTF-8 string literals from array of const char to array of const char8_t (unsigned char). - A new atomic_char8_t typedef. - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro. 
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c index 2651331e683..0b3debbb9bd 100644 --- a/gcc/c-family/c-lex.c +++ b/gcc/c-family/c-lex.c @@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); + if (type == CPP_UTF8STRING && flag_char8_t) + { + value = build_string (TYPE_PRECISION (char8_type_node) +/ TYPE_PRECISION (char_type_node), +""); /* char8_t is 8 bits */ + } + else + value = build_string (1, ""); break; case CPP_STRING16: value = build_string (TYPE_PRECISION (char16_type_node) @@ -1425,10 +1432,10 @@ lex_charconst (const cpp_token *token) type = char16_type_node; else if (token->type == CPP_UTF8CHAR) { - if (!c_dialect_cxx ()) - type = unsigned_char_type_node; - else if (flag_char8_t) + if (flag_char8_t) type = char8_type_node; + else if (!c_dialect_cxx ()) + type = unsigned_char_type_node; else type = char_type_node; } diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 4c20e44f5b5..bd96e1319ad 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -1060,9 +1060,9 @@ c_common_post_options (const char **pfilename) if (flag_sized_deallocation == -1) flag_sized_deallocation = (cxx_dialect >= cxx14); - /* char8_t support is new in C++20. */ + /* char8_t support is implicitly enabled in C++20 and C2x. 
*/ if (flag_char8_t == -1) -flag_char8_t = (cxx_dialect >= cxx20); +flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x; if (flag_extern_tls_init) { diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index b09ad307acd..4239633e295 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -7439,7 +7439,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) default: case CPP_STRING: case CPP_UTF8STRING: - value = build_string (1, ""); + if (type == CPP_UTF8STRING && flag_char8_t) + { + value = build_string (TYPE_PRECISION (char8_type_node) +/ TYPE_PRECISION (char_type_node), +""); /* char8_t is 8 bits */ + } + else + value = build_string (1, ""); break; case CPP_STRING16: value = build_string (TYPE_PRECISION (char16_type_node) @@ -7464,9 +7471,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok) { default: case CPP_STRING: -case CPP_UTF8STRING: TREE_TYPE (value) = char_array_type_node; break; +case CPP_UTF8STRING: + if (flag_char8_t) + TREE_TYPE (value) = char8_array_type_node; + else + TREE_TYPE (value) = char_array_type_node; + break; case CPP_STRING16: TREE_TYPE (value) = char16_array_type_node; break; diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 78a6c68aaa6..b4eeea545a9 100644 --- a/gcc/c/c
[PATCH 2/2]: C N2653 char8_t: New tests
This patch provides new tests for the core language and compiler dependent library changes proposed in WG14 N2653 [1] for C2x. Tested on Linux x86_64. gcc/testsuite/ChangeLog: 2021-05-31 Tom Honermann * gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test. * gcc.dg/c2x-predefined-macros.c: New test. * gcc.dg/c2x-utf8str-type.c: New test. * gcc.dg/c2x-utf8str.c: New test. * gcc.dg/gnu2x-predefined-macros.c: New test. * gcc.dg/gnu2x-utf8str-type.c: New test. * gcc.dg/gnu2x-utf8str.c: New test. Tom. [1]: WG14 N2653 "char8_t: A type for UTF-8 characters and strings (Revision 1)" http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm commit f4eee2bf403b62714d1ccb4542b8c85dc552a411 Author: Tom Honermann Date: Sun Jan 2 00:26:17 2022 -0500 N2653 char8_t for C: New tests This change provides new tests for the core language and compiler dependent library changes proposed in WG14 N2653 for C. diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..37ea4c8926c --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,42 @@ +/* Test atomic_is_lock_free for char8_t. 
*/ +/* { dg-do run } */ +/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */ + +#include +#include + +extern void abort (void); + +_Atomic __CHAR8_TYPE__ ac8a; +atomic_char8_t ac8t; + +#define CHECK_TYPE(MACRO, V1, V2) \ + do \ +{ \ + int r1 = MACRO;\ + int r2 = atomic_is_lock_free (&V1); \ + int r3 = atomic_is_lock_free (&V2); \ + if (r1 != 0 && r1 != 1 && r1 != 2) \ + abort ();\ + if (r2 != 0 && r2 != 1) \ + abort ();\ + if (r3 != 0 && r3 != 1) \ + abort ();\ + if (r1 == 2 && r2 != 1) \ + abort ();\ + if (r1 == 2 && r3 != 1) \ + abort ();\ + if (r1 == 0 && r2 != 0) \ + abort ();\ + if (r1 == 0 && r3 != 0) \ + abort ();\ +} \ + while (0) + +int +main () +{ + CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c new file mode 100644 index 000..a017b134817 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c @@ -0,0 +1,5 @@ +/* Test atomic_is_lock_free for char8_t with -std=gnu2x. */ +/* { dg-do run } */ +/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */ + +#include "c2x-stdatomic-lockfree-char8_t.c" diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c new file mode 100644 index 000..c88e51b54c5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c @@ -0,0 +1,11 @@ +/* Test C2x predefined macros. */ +/* { dg-do compile } */ +/* { dg-options "-std=c2x" } */ + +#if !defined(__CHAR8_TYPE__) +# error __CHAR8_TYPE__ is not defined! +#endif + +#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE) +# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined! +#endif diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str-type.c b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c new file mode 100644 index 000..76559c0b19b --- /dev/null +++ b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c @@ -0,0 +1,6 @@ +/* Test C2x UTF-8 string literal type. 
*/ +/* { dg-do compile } */ +/* { dg-options "-std=c2x" } */ + +_Static_assert (_Generic (u8"text", char*: 1, unsigned char*: 2) == 2, "UTF-8 string literals have an unexpected type"); +_Static_assert (_Generic (u8"x"[0], char: 1, unsigned char: 2) == 2, "UTF-8 string literal elements have an unexpected type"); diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str.c b/gcc/testsuite/gcc.dg/c2x-utf8str.c new file mode 100644 index 000..712482c6569 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c2x-utf8str.c @@ -0,0 +1,34 @@ +/* Test initialization by UTF-8 string literal in C2x. */ +/* { dg-do compile } */ +/* { dg-require-effective-target wchar } */ +/* { dg-options "-std=c2x" } */ + +typedef __CHAR8_TYPE__ char8_t; +typedef __CHAR16_TYPE__ char16_t; +typedef __CHAR32_TYPE__ char32_t; +typedef __WCHAR_TYPE__ wchar_t; + +/* Test that char, signed char, unsigned char, and char8_t arrays can be + initialized by a UTF-8 string literal. */ +const char cbuf1[] = u8"text"; +const char cbuf2[] = { u8"text" }; +const signed char scbuf1[] = u8"text"; +const signed char scbuf2[] = { u8"text" }; +const unsigned char ucbuf1[] = u8"text"; +const unsigned char ucbuf2[] = { u8"text" }; +const char8_t
Re: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library
On 1/10/22 8:23 AM, Jonathan Wakely wrote:
On Sat, 8 Jan 2022 at 00:42, Tom Honermann via Libstdc++ wrote:

This patch completes implementation of the C++20 proposal P0482R6 [1] by adding declarations of std::c8rtomb() and std::mbrtoc8() in <cuchar> if provided by the C library in <uchar.h>. This patch addresses feedback provided in response to a previous patch submission [2]. Autoconf changes determine if the C library declares c8rtomb and mbrtoc8 at global scope when uchar.h is included and compiled with either -fchar8_t or -std=c++20. New _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros reflect the probe results. The <cuchar> header declares these functions in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T configuration macro is defined (by default it is defined if the C++20 __cpp_char8_t feature test macro is defined). Patches to glibc to implement c8rtomb and mbrtoc8 have been submitted [3]. New tests validate the presence of these declarations. The tests pass trivially if the C library does not provide these functions. Otherwise they ensure that the functions are declared when <cuchar> is included and either -fchar8_t or -std=c++20 is enabled. Tested on Linux x86_64.

libstdc++-v3/ChangeLog:

2022-01-07 Tom Honermann

* acinclude.m4: Define config macros if uchar.h provides c8rtomb() and mbrtoc8().
* config.h.in: Re-generate.
* configure: Re-generate.
* include/c_compatibility/uchar.h: Declare ::c8rtomb and ::mbrtoc8.
* include/c_global/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc: New test.
* testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc: New test.

Thanks, Tom, this looks good and I'll get it committed for GCC 12.

Thank you!
My only concern is that the new tests depend on an internal macro: +#if _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 + using std::mbrtoc8; + using std::c8rtomb; I prefer if tests are written as "user code" when possible, and not using our internal macros. That isn't always possible, and in this case would require adding new effective-target keyword to testsuite/lib/libstdc++.exp just for use in these two tests. I don't think we should bother with that. I went with this approach solely due to my unfamiliarity with the test system. I knew there should be a way to conditionally make the test "pass" as unsupported or as an expected failure, but didn't know how to go about implementing that. I don't mind following up with an additional patch if such a change is desirable. I took a look at testsuite/lib/libstdc++.exp and it looks like it may be pretty straightforward to add effective-target support. It would probably be a good learning experience for me. I'll prototype and report back. I suppose strictly speaking we should not define __cpp_lib_char8_t unless these two functions are present in libc. But I'm not sure we want to change that now either. All of libstdc++, libc++, and MS STL have been defining __cpp_lib_char8_t despite the absence of these functions, so yeah, I don't think we want to change that. Tom.
Re: [PATCH 0/2]: C N2653 char8_t implementation
On 1/10/22 9:23 PM, Joseph Myers wrote: Please repost these patches after GCC 12 branches (updated as appropriate depending on whether the feature is accepted at the two-week Jan/Feb WG14 meeting, which doesn't yet have an agenda), since we're currently stabilizing for the release and so not considering new features. Thank you, Joseph. Will do! Tom.
Re: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library
On 1/10/22 4:38 PM, Jonathan Wakely wrote:
On Mon, 10 Jan 2022 at 21:24, Tom Honermann via Libstdc++ wrote:
On 1/10/22 8:23 AM, Jonathan Wakely wrote:
On Sat, 8 Jan 2022 at 00:42, Tom Honermann via Libstdc++ wrote:

This patch completes implementation of the C++20 proposal P0482R6 [1] by adding declarations of std::c8rtomb() and std::mbrtoc8() in <cuchar> if provided by the C library in <uchar.h>. This patch addresses feedback provided in response to a previous patch submission [2]. Autoconf changes determine if the C library declares c8rtomb and mbrtoc8 at global scope when uchar.h is included and compiled with either -fchar8_t or -std=c++20. New _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros reflect the probe results. The <cuchar> header declares these functions in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T configuration macro is defined (by default it is defined if the C++20 __cpp_char8_t feature test macro is defined). Patches to glibc to implement c8rtomb and mbrtoc8 have been submitted [3]. New tests validate the presence of these declarations. The tests pass trivially if the C library does not provide these functions. Otherwise they ensure that the functions are declared when <cuchar> is included and either -fchar8_t or -std=c++20 is enabled. Tested on Linux x86_64.

libstdc++-v3/ChangeLog:

2022-01-07 Tom Honermann

* acinclude.m4: Define config macros if uchar.h provides c8rtomb() and mbrtoc8().
* config.h.in: Re-generate.
* configure: Re-generate.
* include/c_compatibility/uchar.h: Declare ::c8rtomb and ::mbrtoc8.
* include/c_global/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc: New test.
* testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc: New test.
Thanks, Tom, this looks good and I'll get it committed for GCC 12. Thank you! My only concern is that the new tests depend on an internal macro: +#if _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 + using std::mbrtoc8; + using std::c8rtomb; I prefer if tests are written as "user code" when possible, and not using our internal macros. That isn't always possible, and in this case would require adding new effective-target keyword to testsuite/lib/libstdc++.exp just for use in these two tests. I don't think we should bother with that. I went with this approach solely due to my unfamiliarity with the test system. I knew there should be a way to conditionally make the test "pass" as unsupported or as an expected failure, but didn't know how to go about implementing that. I don't mind following up with an additional patch if such a change is desirable. I took a look at testsuite/lib/libstdc++.exp and it looks like it may be pretty straight forward to add effective-target support. It would probably be a good learning experience for me. I'll prototype and report back. Yes, it's very easy to do. Take a look at the check_effective_target_blah procs in that file, especially the later ones that use v3_check_preprocessor_condition. You can use that to define an effective target keyword for any preprocessor condition (such as the new macros you're adding). Then the test can do: // { dg-do compile { target blah } } which will make it UNSUPPORTED if the effective target proc doesn't return true. See https://gcc.gnu.org/onlinedocs/gccint/Selectors.html#Selectors for the docs on target selectors. I'm just not sure it's worth adding a new keyword for just two tests. Thank you for the implementation direction; this was quite easy! Patch attached (to be applied after the original one). libstdc++-v3/ChangeLog: 2022-01-11 Tom Honermann * testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc: Modify to use new c8rtomb_mbrtoc8_cxx20 effective target. 
* testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc: Modify to use new c8rtomb_mbrtoc8_fchar8_t effective target. * testsuite/lib/libstdc++.exp: Add new effective targets. If you decide that the new keywords aren't worth adding, no worries; my feelings won't be hurt :) Tom. commit 0542361fe8cb5da146097f86ca8ea8bca86421e0 Author: Tom Honermann Date: Tue Jan 11 14:57:51 2022 -0500 Add effective target support for tests of C++20 c8rtomb and mbrtoc8. diff --git a/libstdc++-v3/testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc b/libstdc++-v3