Re: [PATCH 2/9]: C++ P0482R5 char8_t: Core language support

2018-12-04 Thread Tom Honermann

On 12/3/18 5:01 PM, Jason Merrill wrote:

On 12/3/18 4:51 PM, Jason Merrill wrote:

On 11/5/18 2:39 PM, Tom Honermann wrote:
This patch adds support for the P0482R5 core language changes.  This
includes:

- The -fchar8_t and -fno-char8_t command line options.
- char8_t as a keyword.
- The char8_t builtin type as a non-aliasing unsigned integral
   character type of size 1.
- Use of char8_t as a simple type specifier.
- u8 character literals with type char8_t.
- u8 string literals with type array of const char8_t.
- User defined literal operators that accept char8_t and char8_t pointer
   types.
- New __cpp_char8_t predefined feature test macro.
- New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined
   macros.
- Name mangling and demangling for char8_t (using Du).
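
As a rough illustration of the list above (this is not part of the patch), the following sketch checks a few of the described properties at compile time. The names has_char8_t and literal_code_units are hypothetical. Everything specific to char8_t is guarded by __cpp_char8_t, so the snippet should build with or without -fchar8_t.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

#ifdef __cpp_char8_t
// char8_t is a distinct type, not an alias for char or unsigned char.
static_assert(!std::is_same<char8_t, char>::value, "distinct from char");
static_assert(!std::is_same<char8_t, unsigned char>::value,
              "distinct from unsigned char");
// It is an unsigned character type of size 1.
static_assert(std::is_unsigned<char8_t>::value, "char8_t is unsigned");
static_assert(sizeof(char8_t) == 1, "char8_t has size 1");
// u8 string literals are arrays of const char8_t.
static_assert(std::is_same<decltype(u8"x"), const char8_t (&)[2]>::value,
              "u8 string literal type");
constexpr bool has_char8_t = true;
#else
constexpr bool has_char8_t = false;
#endif

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null.  Works for char and char8_t alike.
template <typename CharT, std::size_t N>
constexpr std::size_t literal_code_units(const CharT (&)[N]) {
  return N - 1;
}
```

Because the helper is a template over the element type, the same call compiles whether a u8 literal has element type char or char8_t.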

gcc/ChangeLog:

2018-11-04  Tom Honermann  

  * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann  
  * c-family/c-common.c (c_common_reswords): Add char8_t.
  (fix_string_type): Use char8_t for the type of u8 string literals.
  (c_common_get_alias_set): char8_t doesn't alias.
  (c_common_nodes_and_builtins): Define char8_t as a builtin type in
  C++.
  (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
  (keyword_begins_type_specifier): Add RID_CHAR8.
  * gcc/c-family/c-common.h (rid): Add RID_CHAR8.
  (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
  Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
  Define char8_type_node and char8_array_type_node.
  * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
  (c_cpp_builtins): Predefine __cpp_char8_t.
  * c-family/c-lex.c (lex_string): Use char8_array_type_node as the
  type of CPP_UTF8STRING.
  (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
  * c-family/c.opt: Add the -fchar8_t command line option.

gcc/c/ChangeLog:

2018-11-04  Tom Honermann  

  * c/c-typeck.c (char_type_p): Add char8_type_node.
  (digest_init): Handle initialization by a u8 string literal of
  char8_t type.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  

  * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
  * cp/decl.c (grokdeclarator): Handle invalid type specifier
  combinations involving char8_t.
  * cp/lex.c (init_reswords): Add char8_t as a reserved word.
  * cp/mangle.c (write_builtin_type): Add name mangling for char8_t
  (Du).
  * cp/parser.c (cp_keyword_starts_decl_specifier_p,
  cp_parser_simple_type_specifier): Recognize char8_t as a simple
  type specifier.
  (cp_parser_string_literal): Use char8_array_type_node for the type
  of CPP_UTF8STRING.
  (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system
  headers.
  * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
  * cp/tree.c (char_type_p): Recognize char8_t as a character type.
  * cp/typeck.c (string_conv_p): Handle conversions of u8 string
  literals of char8_t type.
  (check_literal_operator_args): Handle UDLs with u8 string literals
  of char8_t type.
  * cp/typeck2.c (digest_init_r): Disallow initializing a char array
  with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann  
  * cp-demangle.c (cplus_demangle_builtin_types,
  cplus_demangle_type): Add name demangling for char8_t (Du).
  * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the
  new char8_t type.
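
The mangling entry can be observed from user code. The sketch below is an assumption-laden illustration, not part of the patch: it relies on std::type_info::name() returning the Itanium C++ ABI encoding of the type, which is how GCC behaves, and it only performs a real check when __cpp_char8_t and __GNUC__ are both defined. The function name char8_t_type_name_is_Du is hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Returns true when typeid(char8_t).name() is "Du", the encoding
// named in the ChangeLog entries above.  On configurations where the
// check does not apply, it returns true without testing anything.
inline bool char8_t_type_name_is_Du() {
#if defined(__cpp_char8_t) && defined(__GNUC__)
  return std::strcmp(typeid(char8_t).name(), "Du") == 0;
#else
  return true;  // char8_t unavailable or non-Itanium naming: nothing to check.
#endif
}
```

With this encoding, a function declared as void f(char8_t) would be expected to mangle as _Z1fDu, in the same way that the char16_t overload mangles as _Z1fDs.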



@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
     return -1;

+  /* Unlike char, char8_t doesn't alias. */
+  if (flag_char8_t && t == char8_type_node)
+    return -1;

This seems unnecessary; doesn't the existing code have the same
effect? I think we could do with just an adjustment to the existing
comment.

I'm not sure.  I had concerns about unintended matching due to char8_t
having an underlying type of unsigned char.
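
To make the user-visible consequence of "char8_t doesn't alias" concrete, here is a minimal sketch; it is not taken from the patch and the function names are hypothetical. Reading an object's bytes through unsigned char is allowed by the aliasing rules, whereas char8_t has no such exemption, so the char8_t variant copies the bytes with memcpy instead of reading through a cast pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Permitted: unsigned char may be used to inspect the object
// representation of any object.
inline unsigned first_byte_via_unsigned_char(const int &value) {
  const unsigned char *p = reinterpret_cast<const unsigned char *>(&value);
  return p[0];
}

#ifdef __cpp_char8_t
// char8_t is not exempt from the aliasing rules, so the bytes are
// copied into a char8_t buffer rather than read through a char8_t*.
inline unsigned first_byte_via_char8_t(const int &value) {
  char8_t bytes[sizeof value];
  std::memcpy(bytes, &value, sizeof value);
  return bytes[0];
}
#endif
```

The test value used with these helpers should have identical bytes (for example 0x01010101) so that the result does not depend on endianness.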



+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+  || (flag_char8_t && type == char8_type_node)
+  bool char8_array = (flag_char8_t && !!comptypes (typ1, char8_type_node));
+   || (flag_char8_t && type == char8_type_node

In many places you check the flag and then for one of the char8
types. Since the types won't be used without the flag, checking the
flag seems redundant?


This was again protection against unintended matching of the underlying 
unsigned char type, particularly when compiling as C. char8_type_node is 
constructed (in c_common_nodes_and_builtins) following the pattern in 
place for char16_t and char32_t with the following code:


+  char8_type_node = get_identifier (CHAR8_TYPE);
+  char8_type_node = TREE_TYPE (identifier_global_value (char8_type_node));
+  char8_type_size = TYPE_PRECISION (char8_type_node);
+  if (c_dialect_cxx ())
+{
+  ch

Re: [PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates

2018-12-04 Thread Tom Honermann

On 12/3/18 2:59 PM, Jason Merrill wrote:

On 11/5/18 2:39 PM, Tom Honermann wrote:
This patch adds documentation for new -fchar8_t and -fno-char8_t 
options.


gcc/ChangeLog:

2018-11-04  Tom Honermann  
  * doc/invoke.texi (-fchar8_t): Document new option.



+Enable support for the P0482 proposal including the addition of a
+new @code{char8_t} fundamental type, changes to the types of UTF-8


Now that the proposal has been accepted, I'd refer to C++2a instead.


Agreed.  I also need to make the changes to implicitly enable -fchar8_t 
with -std=c++2a.


The list of impacted standard library features was incomplete and I 
suspect it isn't worth mentioning them specifically.  Perhaps mentioning 
the feature test macros would be helpful as well?


How does the following sound?

Enable support for @code{char8_t} as adopted for C++2a.  This includes 
the addition of a new @code{char8_t} fundamental type, changes to the 
types of UTF-8 string and character literals, new signatures for user 
defined literals, associated standard library updates, and new 
@code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature test macros.
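
For readers wondering how the feature test macro might be used in practice, here is a small sketch. It is only an illustration under stated assumptions: the alias names utf8_char and utf8_string and the function make_sample are hypothetical, and the code assumes that the element type of u8 string literals is char8_t exactly when __cpp_char8_t is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pick the code unit type that u8 literals use in this translation unit.
#if defined(__cpp_char8_t)
using utf8_char = char8_t;
#else
using utf8_char = char;
#endif
using utf8_string = std::basic_string<utf8_char>;

// Builds a UTF-8 string from a u8 literal; U+00E9 occupies two code units.
inline utf8_string make_sample() {
  return u8"caf\u00e9";
}
```

Code written against utf8_string compiles unchanged with -fchar8_t and with -fno-char8_t, which may ease migration of sources that must support both modes.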


Tom.



Jason





Re: [PATCH 2/9]: C++ P0482R5 char8_t: Core language support

2018-12-17 Thread Tom Honermann

On 12/17/18 4:02 PM, Jason Merrill wrote:

On 12/5/18 11:16 AM, Jason Merrill wrote:

On 12/5/18 2:09 AM, Tom Honermann wrote:

On 12/3/18 5:01 PM, Jason Merrill wrote:

On 12/3/18 4:51 PM, Jason Merrill wrote:

On 11/5/18 2:39 PM, Tom Honermann wrote:
This patch adds support for the P0482R5 core language changes.
This includes:

- The -fchar8_t and -fno-char8_t command line options.
- char8_t as a keyword.
- The char8_t builtin type as a non-aliasing unsigned integral
   character type of size 1.
- Use of char8_t as a simple type specifier.
- u8 character literals with type char8_t.
- u8 string literals with type array of const char8_t.
- User defined literal operators that accept char8_t and char8_t pointer
   types.
- New __cpp_char8_t predefined feature test macro.
- New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined
   macros.
- Name mangling and demangling for char8_t (using Du).

gcc/ChangeLog:

2018-11-04  Tom Honermann  

  * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann  
  * c-family/c-common.c (c_common_reswords): Add char8_t.
  (fix_string_type): Use char8_t for the type of u8 string literals.
  (c_common_get_alias_set): char8_t doesn't alias.
  (c_common_nodes_and_builtins): Define char8_t as a builtin type in
  C++.
  (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
  (keyword_begins_type_specifier): Add RID_CHAR8.
  * gcc/c-family/c-common.h (rid): Add RID_CHAR8.
  (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
  Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
  Define char8_type_node and char8_array_type_node.
  * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
  (c_cpp_builtins): Predefine __cpp_char8_t.
  * c-family/c-lex.c (lex_string): Use char8_array_type_node as the
  type of CPP_UTF8STRING.
  (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
  * c-family/c.opt: Add the -fchar8_t command line option.

gcc/c/ChangeLog:

2018-11-04  Tom Honermann  

  * c/c-typeck.c (char_type_p): Add char8_type_node.
  (digest_init): Handle initialization by a u8 string literal of
  char8_t type.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  

  * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
  * cp/decl.c (grokdeclarator): Handle invalid type specifier
  combinations involving char8_t.
  * cp/lex.c (init_reswords): Add char8_t as a reserved word.
  * cp/mangle.c (write_builtin_type): Add name mangling for char8_t
  (Du).
  * cp/parser.c (cp_keyword_starts_decl_specifier_p,
  cp_parser_simple_type_specifier): Recognize char8_t as a simple
  type specifier.
  (cp_parser_string_literal): Use char8_array_type_node for the type
  of CPP_UTF8STRING.
  (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system
  headers.
  * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
  * cp/tree.c (char_type_p): Recognize char8_t as a character type.
  * cp/typeck.c (string_conv_p): Handle conversions of u8 string
  literals of char8_t type.
  (check_literal_operator_args): Handle UDLs with u8 string literals
  of char8_t type.
  * cp/typeck2.c (digest_init_r): Disallow initializing a char array
  with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann  
  * cp-demangle.c (cplus_demangle_builtin_types,
  cplus_demangle_type): Add name demangling for char8_t (Du).
  * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the
  new char8_t type.



@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
     return -1;

+  /* Unlike char, char8_t doesn't alias. */
+  if (flag_char8_t && t == char8_type_node)
+    return -1;

This seems unnecessary; doesn't the existing code have the same
effect? I think we could do with just an adjustment to the
existing comment.

I'm not sure.  I had concerns about unintended matching due to
char8_t having an underlying type of unsigned char.


That shouldn't be a problem: if char8_t is a distinct type, it won't 
match unsigned char, and if it's the same as unsigned char, 
flag_char8_t will be false.


+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+  || (flag_char8_t && type == char8_type_node)
+  bool char8_array = (flag_char8_t && !!comptypes (typ1, char8_type_node));
+   || (flag_char8_t && type == char8_type_node

In many places you check the flag and then for one of the char8
types. Since the types won't be used without the flag, checking
the flag seems redundant?


This was again protection against unintended matching of the 
underlying unsigned char type, particularly when compiling as C. 
char8_type_node is constructed (in c_common_nodes_

Re: [REVISED PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates

2018-12-23 Thread Tom Honermann
Attached is a revised patch that addresses feedback provided by Jason 
and Sandra.  Changes from the prior patch include:

- Updates to the -fchar8_t option documentation as requested by Jason.
- Corrections for indentation, spacing, hyphenation, and wrapping as
  requested by Sandra.

Tested on x86_64-linux.

gcc/ChangeLog:

2018-11-04  Tom Honermann  
 * doc/invoke.texi (-fchar8_t): Document new option.

Tom.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 57491f1033c..95374951d98 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -206,7 +206,7 @@ in the following sections.
 @item C++ Language Options
 @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
 @gccoptlist{-fabi-version=@var{n}  -fno-access-control @gol
--faligned-new=@var{n}  -fargs-in-order=@var{n}  -fcheck-new @gol
+-faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new @gol
 -fconstexpr-depth=@var{n}  -fconstexpr-loop-limit=@var{n} @gol
 -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
@@ -2432,6 +2432,60 @@ but few users will need to override the default of
 
 This flag is enabled by default for @option{-std=c++17}.
 
+@item -fchar8_t
+@itemx -fno-char8_t
+@opindex fchar8_t
+@opindex fno-char8_t
+Enable support for @code{char8_t} as adopted for C++2a.  This includes
+the addition of a new @code{char8_t} fundamental type, changes to the
+types of UTF-8 string and character literals, new signatures for
+user-defined literals, associated standard library updates, and new
+@code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature test macros.
+
+This option enables functions to be overloaded for ordinary and UTF-8
+strings:
+
+@smallexample
+int f(const char *);    // #1
+int f(const char8_t *); // #2
+int v1 = f("text");     // Calls #1
+int v2 = f(u8"text");   // Calls #2
+@end smallexample
+
+@noindent
+and introduces new signatures for user-defined literals:
+
+@smallexample
+int operator""_udl1(char8_t);
+int v3 = u8'x'_udl1;
+int operator""_udl2(const char8_t*, std::size_t);
+int v4 = u8"text"_udl2;
+template<typename T, T...> int operator""_udl3();
+int v5 = u8"text"_udl3;
+@end smallexample
+
+@noindent
+The change to the types of UTF-8 string and character literals introduces
+incompatibilities with ISO C++11 and later standards.  For example, the
+following code is well-formed under ISO C++11, but is ill-formed when
+@option{-fchar8_t} is specified.
+
+@smallexample
+char ca[] = u8"xx";     // error: char-array initialized from wide
+                        //        string
+const char *cp = u8"xx";// error: invalid conversion from
+                        //        `const char8_t*' to `const char*'
+int f(const char*);
+auto v = f(u8"xx");     // error: invalid conversion from
+                        //        `const char8_t*' to `const char*'
+std::string s@{u8"xx"@};  // error: no matching function for call to
+                        //        `std::basic_string<char>::basic_string()'
+using namespace std::literals;
+s = u8"xx"s;            // error: conversion from
+                        //        `basic_string<char8_t>' to non-scalar
+                        //        type `basic_string<char>' requested
+@end smallexample
+
 @item -fcheck-new
 @opindex fcheck-new
 Check that the pointer returned by @code{operator new} is non-null
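
The overloading behavior documented in the patch above, and one possible way to keep calling a legacy const char* interface, can be sketched as follows. This is an illustration only: as_char and the call_* wrappers are hypothetical names, and the reinterpret_cast workaround is one option among several rather than something the patch prescribes.

```cpp
#include <cassert>

inline int f(const char *) { return 1; }          // #1
#ifdef __cpp_char8_t
inline int f(const char8_t *) { return 2; }       // #2
// Hypothetical helper: view UTF-8 code units as plain char.  Reading
// through const char* is permitted because char may alias any object.
inline const char *as_char(const char8_t *s) {
  return reinterpret_cast<const char *>(s);
}
constexpr int expected_utf8_overload = 2;
#else
constexpr int expected_utf8_overload = 1;
#endif
inline const char *as_char(const char *s) { return s; }

inline int call_ordinary() { return f("text"); }              // always #1
inline int call_utf8() { return f(u8"text"); }                // #2 with char8_t
inline int call_utf8_legacy() { return f(as_char(u8"text")); } // always #1
```

With char8_t enabled, the u8 literal selects #2 directly; wrapping it in as_char routes the call back to #1, which is the kind of adjustment existing code may need where the documentation lists newly ill-formed conversions.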


Re: [REVISED PATCH 2/9]: C++ P0482R5 char8_t: Core language support

2018-12-23 Thread Tom Honermann
Attached is a revised patch that addresses changes in P0482R6 as well as 
feedback provided by Jason.  Changes from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811
  per P0482R6.
- Enable char8_t support with -std=c++2a per adoption of P0482R6 in
  San Diego.
- Reverted the unnecessary changes to gcc/gcc/c/c-typeck.c as requested
  by Jason.
- Removed unnecessary checks of 'flag_char8_t' within the C++ front
  end as requested by Jason.
- Corrected the regression spotted by Jason regarding initialization of
  signed char and unsigned char arrays with string literals.
- Made minor changes to the error message emitted for ill-formed
  initialization of char arrays with UTF-8 string literals.  These
  changes do not yet implement Jason's suggestion; I'll follow up with a
  separate patch for that due to additional test impact.

Tested on x86_64-linux.

gcc/ChangeLog:

2018-11-04  Tom Honermann  

 * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann  
 * c-family/c-common.c (c_common_reswords): Add char8_t.
 (fix_string_type): Use char8_t for the type of u8 string literals.
 (c_common_get_alias_set): char8_t doesn't alias.
 (c_common_nodes_and_builtins): Define char8_t as a builtin type in
 C++.
 (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
 (keyword_begins_type_specifier): Add RID_CHAR8.
 * c-family/c-common.h (rid): Add RID_CHAR8.
 (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
 Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
 Define char8_type_node and char8_array_type_node.
 * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine
 __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
 (c_cpp_builtins): Predefine __cpp_char8_t.
 * c-family/c-lex.c (lex_string): Use char8_array_type_node as the
 type of CPP_UTF8STRING.
 (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
 * c-family/c-opts.c: If not otherwise specified, enable -fchar8_t
 when targeting C++2a.
 * c-family/c.opt: Add the -fchar8_t command line option.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  

 * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
 * cp/decl.c (grokdeclarator): Handle invalid type specifier
 combinations involving char8_t.
 * cp/lex.c (init_reswords): Add char8_t as a reserved word.
 * cp/mangle.c (write_builtin_type): Add name mangling for char8_t
 (Du).
 * cp/parser.c (cp_keyword_starts_decl_specifier_p,
 cp_parser_simple_type_specifier): Recognize char8_t as a simple
 type specifier.
 (cp_parser_string_literal): Use char8_array_type_node for the type
 of CPP_UTF8STRING.
 (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system
 headers.
 * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
 * cp/tree.c (char_type_p): Recognize char8_t as a character type.
 * cp/typeck.c (string_conv_p): Handle conversions of u8 string
 literals of char8_t type.
 (check_literal_operator_args): Handle UDLs with u8 string literals
 of char8_t type.
 * cp/typeck2.c (digest_init_r): Disallow initializing a char array
 with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann  
 * cp-demangle.c (cplus_demangle_builtin_types,
 cplus_demangle_type): Add name demangling for char8_t (Du).
 * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the
 new char8_t type.

Tom.


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index f10cf89c3a7..b387daca137 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -79,6 +79,7 @@ machine_mode c_default_pointer_mode = VOIDmode;
 	tree signed_char_type_node;
 	tree wchar_type_node;
 
+	tree char8_type_node;
 	tree char16_type_node;
 	tree char32_type_node;
 
@@ -128,6 +129,11 @@ machine_mode c_default_pointer_mode = VOIDmode;
 
 	tree wchar_array_type_node;
 
+   Type `char8_t[SOMENUMBER]' or something like it.
+   Used when a UTF-8 string literal is created.
+
+	tree char8_array_type_node;
+
Type `char16_t[SOMENUMBER]' or something like it.
Used when a UTF-16 string literal is created.
 
@@ -450,6 +456,7 @@ const struct c_common_resword c_common_reswords[] =
   { "case",		RID_CASE,	0 },
   { "catch",		RID_CATCH,	D_CXX_OBJC | D_CXXWARN },
   { "char",		RID_CHAR,	0 },
+  { "char8_t",		RID_CHAR8,	D_CXX_CHAR8_T_FLAGS | D_CXXWARN },
   { "char16_t",		RID_CHAR16,	D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "char32_t",		RID_CHAR32,	D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "class",		RID_CLASS,	D_CXX_OBJC | D_CXXWARN },
@@ -746,6 +753,11 @@ fix_string_type (tree value)
   nchars = length;
   e_type = char_type_node;
 }
+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+{
+  nchars = length / (TYPE_P

Re: [REVISED PATCH 3/9]: C++ P0482R5 char8_t: New core language tests

2018-12-23 Thread Tom Honermann
Attached is a revised patch that addresses changes in P0482R6 as well as 
feedback provided by Jason for patch 2/9.  Changes from the prior patch 
include:

- New tests to ensure -fchar8_t is implicitly enabled when targeting
  C++2a per adoption of P0482R6 in San Diego.
  - gcc/testsuite/g++.dg/cpp2a/char8_t1.C
  - gcc/testsuite/g++.dg/cpp2a/char8_t2.C
- Updated the value of the __cpp_char8_t feature test macro to 201811
  per P0482R6.
- Updated tests to exercise initialization of signed char and unsigned
  char arrays with ordinary and UTF-8 string literals.
  - gcc/testsuite/g++.dg/ext/char8_t-init-1.C
  - gcc/testsuite/g++.dg/ext/char8_t-init-2.C

Tested on x86_64-linux.

gcc/testsuite/ChangeLog:

2018-11-04  Tom Honermann  
 * g++.dg/cpp0x/udlit-implicit-conv-neg-char8_t.C: New test cloned
 from udlit-implicit-conv-neg.C.  Validates handling of ill-formed
 uses of char8_t based user defined literals.
 * g++.dg/cpp0x/udlit-resolve-char8_t.C: New test cloned from
 udlit-resolve.C.  Validates handling of well-formed uses of char8_t
 based user defined literals.
 * g++.dg/cpp2a/char8_t1.C: New test; validates char8_t support is
 implicitly enabled when targeting C++2a.
 * g++.dg/cpp2a/char8_t2.C: New test; validates char8_t support is
 disabled when -fno-char8_t is specified when targeting C++2a.
 * g++.dg/ext/char8_t-aliasing-1.C: New test; validates warnings
 for type punning with char8_t types.  Illustrates that char8_t does
 not alias.
 * g++.dg/ext/char8_t-char-literal-1.C: New test; validates u8
 character literals have type char if char8_t support is not
 enabled.
 * g++.dg/ext/char8_t-char-literal-2.C: New test; validates u8
 character literals have type char8_t if char8_t support is
 enabled.
 * g++.dg/ext/char8_t-deduction-1.C: New test; validates char is
 deduced for u8 character and string literals if char8_t support is
 not enabled.
 * g++.dg/ext/char8_t-deduction-2.C: New test; validates char8_t is
 deduced for u8 character and string literals if char8_t support is
 enabled.
 * g++.dg/ext/char8_t-feature-test-macro-1.C: New test; validates
 that the __cpp_char8_t feature test macro is not defined if char8_t
 support is not enabled.
 * g++.dg/ext/char8_t-feature-test-macro-2.C: New test; validates
 that the __cpp_char8_t feature test macro is defined with the
 correct value if char8_t support is enabled.
 * g++.dg/ext/char8_t-init-1.C: New test; validates initialization
 by u8 character and string literals when support for char8_t is not
 enabled.
 * g++.dg/ext/char8_t-init-2.C: New test; validates initialization
 by u8 character and string literals when support for char8_t is
 enabled.
 * g++.dg/ext/char8_t-keyword-1.C: New test; validates that char8_t
 is not a keyword if support for char8_t is not enabled.
 * g++.dg/ext/char8_t-keyword-2.C: New test; validates that char8_t
 is a keyword if support for char8_t is enabled.
 * g++.dg/ext/char8_t-limits-1.C: New test; validates that char8_t
 is unsigned and sufficiently large to store the required range of
 char8_t values.
 * g++.dg/ext/char8_t-overload-1.C: New test; validates overload
 resolution for u8 character and string literal arguments when
 support for char8_t is not enabled.
 * g++.dg/ext/char8_t-overload-2.C: New test; validates overload
 resolution for u8 character and string literal arguments when
 support for char8_t is enabled.
 * g++.dg/ext/char8_t-predefined-macros-1.C: New test; validates
 that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE
 predefined macros are not defined when support for char8_t is not
 enabled.
 * g++.dg/ext/char8_t-predefined-macros-2.C: New test; validates
 that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE
 predefined macros are defined when support for char8_t is enabled.
 * g++.dg/ext/char8_t-sizeof-1.C: New test; validates that the size
 of char8_t and u8 character literals is 1.
 * g++.dg/ext/char8_t-specialization-1.C: New test; validates
 template specialization for u8 character literal template
 arguments when support for char8_t is not enabled.
 * g++.dg/ext/char8_t-specialization-2.C: New test; validates
 template specialization for char8_t and u8 character literal
 template arguments when support for char8_t is enabled.
 * g++.dg/ext/char8_t-string-literal-1.C: New test; validates the
 type of u8 string literals when support for char8_t is not enabled.
 * g++.dg/ext/char8_t-string-literal-2.C: New test; validates the
 type of u8 string literals when support for char8_t is enabled.
 * g++.dg/ext/char8_t-type-specifier-1.C: New test; validates that
 char8_t is not recognized as a type specifier when support for
 char8_t is not enabled.
 * g++.dg/ext/char8_t-type-specifier-2.C: New test; validate

Re: [REVISED PATCH 4/9]: C++ P0482R5 char8_t: Updates to existing core language tests

2018-12-23 Thread Tom Honermann
Attached is a revised patch that addresses changes in P0482R6 and 
adoption of P0482R6 for C++20 in San Diego.  Changes from the prior 
patch include:

- Updated a test to validate the value of the __cpp_char8_t feature test
  macro when targeting C++2a.

Tested on x86_64-linux.

gcc/testsuite/ChangeLog:

2018-11-04  Tom Honermann  

 * c-c++-common/raw-string-13.c: Added test cases for u8 raw string
 literals.
 * c-c++-common/raw-string-15.c: Likewise.
 * g++.dg/cpp0x/constexpr-wstring2.C: Added test cases for u8
 literals.
 * g++.dg/cpp2a/feat-cxx2a.C: Added test cases for the __cpp_char8_t
 feature test macro.
 * g++.dg/ext/utf-array-short-wchar.C: Likewise.
 * g++.dg/ext/utf-array.C: Likewise.
 * g++.dg/ext/utf-cxx98.C: Likewise.
 * g++.dg/ext/utf-dflt.C: Likewise.
 * g++.dg/ext/utf-gnuxx98.C: Likewise.
 * gcc.dg/utf-array-short-wchar.c: Likewise.
 * gcc.dg/utf-array.c: Likewise.

Tom.


diff --git a/gcc/testsuite/c-c++-common/raw-string-13.c b/gcc/testsuite/c-c++-common/raw-string-13.c
index 1b37405cee9..fa11edaa7aa 100644
--- a/gcc/testsuite/c-c++-common/raw-string-13.c
+++ b/gcc/testsuite/c-c++-common/raw-string-13.c
@@ -62,6 +62,47 @@ const char s16[] = R"??(??)??";
 const char s17[] = R"?(?)??)?";
 const char s18[] = R"??(??)??)??)??";
 
+const char u800[] = u8R"??=??()??'??!??-\
+(a)#[{}]^|~";
+)??=??";
+const char u801[] = u8R"a(
+)\
+a"
+)a";
+const char u802[] = u8R"a(
+)a\
+"
+)a";
+const char u803[] = u8R"ab(
+)a\
+b"
+)ab";
+const char u804[] = u8R"a??/(x)a??/";
+const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??";
+const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/";
+const char u807[] = u8R"abc(??)\
+abc";)abc";
+const char u808[] = u8R"def(de)\
+def";)def";
+const char u809[] = u8R"a(??)\
+a"
+)a";
+const char u810[] = u8R"a(??)a\
+"
+)a";
+const char u811[] = u8R"ab(??)a\
+b"
+)ab";
+const char u812[] = u8R"a#(a#)a??=)a#";
+const char u813[] = u8R"a#(??)a??=??)a#";
+const char u814[] = u8R"??/(x)??/
+";)??/";
+const char u815[] = u8R"??/(??)??/
+";)??/";
+const char u816[] = u8R"??(??)??";
+const char u817[] = u8R"?(?)??)?";
+const char u818[] = u8R"??(??)??)??)??";
+
 const char16_t u00[] = uR"??=??()??'??!??-\
 (a)#[{}]^|~";
 )??=??";
@@ -211,6 +252,25 @@ main (void)
   TEST (s16, "??");
   TEST (s17, "?)??");
   TEST (s18, "??"")??"")??");
+  TEST (u800, u8"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n");
+  TEST (u801, u8"\n)\\\na\"\n");
+  TEST (u802, u8"\n)a\\\n\"\n");
+  TEST (u803, u8"\n)a\\\nb\"\n");
+  TEST (u804, u8"x");
+  TEST (u805, u8"abc");
+  TEST (u806, u8"abc");
+  TEST (u807, u8"??"")\\\nabc\";");
+  TEST (u808, u8"de)\\\ndef\";");
+  TEST (u809, u8"??"")\\\na\"\n");
+  TEST (u810, u8"??"")a\\\n\"\n");
+  TEST (u811, u8"??"")a\\\nb\"\n");
+  TEST (u812, u8"a#)a??""=");
+  TEST (u813, u8"??"")a??""=??");
+  TEST (u814, u8"x)??""/\n\";");
+  TEST (u815, u8"??"")??""/\n\";");
+  TEST (u816, u8"??");
+  TEST (u817, u8"?)??");
+  TEST (u818, u8"??"")??"")??");
   TEST (u00, u"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n");
   TEST (u01, u"\n)\\\na\"\n");
   TEST (u02, u"\n)a\\\n\"\n");
diff --git a/gcc/testsuite/c-c++-common/raw-string-15.c b/gcc/testsuite/c-c++-common/raw-string-15.c
index 9dfdaabd87d..1d101dc8393 100644
--- a/gcc/testsuite/c-c++-common/raw-string-15.c
+++ b/gcc/testsuite/c-c++-common/raw-string-15.c
@@ -62,6 +62,47 @@ const char s16[] = R"??(??)??";
 const char s17[] = R"?(?)??)?";
 const char s18[] = R"??(??)??)??)??";
 
+const char u800[] = u8R"??=??()??'??!??-\
+(a)#[{}]^|~";
+)??=??";
+const char u801[] = u8R"a(
+)\
+a"
+)a";
+const char u802[] = u8R"a(
+)a\
+"
+)a";
+const char u803[] = u8R"ab(
+)a\
+b"
+)ab";
+const char u804[] = u8R"a??/(x)a??/";
+const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??";
+const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/";
+const char u807[] = u8R"abc(??)\
+abc";)abc";
+const char u808[] = u8R"def(de)\
+de

Re: [REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support

2018-12-23 Thread Tom Honermann
Attached is a revised patch that addresses changes in P0482R6.  Changes 
from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811.

Tested on x86_64-linux.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  

 * name-lookup.c (get_std_name_hint): Added u8string as a name hint.

libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann  

 * config/abi/pre/gnu-versioned-namespace.ver (CXXABI_2.0): Add
 typeinfo symbols for char8_t.
 * config/abi/pre/gnu.ver: Add CXXABI_1.3.12.
 (GLIBCXX_3.4.26): Add symbols for specializations of
 numeric_limits and codecvt that involve char8_t.
 (CXXABI_1.3.12): Add typeinfo symbols for char8_t.
 * include/bits/atomic_base.h: Add atomic_char8_t.
 * include/bits/basic_string.h: Add std::hash<u8string> and
 operator""s(const char8_t*, size_t).
 * include/bits/c++config: Define _GLIBCXX_USE_CHAR8_T and
 __cpp_lib_char8_t.
 * include/bits/char_traits.h: Add char_traits<char8_t>.
 * include/bits/codecvt.h: Add
 codecvt<char16_t, char8_t, mbstate_t>,
 codecvt<char32_t, char8_t, mbstate_t>,
 codecvt_byname<char16_t, char8_t, mbstate_t>, and
 codecvt_byname<char32_t, char8_t, mbstate_t>.
 * include/bits/cpp_type_traits.h: Add __is_integer<char8_t> to
 recognize char8_t as an integral type.
 * include/bits/fs_path.h: (path::__is_encoded_char): Recognize
 char8_t.
 (path::u8string): Return std::u8string when char8_t support is
 enabled.
 (path::generic_u8string): Likewise.
 (path::_S_convert): Handle conversion from char8_t input.
 (path::_S_str_convert): Likewise.
 * include/bits/functional_hash.h: Add hash<char8_t>.
 * include/bits/locale_conv.h (__str_codecvt_out): Add overloads for
 char8_t.
 * include/bits/locale_facets.h (_GLIBCXX_NUM_UNICODE_FACETS): Bump
 for new char8_t specializations.
 * include/bits/localefwd.h: Add missing declarations of
 codecvt<char16_t, char, mbstate_t> and
 codecvt<char32_t, char, mbstate_t>.  Add char8_t declarations
 codecvt<char16_t, char8_t, mbstate_t> and
 codecvt<char32_t, char8_t, mbstate_t>.
 * include/bits/postypes.h: Add u8streampos.
 * include/bits/stringfwd.h: Add declarations of
 char_traits<char8_t> and u8string.
 * include/c_global/cstddef: Add __byte_operand<char8_t>.
 * include/experimental/bits/fs_path.h (path::__is_encoded_char):
 Recognize char8_t.
 (path::u8string): Return std::u8string when char8_t support is
 enabled.
 (path::generic_u8string): Likewise.
 (path::_S_convert): Handle conversion from char8_t input.
 (path::_S_str_convert): Likewise.
 * include/experimental/string: Add u8string.
 * include/experimental/string_view: Add u8string_view,
 hash<experimental::u8string_view>, and
 operator""sv(const char8_t*, size_t).
 * include/std/atomic: Add atomic<char8_t> and atomic_char8_t.
 * include/std/charconv (__is_int_to_chars_type): Recognize char8_t
 as a character type.
 * include/std/limits: Add numeric_limits<char8_t>.
 * include/std/string_view: Add u8string_view,
 hash<u8string_view>, and
 operator""sv(const char8_t*, size_t).
 * include/std/type_traits: Add __is_integral_helper<char8_t>,
 __make_unsigned<char8_t>, and __make_signed<char8_t>.
 * libsupc++/atomic_lockfree_defines.h: Define
 ATOMIC_CHAR8_T_LOCK_FREE.
 * src/c++11/Makefile.am: Compile with -fchar8_t when compiling
 codecvt.cc and limits.cc so that char8_t specializations of
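 numeric_limits and codecvt are emitted.
 * src/c++11/Makefile.in: Likewise.
 * src/c++11/codecvt.cc: Define members of
 codecvt<char16_t, char8_t, mbstate_t>,
 codecvt<char32_t, char8_t, mbstate_t>,
 codecvt_byname<char16_t, char8_t, mbstate_t>, and
 codecvt_byname<char32_t, char8_t, mbstate_t>.
 * src/c++11/limits.cc: Define members of
 numeric_limits<char8_t>.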
 * src/c++98/Makefile.am: Compile with -fchar8_t when compiling
 locale_init.cc and localename.cc.
 * src/c++98/Makefile.in: Likewise.
 * src/c++98/locale_init.cc: Add initialization for the
 codecvt<char16_t, char8_t, mbstate_t> and
 codecvt<char32_t, char8_t, mbstate_t> facets.
 * src/c++98/localename.cc: Likewise.
 * testsuite/util/testsuite_abi.cc: Validate ABI bump.

Tom.


diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 08632c382b7..5f2f8e865ca 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5543,6 +5543,7 @@ get_std_name_hint (const char *name)
 {"basic_string", "<string>", cxx98},
 {"string", "<string>", cxx98},
 {"wstring", "<string>", cxx98},
+{"u8string", "<string>", cxx2a},
 {"u16string", "<string>", cxx11},
 {"u32string", "<string>", cxx11},
 /* .  */
diff --git a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
index c448b813331..b26cf1dc8ac 100644
--- a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
+++ b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
@@ -301,6 +301,11 @@ CXXABI_2.0 {
 _ZTSN10__cxxabiv120__si_class_type_infoE;
 _ZTSN10__cxxabiv121__vmi_class_type_infoE;
 
+# typeinfo for char8_t
+_ZTIDu;
+_ZTIPDu;
+_ZTIPKDu;
+
 # typeinfo for char16_t and char32_t
 _ZTIDs;
 _ZTIPDs;
diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b

Re: [PATCH 2/9]: C++ P0482R5 char8_t: Core language support

2018-12-23 Thread Tom Honermann
Thanks, Jason.  I just sent a revised set of patches addressing most of 
your feedback with exceptions as described inline below.


On 12/17/18 4:47 PM, Tom Honermann wrote:

On 12/17/18 4:02 PM, Jason Merrill wrote:

On 12/5/18 11:16 AM, Jason Merrill wrote:

On 12/5/18 2:09 AM, Tom Honermann wrote:

On 12/3/18 5:01 PM, Jason Merrill wrote:

On 12/3/18 4:51 PM, Jason Merrill wrote:

On 11/5/18 2:39 PM, Tom Honermann wrote:
This patch adds support for the P0482R5 core language changes.  
This includes:

- The -fchar8_t and -fno-char8_t command line options.
- char8_t as a keyword.
- The char8_t builtin type as a non-aliasing unsigned integral
   character type of size 1.
- Use of char8_t as a simple type specifier.
- u8 character literals with type char8_t.
- u8 string literals with type array of const char8_t.
- User defined literal operators that accept char8_t and char8_t pointer
   types.
- New __cpp_char8_t predefined feature test macro.
- New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined
   macros.
- Name mangling and demangling for char8_t (using Du).

gcc/ChangeLog:

2018-11-04  Tom Honermann  

  * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann  
  * c-family/c-common.c (c_common_reswords): Add char8_t.
  (fix_string_type): Use char8_t for the type of u8 string literals.
  (c_common_get_alias_set): char8_t doesn't alias.
  (c_common_nodes_and_builtins): Define char8_t as a builtin type in
  C++.
  (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
  (keyword_begins_type_specifier): Add RID_CHAR8.
  * gcc/c-family/c-common.h (rid): Add RID_CHAR8.
  (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
  Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
  Define char8_type_node and char8_array_type_node.
  * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
  (c_cpp_builtins): Predefine __cpp_char8_t.
  * c-family/c-lex.c (lex_string): Use char8_array_type_node as the
  type of CPP_UTF8STRING.
  (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
  * c-family/c.opt: Add the -fchar8_t command line option.

gcc/c/ChangeLog:

2018-11-04  Tom Honermann  

  * c/c-typeck.c (char_type_p): Add char8_type_node.
  (digest_init): Handle initialization by a u8 string literal of
  char8_t type.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  

  * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
  * cp/decl.c (grokdeclarator): Handle invalid type specifier
  combinations involving char8_t.
  * cp/lex.c (init_reswords): Add char8_t as a reserved word.
  * cp/mangle.c (write_builtin_type): Add name mangling for char8_t
  (Du).
  * cp/parser.c (cp_keyword_starts_decl_specifier_p,
  cp_parser_simple_type_specifier): Recognize char8_t as a simple
  type specifier.
  (cp_parser_string_literal): Use char8_array_type_node for the type
  of CPP_UTF8STRING.
  (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system
  headers.
  * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
  * cp/tree.c (char_type_p): Recognize char8_t as a character type.
  * cp/typeck.c (string_conv_p): Handle conversions of u8 string
  literals of char8_t type.
  (check_literal_operator_args): Handle UDLs with u8 string literals
  of char8_t type.
  * cp/typeck2.c (digest_init_r): Disallow initializing a char array
  with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann  
  * cp-demangle.c (cplus_demangle_builtin_types,
  cplus_demangle_type): Add name demangling for char8_t (Du).
  * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the
  new char8_t type.



@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
 return -1;



+  /* Unlike char, char8_t doesn't alias. */
+  if (flag_char8_t && t == char8_type_node)
+    return -1;


This seems unnecessary; doesn't the existing code have the same 
effect? I think we could do with just an adjustment to the 
existing comment.

I'm not sure.  I had concerns about unintended matching due to 
char8_t having an underlying type of unsigned char.


That shouldn't be a problem: if char8_t is a distinct type, it won't 
match unsigned char, and if it's the same as unsigned char, 
flag_char8_t will be false.

I tried removing this check and that resulted in test 
gcc/testsuite/g++.dg/ext/char8_t-aliasing-1.C (added in patch 3/9) 
failing.  It seems this change is needed.  If you believe that implies 
that something is wrong elsewhere, please let me know.
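
For readers following the aliasing question: the check only matters because char8_t is a distinct type rather than a typedef for unsigned char. A minimal sketch of that distinction follows. The constant names are illustrative and not part of the patch, and the char8_t branch is only compiled when the compiler predefines __cpp_char8_t (for example with -fchar8_t or -std=c++2a).

```cpp
#include <type_traits>

#ifdef __cpp_char8_t
// char8_t uses unsigned char as its underlying type but is a distinct
// type, so the "character types may alias anything" rule for char and
// unsigned char does not extend to it.
constexpr bool char8_t_available = true;
constexpr bool distinct_from_unsigned_char =
    !std::is_same<char8_t, unsigned char>::value;
constexpr bool distinct_from_char = !std::is_same<char8_t, char>::value;
static_assert(sizeof(char8_t) == 1, "char8_t has size 1");
static_assert(std::is_unsigned<char8_t>::value, "char8_t is unsigned");
#else
// Without char8_t support there is no such type to compare against.
constexpr bool char8_t_available = false;
constexpr bool distinct_from_unsigned_char = false;
constexpr bool distinct_from_char = false;
#endif
```

The sketch only shows type identity; whether a particular access is treated as aliasing is decided by the compiler's alias sets, which is what the test mentioned above exercises.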


+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)

+  || (flag_char8_t && type == char8_type_node)
+  bool char8_array = (flag_char8_t && !!com

Re: [REVISED PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests

2018-12-23 Thread Tom Honermann
Attached is a revised patch that addresses changes in P0482R6.  Changes 
from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811.
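
Code that has to build both with and without char8_t can key off this macro. The sketch below is illustrative only: utf8_unit, char8_t_per_p0482r6, and u8_literal_unit are names made up for the example, not anything defined by the patch.

```cpp
#include <type_traits>

#ifdef __cpp_char8_t
using utf8_unit = char8_t;   // u8 literals are char8_t based
constexpr bool char8_t_per_p0482r6 = (__cpp_char8_t >= 201811L);
#else
using utf8_unit = char;      // u8 literals are still char based
constexpr bool char8_t_per_p0482r6 = false;
#endif

// Element type of a u8 string literal, with the reference, the array
// extent, and the const qualifier removed.
using u8_literal_unit = std::remove_cv<std::remove_extent<
    std::remove_reference<decltype(u8"x")>::type>::type>::type;

static_assert(std::is_same<u8_literal_unit, utf8_unit>::value,
              "u8 string literal element type follows __cpp_char8_t");
```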

Tested on x86_64-linux.

libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann  

 * testsuite/18_support/numeric_limits/char8_t.cc: New test cloned
 from char16_32_t.cc; validates numeric_limits.
 * testsuite/21_strings/basic_string/literals/types-char8_t.cc: New
 test cloned from types.cc; validates operator""s for char8_t
 returns u8string.
 * testsuite/21_strings/basic_string/literals/values-char8_t.cc: New
 test cloned from values.cc; validates construction and comparison
 of u8string values.
 * testsuite/21_strings/basic_string/requirements/
 /explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 basic_string.
 * testsuite/21_strings/basic_string_view/literals/types-char8_t.cc:
 New test cloned from types.cc; validates operator""sv for char8_t
 returns u8string_view.
 * testsuite/21_strings/basic_string_view/literals/
 values-char8_t.cc: New test cloned from values.cc; validates
 construction and comparison of u8string_view values.
 * testsuite/21_strings/basic_string_view/requirements/
 explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 basic_string_view.
 * testsuite/21_strings/char_traits/requirements/char8_t/65049.cc:
 New test cloned from char16_t/65049.cc; validates that
 char_traits is not vulnerable to the concerns in PR65049.
 * testsuite/21_strings/char_traits/requirements/char8_t/
 typedefs.cc: New test cloned from char16_t/typedefs.cc; validates
 that char_traits member typedefs are present and correct.
 * testsuite/21_strings/char_traits/requirements/
 explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 char_traits.
 * testsuite/22_locale/codecvt/char16_t-char8_t.cc: New test cloned
 from char16_t.cc: validates
 codecvt.
 * testsuite/22_locale/codecvt/char32_t-char8_t.cc: New test cloned
 from char32_t.cc: validates
 codecvt.
 * testsuite/22_locale/codecvt/utf8-char8_t.cc: New test cloned from
 utf8.cc; validates codecvt and
 codecvt.
 * testsuite/27_io/filesystem/path/native/string-char8_t.cc: New
 test cloned from string.cc; validates filesystem::path construction
 from char8_t input.
 * testsuite/experimental/feat-char8_t.cc: New test; validates that
 the __cpp_lib_char8_t feature test macro is defined with the
 correct value.
 * testsuite/experimental/filesystem/path/native/string-char8_t.cc:
 New test cloned from string.cc; validates filesystem::path
 construction from char8_t input.
 * testsuite/experimental/string_view/literals/types-char8_t.cc: New
 test cloned from types.cc; validates operator""sv for char8_t
 returns u8string_view.
 * testsuite/experimental/string_view/literals/values-char8_t.cc:
 New test cloned from values.cc; validates construction and
 comparison of u8string_view values.
 * testsuite/experimental/string_view/requirements/
 explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 basic_string_view.
 * testsuite/ext/char8_t/atomic-1.cc: New test; validates that
 ATOMIC_CHAR8_T_LOCK_FREE is not defined if char8_t support is not
 enabled.

Tom.


diff --git a/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc
new file mode 100644
index 000..346463d7244
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc
@@ -0,0 +1,71 @@
+// { dg-do run { target c++11 } }
+// { dg-require-cstdint "" }
+// { dg-options "-fchar8_t" }
+
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include 
+#include 
+#include 
+
+// Test specializations for char8_t.
+template
+  void
+  do_test()
+  {
+typedef std::numeric_limits char_type;
+typedef std::numeric_li

Re: [PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates

2018-12-23 Thread Tom Honermann
Thank you, Sandra!  I just sent a revised patch to the list that I 
believe addresses all of your comments.  Thanks for the suggestion to 
generate and check the pdf, that was helpful to ensure the changes 
rendered correctly.


Tom.

On 12/11/18 6:35 PM, Sandra Loosemore wrote:

On 11/5/18 12:39 PM, Tom Honermann wrote:
This patch adds documentation for new -fchar8_t and -fno-char8_t 
options.


gcc/ChangeLog:

2018-11-04  Tom Honermann  
  * doc/invoke.texi (-fchar8_t): Document new option.



My comments are all about nitpicky formatting things.


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 57491f1033c..cd3a2a715db 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -206,7 +206,7 @@ in the following sections.
 @item C++ Language Options
 @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
 @gccoptlist{-fabi-version=@var{n}  -fno-access-control @gol
--faligned-new=@var{n}  -fargs-in-order=@var{n}  -fcheck-new @gol
+-faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t -fcheck-new @gol


Please consistently use 2 spaces (not just 1) to separate options on 
the same line in a @gccoptlist environment.



 -fconstexpr-depth=@var{n} -fconstexpr-loop-limit=@var{n} @gol
 -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
@@ -2432,6 +2432,53 @@ but few users will need to override the default of


 This flag is enabled by default for @option{-std=c++17}.

+@item -fchar8_t
+@itemx -fno-char8_t
+@opindex fchar8_t
+@opindex fno-char8_t
+Enable support for the P0482 proposal including the addition of a
+new @code{char8_t} fundamental type, changes to the types of UTF-8
+string and character literals, new signatures for user defined
+literals, and new specializations of standard library class templates
+@code{std::numeric_limits}, @code{std::char_traits},
+and @code{std::hash}.
+
+This option enables functions to be overloaded for ordinary and UTF-8
+strings:
+
+@smallexample
+int f(const char *);    // #1
+int f(const char8_t *); // #2
+int v1 = f("text"); // Calls #1
+int v2 = f(u8"text");   // Calls #2
+@end smallexample
+
+and introduces new signatures for user defined literals:


@noindent immediately before the continued sentence of the paragraph 
before the example.


Also please hyphenate "user-defined" here.


+
+@smallexample
+int operator""_udl1(char8_t);
+int v3 = u8'x'_udl1;
+int operator""_udl2(const char8_t*, std::size_t);
+int v4 = u8"text"_udl2;
+template int operator""_udl3();
+int v5 = u8"text"_udl3;
+@end smallexample
+
+The change to the types of UTF-8 string and character literals introduces
+incompatibilities with ISO C++11 and later standards.  For example, the
+following code is well-formed under ISO C++11, but is ill-formed when
+@option{-fchar8_t} is specified.
+
+@smallexample
+char ca[] = u8"text";   // error: char-array initialized from wide string
+const char *cp = u8"text";  // error: invalid conversion from 'const char8_t*' to 'const char*'
+int f(const char*);
+auto v = f(u8"text");   // error: invalid conversion from 'const char8_t*' to 'const char*'
+std::string s1@{u8"text"@};   // error: no matching function for call to 'std::basic_string::basic_string()'
+using namespace std::literals;
+std::string s2 = u8"text"s; // error: conversion from 'basic_string' to non-scalar type 'basic_string' requested
+@end smallexample


The formatting of this code example is way too wide to fit on the page 
of the printed/PDF manual.  I suggest putting the comments on separate 
lines from the code and breaking them across multiple lines where 
necessary.  If you format the example for <80 columns it will probably 
fit, although you should check the PDF if at all possible.



+
 @item -fcheck-new
 @opindex fcheck-new
 Check that the pointer returned by @code{operator new} is non-null



-Sandra





PATCH: Updated error messages for ill-formed cases of array initialization by string literal

2018-12-27 Thread Tom Honermann
As requested by Jason in the review of the P0482 (char8_t) core language 
changes, this patch includes updates to the error messages emitted for 
ill-formed cases of array initialization with a string literal.  With 
these changes, error messages that previously looked something like these:


- "char-array initialized from wide string"
- "wide character array initialized from non-wide string"
- "wide character array initialized from incompatible wide string"

now look like:

- "cannot initialize array of type 'char' from a string literal with 
type array of 'short unsigned int'"
- "cannot initialize array of type 'short unsigned int' from a string 
literal with type array of 'char'"
- "cannot initialize array of type 'short unsigned int' from a string 
literal with type array of 'unsigned int'"


These changes affect both the C and C++ front ends.
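
As an illustration of the kind of code involved, here is a small C++ sketch. The variable names are invented for the example, and the ill-formed initializations are left as comments so that the block compiles. The exact type names printed in the diagnostic depend on the front end and the target, so none are reproduced here.

```cpp
// Well-formed: the array element type matches the literal's element type.
constexpr char     narrow[] = "text";
constexpr char16_t utf16[]  = u"text";
constexpr char32_t utf32[]  = U"text";

// Also well-formed: signed char and unsigned char arrays may be
// initialized from an ordinary string literal.
constexpr unsigned char bytes[] = "text";

// Ill-formed, kept as comments.  Each line pairs an array with a string
// literal of a different element type and would be rejected with the
// revised "cannot initialize array of type ..." message:
//   char     a[] = u"text";
//   char16_t b[] = "text";
//   char16_t c[] = U"text";
```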

These changes have dependencies on the (revised) set of patches 
submitted for P0482 (char8_t) and will not apply cleanly without them.


Tested on x86_64-linux.

gcc/c/ChangeLog:

2018-12-26  Tom Honermann  

 * c-typeck.c (digest_init): Revised the error message produced for
 ill-formed cases of array initialization with a string literal.

gcc/cp/ChangeLog:

2018-12-26  Tom Honermann  

 * typeck2.c (digest_init_r): Revised the error message produced for
 ill-formed cases of array initialization with a string literal.

gcc/testsuite/ChangeLog:

2018-12-26  Tom Honermann  

 * gcc/testsuite/g++.dg/ext/char8_t-init-2.C: Updated the expected
 error messages for ill-formed cases of array initialization with a
 string literal.
 * gcc/testsuite/g++.dg/ext/utf-array-short-wchar.C: Likewise.
 * gcc/testsuite/g++.dg/ext/utf-array.C: Likewise.
 * gcc/testsuite/g++.dg/ext/utf8-2.C: Likewise.
 * gcc/testsuite/gcc.dg/init-string-2.c: Likewise.
 * gcc/testsuite/gcc.dg/pr61096-1.c: Likewise.
 * gcc/testsuite/gcc.dg/utf-array-short-wchar.c: Likewise.
 * gcc/testsuite/gcc.dg/utf-array.c: Likewise.
 * gcc/testsuite/gcc.dg/utf8-2.c: Likewise.

Tom.

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 9d09b8d65fd..4d2129dff2f 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -7447,6 +7447,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
 	{
 	  struct c_expr expr;
 	  tree typ2 = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (inside_init)));
+	  bool incompat_string_cst = false;
 	  expr.value = inside_init;
 	  expr.original_code = (strict_string ? STRING_CST : ERROR_MARK);
 	  expr.original_type = NULL;
@@ -7464,27 +7465,22 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
 	{
 	  if (typ2 != char_type_node)
 		{
-		  error_init (init_loc, "char-array initialized from wide "
-			  "string");
-		  return error_mark_node;
+		  incompat_string_cst = true;
 		}
 	}
-	  else
+	  else if (!comptypes(typ1, typ2))
 	{
-	  if (typ2 == char_type_node)
-		{
-		  error_init (init_loc, "wide character array initialized "
-			  "from non-wide string");
-		  return error_mark_node;
-		}
-	  else if (!comptypes(typ1, typ2))
-		{
-		  error_init (init_loc, "wide character array initialized "
-			  "from incompatible wide string");
-		  return error_mark_node;
-		}
+	  incompat_string_cst = true;
 	}
 
+  if (incompat_string_cst)
+{
+	  error_at (init_loc, "cannot initialize array of type %qT from "
+	"a string literal with type array of %qT",
+	typ1, typ2);
+	  return error_mark_node;
+}
+
 	  if (TYPE_DOMAIN (type) != NULL_TREE
 	  && TYPE_SIZE (type) != NULL_TREE
 	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 782fd7f9cd5..ae3b53dc001 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1060,46 +1060,43 @@ digest_init_r (tree type, tree init, int nested, int flags,
 	  && TREE_CODE (init) == STRING_CST)
 	{
 	  tree char_type = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (init)));
+	  bool incompat_string_cst = false;
 
-	  if (TYPE_PRECISION (typ1) == BITS_PER_UNIT)
+	  if (typ1 != char_type)
 	{
-	  if (typ1 != char8_type_node && char_type == char8_type_node)
+	  /* The array element type does not match the initializing string
+	 literal element type. This is only allowed when initializing
+	 an array of signed char or unsigned char.  */
+	  if (TYPE_PRECISION (typ1) == BITS_PER_UNIT)
 		{
-		  if (complain & tf_error)
-		error_at (loc, "char-array initialized from UTF-8 string");
-		  return error_mark_node;
-		}
-	  else if (typ1 == char8_type_node && char_type == char_type_node)
-		{
-		  if (complain

[PATCH]: Fix PR c++/88095, class template argument deduction for literal operator templates per P0732 for C++2a

2019-08-02 Thread Tom Honermann

This patch fixes PR c++/88095:
- Bug 88095 - class nontype template parameter UDL string literals 
doesn't accepts deduction placeholder

- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88095.

It also addresses a latent issue; literal operator templates with 
template parameter packs of literal class type were previously accepted. 
 The patch corrects this and adds a test (udlit-class-nttp-neg.C).
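
For context, the PR concerns code of roughly the following shape. This is a reduced sketch written for illustration, not one of the tests in the patch: fixed_string, _udl, and the UDL_CLASS_NTTP_SKETCH macro are example names, and guarding on __cpp_nontype_template_args is an assumption about how to detect class-type non-type template parameter support.

```cpp
#include <cstddef>

#if defined(__cpp_nontype_template_args) && __cpp_nontype_template_args >= 201911L
#define UDL_CLASS_NTTP_SKETCH 1

template <std::size_t N>
struct fixed_string {
  char data[N] = {};
  static constexpr std::size_t length = N;
  // Non-explicit, so the implicit deduction guide generated from this
  // constructor can deduce N from a string literal.
  constexpr fixed_string(const char (&str)[N]) {
    for (std::size_t i = 0; i < N; ++i)
      data[i] = str[i];
  }
};

// 'fixed_string' without template arguments is a deduction placeholder;
// the arguments are deduced from the string literal at each use.
template <fixed_string fs>
constexpr std::size_t operator""_udl() {
  return decltype(fs)::length;
}

static_assert("test"_udl == 5, "four characters plus the terminating null");
#endif
```

With a compiler that lacks the feature the guarded region is skipped entirely, so the block still compiles in older language modes.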


In the change to gcc/cp/parser.c, it is not clear to me whether the 
'TREE_CODE (TREE_TYPE (parm)) == TEMPLATE_TYPE_PARM' comparison is 
necessary; it might be that 'CLASS_PLACEHOLDER_TEMPLATE' suffices on its 
own.


If accepted, I'd like to request this change be applied to gcc 9 as it 
is needed for one of the char8_t remediation approaches documented in 
P1423, and may be helpful for existing code bases impacted by the 
char8_t changes adopted via P0482 for C++20.

- https://wg21.link/p1423#emulate

Tested on x86_64-linux.

Thanks to Jeff Snyder for providing an initial patch in the 88095 PR.

gcc/cp/ChangeLog:

2019-08-02  Tom Honermann  

* parser.c (cp_parser_template_declaration_after_parameters): Enable
class template argument deduction for non-type template parameters in
literal operator templates.

gcc/testsuite/ChangeLog:

2019-08-02  Tom Honermann  

PR c++/88095
* g++.dg/cpp2a/udlit-class-nttp-ctad.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C: New test.
* g++.dg/cpp2a/udlit-class-nttp.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-neg.C: New test.
* g++.dg/cpp2a/udlit-class-nttp-neg2.C: New test.

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index c9091f523c5..a406bba41c5 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,9 @@
+2019-08-02  Tom Honermann  
+
+	* parser.c (cp_parser_template_declaration_after_parameters): Enable
+	class template argument deduction for non-type template parameters in
+	literal operator templates.
+
 2019-07-16  Jason Merrill  
 
 	* parser.c (make_location): Add overload taking cp_lexer* as last
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 1a5da1dd8e8..86f895e96a3 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -28105,7 +28105,10 @@ cp_parser_template_declaration_after_parameters (cp_parser* parser,
 	{
 	  tree parm_list = TREE_VEC_ELT (parameter_list, 0);
 	  tree parm = INNERMOST_TEMPLATE_PARMS (parm_list);
-	  if (CLASS_TYPE_P (TREE_TYPE (parm)))
+	  if ((CLASS_TYPE_P (TREE_TYPE (parm))
+	   || (TREE_CODE (TREE_TYPE (parm)) == TEMPLATE_TYPE_PARM
+		   && CLASS_PLACEHOLDER_TEMPLATE (TREE_TYPE (parm))))
+		  && !TEMPLATE_PARM_PARAMETER_PACK (DECL_INITIAL (parm)))
 		/* OK, C++20 string literal operator template.  We don't need
 		   to warn in lower dialects here because we will have already
 		   warned about the template parameter.  */;
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 0f47604da85..c8613deaae6 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,13 @@
+2019-08-02  Tom Honermann  
+
+	PR c++/88095
+	* g++.dg/cpp2a/udlit-class-nttp-ctad.C: New test.
+	* g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C: New test.
+	* g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C: New test.
+	* g++.dg/cpp2a/udlit-class-nttp.C: New test.
+	* g++.dg/cpp2a/udlit-class-nttp-neg.C: New test.
+	* g++.dg/cpp2a/udlit-class-nttp-neg2.C: New test.
+
 2019-07-18  Jan Hubicka  
 
 	* g++.dg/lto/alias-5_0.C: New testcase.
diff --git a/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C
new file mode 100644
index 000..437fa9b5ab8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C
@@ -0,0 +1,24 @@
+// PR c++/88095
+// Test class non-type template parameters for literal operator templates.
+// Validate handling of failed class template argument deduction.
+// { dg-do compile { target c++2a } }
+
+namespace std {
+using size_t = decltype(sizeof(int));
+}
+
+template 
+struct fixed_string {
+  constexpr static std::size_t length = N;
+  constexpr fixed_string(...) { }
+  // auto operator<=> (const fixed_string&) = default;
+};
+// Missing deduction guide.
+
+template 
+constexpr std::size_t operator"" _udl() {
+  return decltype(fs)::length;
+}
+
+static_assert("test"_udl == 5); // { dg-error "15:no matching function for call to" }
+// { dg-error "15:class template argument deduction failed" "" { target *-*-* } .-1 }
diff --git a/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C
new file mode 100644
index 000..89bb5d39d7d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C
@@ -0,0 +1,20 @@
+// PR c++/88095
+// T

Re: [PATCH]: Fix PR c++/88095, class template argument deduction for literal operator templates per P0732 for C++2a

2019-08-06 Thread Tom Honermann

On 8/5/19 3:05 PM, Jason Merrill wrote:

On 8/2/19 9:59 AM, Tom Honermann wrote:

This patch fixes PR c++/88095:
- Bug 88095 - class nontype template parameter UDL string literals 
doesn't accepts deduction placeholder

- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88095.

It also addresses a latent issue; literal operator templates with 
template parameter packs of literal class type were previously 
accepted.   The patch corrects this and adds a test 
(udlit-class-nttp-neg.C).


In the change to gcc/cp/parser.c, it is not clear to me whether the 
'TREE_CODE (TREE_TYPE (parm)) == TEMPLATE_TYPE_PARM' comparison is 
necessary; it might be that 'CLASS_PLACEHOLDER_TEMPLATE' suffices on 
its own.


template_placeholder_p would be a shorter way to write these, but I 
think even better would be to just change CLASS_TYPE_P to 
MAYBE_CLASS_TYPE_P.  I'll make that change and commit the patch, since 
it looks like you don't have commit access yet.
Thanks, and correct, I don't have commit access yet (and I'm not sure 
that I should! :) )


If accepted, I'd like to request this change be applied to gcc 9 as 
it is needed for one of the char8_t remediation approaches documented 
in P1423, and may be helpful for existing code bases impacted by the 
char8_t changes adopted via P0482 for C++20.

- https://wg21.link/p1423#emulate


Seems reasonable.  It may be too late to make 9.2 at this point, though.


Is there anything I can/should do to request inclusion?

Tom.



Jason





[PATCH 0/4]: C++ P1423R3 char8_t remediation implementation

2019-09-15 Thread Tom Honermann
This series of patches provides an implementation of the changes for C++ 
proposal P1423R3 [1].


These changes do not impact default libstdc++ behavior for C++17 and 
earlier; they are only active for C++2a or when the -fchar8_t option is 
specified.


Tested x86_64-linux.

Patch 1: Decouple constraints for u8path from path constructors.
Patch 2: Update __cpp_lib_char8_t feature test macro value, add deleted
         operators, update u8path.
Patch 3: Updates to existing tests.
Patch 4: New tests.

Tom.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html


[PATCH 1/4]: C++ P1423R3 char8_t remediation: Decouple constraints for u8path from path constructors

2019-09-15 Thread Tom Honermann
This patch moves helper classes and functions for std::filesystem::path 
out of the class definition to a detail namespace so that they are 
available to the implementations of std::filesystem::u8path.  Prior to 
this patch, the SFINAE constraints for those implementations were 
specified via delegation to the overloads of path constructors with a 
std::locale parameter; it just so happened that those overloads had the 
same constraints.  As of P1423R3, u8path and those overloads no longer 
have the same constraints, so this dependency must be broken.
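
The detection idiom being moved can be sketched in isolation as follows. This is a simplified stand-in, not the libstdc++ code: the detail_sketch namespace, the trait names without leading underscores, and accepts_source are all hypothetical, and only the basic_string case is modelled.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace detail_sketch {

// Character types that are accepted as an encoded source.
template <typename CharT> struct is_encoded_char : std::false_type {};
template <> struct is_encoded_char<char> : std::true_type {};
template <> struct is_encoded_char<wchar_t> : std::true_type {};
template <> struct is_encoded_char<char16_t> : std::true_type {};
template <> struct is_encoded_char<char32_t> : std::true_type {};

// Declared but never defined: these overloads are only used inside
// decltype, so the return type carries the answer.
template <typename CharT, typename Traits, typename Alloc>
is_encoded_char<CharT>
is_path_src(const std::basic_string<CharT, Traits, Alloc>&, int);

// Fallback; the ellipsis makes it the worst match.
template <typename Unknown>
std::false_type is_path_src(const Unknown&, ...);

template <typename Source>
struct constructible_from
  : decltype(is_path_src(std::declval<Source>(), 0)) {};

} // namespace detail_sketch

// Because the helpers live at namespace scope, a free function can use
// them directly instead of borrowing the constraints of a constructor.
template <typename Source,
          typename = typename std::enable_if<
              detail_sketch::constructible_from<Source>::value>::type>
bool accepts_source(const Source&) { return true; }
```

Keeping the helpers as private class members would have required either friendship or public exposure to get the same effect, which is the trade-off the alternatives paragraph below refers to.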


This patch also updates the experimental implementation of the 
filesystem TS to add SFINAE constraints to its implementations of 
u8path.  These functions were previously unconstrained and marked with a 
TODO comment.


This patch does not provide any intentional behavioral changes other 
than the added constraints to the experimental filesystem TS 
implementation of u8path.


I recommend applying the patch and viewing the diff with white space 
ignored when reviewing; there will be many fewer differences this way.


Alternatives to this refactoring would have been to make the u8path 
overloads friends of class path, or to make the helpers public members. 
Both of those approaches struck me as less desirable than this approach, 
though this approach does require more code changes and will affect 
implementation detail portions of mangled names for path constructors 
and inline member functions (mostly function template specializations).


libstdc++-v3/ChangeLog:

2019-09-15  Tom Honermann  

 * include/bits/fs_path.h: Moved helper utilities out of
   std::filesystem::path into a detail namespace to make them
   available for use by u8path.
 * include/experimental/bits/fs_path.h: Moved helper utilities out
   of std::experimental::filesystem::v1::path into a detail
   namespace to make them available for use by u8path.

Tom.
diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index e1083acf30f..71354515403 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -59,103 +59,114 @@ namespace filesystem
 {
 _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
-  /** @addtogroup filesystem
+  class path;
+
+namespace __detail
+{
+  /** @addtogroup filesystem-detail
*  @{
*/
 
-  /// A filesystem path.
-  class path
-  {
-template
-  using __is_encoded_char = __is_one_of,
-	char,
+  template
+using __is_encoded_char = __is_one_of,
+	  char,
 #ifdef _GLIBCXX_USE_CHAR8_T
-	char8_t,
+	  char8_t,
 #endif
 #if _GLIBCXX_USE_WCHAR_T
-	wchar_t,
+	  wchar_t,
 #endif
-	char16_t, char32_t>;
+	  char16_t, char32_t>;
 
-template>
-  using __is_path_iter_src
-	= __and_<__is_encoded_char,
-		 std::is_base_of>;
+  template>
+using __is_path_iter_src
+  = __and_<__is_encoded_char,
+	   std::is_base_of>;
 
-template
-  static __is_path_iter_src<_Iter>
-  __is_path_src(_Iter, int);
-
-template
-  static __is_encoded_char<_CharT>
-  __is_path_src(const basic_string<_CharT, _Traits, _Alloc>&, int);
-
-template
-  static __is_encoded_char<_CharT>
-  __is_path_src(const basic_string_view<_CharT, _Traits>&, int);
+  template
+static __is_path_iter_src<_Iter>
+__is_path_src(_Iter, int);
 
-template
-  static std::false_type
-  __is_path_src(const _Unknown&, ...);
-
-template
-  struct __constructible_from;
-
-template
-  struct __constructible_from<_Iter, _Iter>
-  : __is_path_iter_src<_Iter>
-  { };
+  template
+static __is_encoded_char<_CharT>
+__is_path_src(const basic_string<_CharT, _Traits, _Alloc>&, int);
 
-template
-  struct __constructible_from<_Source, void>
-  : decltype(__is_path_src(std::declval<_Source>(), 0))
-  { };
+  template
+static __is_encoded_char<_CharT>
+__is_path_src(const basic_string_view<_CharT, _Traits>&, int);
 
-template
-  using _Path = typename
-	std::enable_if<__and_<__not_, path>>,
-			  __not_>>,
-			  __constructible_from<_Tp1, _Tp2>>::value,
-		   path>::type;
+  template
+static std::false_type
+__is_path_src(const _Unknown&, ...);
 
-template
-  static _Source
-  _S_range_begin(_Source __begin) { return __begin; }
+  template
+struct __constructible_from;
 
-struct __null_terminated { };
+  template
+struct __constructible_from<_Iter, _Iter>
+: __is_path_iter_src<_Iter>
+{ };
 
-template
-  static __null_terminated
-  _S_range_end(_Source) { return {}; }
+  template
+struct __constructible_from<_Source, void>
+: decltype(__is_path_src(std::declval<_Source>(), 0))
+{ };
 
-template
-  static const _CharT*
-  _S_range_beg

[PATCH 2/4]: C++ P1423R3 char8_t remediation: Update feature test macro, add deleted operators, update u8path

2019-09-15 Thread Tom Honermann
This patch increments the __cpp_lib_char8_t feature test macro, adds 
deleted operator<< overloads for basic_ostream, and modifies u8path to 
accept sequences of char8_t for both the C++17 implementation of 
std::filesystem, and the filesystem TS implementation.


The implementation mechanism used for u8path differs between the C++17 
and filesystem TS implementations.  The changes to the former take 
advantage of C++17 'if constexpr'.  The changes to the latter retain 
C++11 compatibility and rely on tag dispatching.
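
The two mechanisms can be compared side by side in a reduced form. The sketch below is not the u8path code: route, pick_impl, pick_by_tag, and pick_by_if_constexpr are hypothetical names, and the choice between "transcode" for char and "direct" for everything else only mirrors the general shape of the change.

```cpp
#include <type_traits>

enum class route { direct, transcode };

// C++11 style: select an overload by passing a tag type.
constexpr route pick_impl(std::true_type /* plain char */) {
  return route::transcode;
}
constexpr route pick_impl(std::false_type /* anything else */) {
  return route::direct;
}
template <typename CharT>
constexpr route pick_by_tag() {
  return pick_impl(std::is_same<CharT, char>{});
}

#ifdef __cpp_if_constexpr
// C++17 style: a single function, with the untaken branch discarded.
template <typename CharT>
constexpr route pick_by_if_constexpr() {
  if constexpr (std::is_same<CharT, char>::value)
    return route::transcode;
  else
    return route::direct;
}
static_assert(pick_by_if_constexpr<char>() == pick_by_tag<char>(),
              "both mechanisms agree for char");
static_assert(pick_by_if_constexpr<char16_t>() == pick_by_tag<char16_t>(),
              "both mechanisms agree for other types");
#endif
```

Tag dispatching needs one overload per case but works in any C++11 compiler, which is why it suits the filesystem TS headers.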


libstdc++-v3/ChangeLog:

2019-09-15  Tom Honermann  

 * libstdc++-v3/include/bits/c++config: Bumped the value of the
   __cpp_lib_char8_t feature test macro.
 * libstdc++-v3/include/bits/fs_path.h (u8path): Modified u8path to
   accept sequences of char8_t.
 * libstdc++-v3/include/experimental/bits/fs_path.h (u8path):
   Modified u8path to accept sequences of char8_t.
 * libstdc++-v3/include/std/ostream: Added deleted overloads for
   wchar_t, char8_t, char16_t, and char32_t to the ordinary and wide
   formatted character and string inserters.

Tom.
diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index c8e099aaadd..5bcf32d95ef 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -620,7 +620,7 @@ namespace std
 # endif
 #endif
 #ifdef _GLIBCXX_USE_CHAR8_T
-# define __cpp_lib_char8_t 201811L
+# define __cpp_lib_char8_t 201907L
 #endif
 
 /* Define if __float128 is supported on this host. */
diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index 71354515403..f3f539412fc 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -153,9 +153,24 @@ namespace __detail
 
   template())),
-	   typename _Val = typename std::iterator_traits<_Iter>::value_type>
+	   typename _Val = typename std::iterator_traits<_Iter>::value_type,
+	   typename _UnqualVal = std::remove_const_t<_Val>>
 using __value_type_is_char
-  = std::enable_if_t, char>>;
+  = std::enable_if_t,
+			 _UnqualVal>;
+
+  template())),
+	   typename _Val = typename std::iterator_traits<_Iter>::value_type,
+	   typename _UnqualVal = std::remove_const_t<_Val>>
+using __value_type_is_char_or_char8_t
+  = std::enable_if_t<__or_v<
+			   std::is_same<_UnqualVal, char>
+#ifdef _GLIBCXX_USE_CHAR8_T
+			   ,std::is_same<_UnqualVal, char8_t>
+#endif
+			   >,
+			 _UnqualVal>;
 
   // @} group filesystem-detail
 } // namespace __detail
@@ -639,29 +654,41 @@ namespace __detail
   /// Create a path from a UTF-8-encoded sequence of char
   template,
-	   typename _Require2 = __detail::__value_type_is_char<_InputIterator>>
+	   typename _CharT =
+	 __detail::__value_type_is_char_or_char8_t<_InputIterator>>
 inline path
 u8path(_InputIterator __first, _InputIterator __last)
 {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-  // XXX This assumes native wide encoding is UTF-16.
-  std::codecvt_utf8_utf16 __cvt;
-  path::string_type __tmp;
-  if constexpr (is_pointer_v<_InputIterator>)
+#ifdef _GLIBCXX_USE_CHAR8_T
+  if constexpr (is_same_v<_CharT, char8_t>)
 	{
-	  if (__str_codecvt_in_all(__first, __last, __tmp, __cvt))
-	return path{ __tmp };
+	  return path{ __first, __last };
 	}
   else
 	{
-	  const std::string __u8str{__first, __last};
-	  const char* const __ptr = __u8str.data();
-	  if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt))
-	return path{ __tmp };
+#endif
+	  // XXX This assumes native wide encoding is UTF-16.
+	  std::codecvt_utf8_utf16 __cvt;
+	  path::string_type __tmp;
+	  if constexpr (is_pointer_v<_InputIterator>)
+	{
+	  if (__str_codecvt_in_all(__first, __last, __tmp, __cvt))
+		return path{ __tmp };
+	}
+	  else
+	{
+	  const std::string __u8str{__first, __last};
+	  const char* const __ptr = __u8str.data();
+	  if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt))
+		return path{ __tmp };
+	}
+	  _GLIBCXX_THROW_OR_ABORT(filesystem_error(
+	  "Cannot convert character sequence",
+	  std::make_error_code(errc::illegal_byte_sequence)));
+#ifdef _GLIBCXX_USE_CHAR8_T
 	}
-  _GLIBCXX_THROW_OR_ABORT(filesystem_error(
-	"Cannot convert character sequence",
-	std::make_error_code(errc::illegal_byte_sequence)));
+#endif
 #else
   // This assumes native normal encoding is UTF-8.
   return path{ __first, __last };
@@ -671,21 +698,32 @@ namespace __detail
   /// Create a path from a UTF-8-encoded sequence of char
   template,
-	   typename _Require2 = __detail::__value_type_is_char<_Source>>
+	   typename _CharT = __detail::__value_type_is_char_or_char8_t<_Source>>
 inline path
 u8path(const _Source& __source)
 {
 #ifdef

[PATCH 3/4]: C++ P1423R3 char8_t remediation: Updates to existing tests

2019-09-15 Thread Tom Honermann

This patch updates existing tests to validate the new value of the
__cpp_lib_char8_t feature test macro and to exercise u8path factory
function invocations with std::string, std::string_view, and iterator
pair arguments.

libstdc++-v3/ChangeLog:

2019-09-15  Tom Honermann  

 * libstdc++-v3/testsuite/experimental/feat-char8_t.cc: Updated the
   expected __cpp_lib_char8_t feature test macro value.
 * libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc:
   Added testing of u8path invocation with std::string,
   std::string_view, and iterators thereof.
 * libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc:
   Added testing of u8path invocation with std::string,
   std::string_view, and iterators thereof.

Tom.
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc
index aff722b5867..fb337ce1284 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc
@@ -19,6 +19,7 @@
 // { dg-do run { target c++17 } }
 
 #include 
+#include 
 #include 
 
 namespace fs = std::filesystem;
@@ -34,6 +35,22 @@ test01()
 
   p = fs::u8path("\xf0\x9d\x84\x9e");
   VERIFY( p.u8string() == u8"\U0001D11E" );
+
+  std::string s1 = "filename2";
+  p = fs::u8path(s1);
+  VERIFY( p.u8string() == u8"filename2" );
+
+  std::string s2 = "filename3";
+  p = fs::u8path(s2.begin(), s2.end());
+  VERIFY( p.u8string() == u8"filename3" );
+
+  std::string_view sv1{ s1 };
+  p = fs::u8path(sv1);
+  VERIFY( p.u8string() == u8"filename2" );
+
+  std::string_view sv2{ s2 };
+  p = fs::u8path(sv2.begin(), sv2.end());
+  VERIFY( p.u8string() == u8"filename3" );
 }
 
 void
diff --git a/libstdc++-v3/testsuite/experimental/feat-char8_t.cc b/libstdc++-v3/testsuite/experimental/feat-char8_t.cc
index e843604266c..c9b277a4626 100644
--- a/libstdc++-v3/testsuite/experimental/feat-char8_t.cc
+++ b/libstdc++-v3/testsuite/experimental/feat-char8_t.cc
@@ -12,6 +12,6 @@
 
 #ifndef  __cpp_lib_char8_t
 #  error "__cpp_lib_char8_t"
-#elif  __cpp_lib_char8_t != 201811L
-#  error "__cpp_lib_char8_t != 201811L"
+#elif  __cpp_lib_char8_t != 201907L
+#  error "__cpp_lib_char8_t != 201907L"
 #endif
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc b/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc
index bdeb3946a15..83219b7ddda 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path.cc
@@ -35,6 +35,14 @@ test01()
 
   p = fs::u8path("\xf0\x9d\x84\x9e");
   VERIFY( p.u8string() == u8"\U0001D11E" );
+
+  std::string s1 = "filename2";
+  p = fs::u8path(s1);
+  VERIFY( p.u8string() == u8"filename2" );
+
+  std::string s2 = "filename3";
+  p = fs::u8path(s2.begin(), s2.end());
+  VERIFY( p.u8string() == u8"filename3" );
 }
 
 void


[PATCH 4/4]: C++ P1423R3 char8_t remediation: New tests

2019-09-15 Thread Tom Honermann
This patch adds new tests to validate new deleted overloads of wchar_t, 
char8_t, char16_t, and char32_t for ordinary and wide formatted 
character and string ostream inserters.
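
The effect of a deleted inserter overload can be sketched with a small stand-in stream type.  Everything below is hypothetical (mini_ostream and is_insertable are not library names); it only illustrates how a deleted overload turns an otherwise accepted call into a compile-time error, which the detection trait observes as a substitution failure.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a narrow output stream; not std::ostream.
struct mini_ostream
{
  std::string buffer;
};

// Accepted: ordinary characters are appended to the buffer.
inline mini_ostream& operator<<(mini_ostream& os, char c)
{
  os.buffer.push_back(c);
  return os;
}

// Rejected: these exact-match overloads are selected by overload
// resolution and, being deleted, make the call ill-formed.
mini_ostream& operator<<(mini_ostream&, wchar_t) = delete;
mini_ostream& operator<<(mini_ostream&, char16_t) = delete;
mini_ostream& operator<<(mini_ostream&, char32_t) = delete;
#ifdef __cpp_char8_t
mini_ostream& operator<<(mini_ostream&, char8_t) = delete;
#endif

// Detection trait: true when 'os << T' is well-formed.
template <typename T, typename = void>
struct is_insertable : std::false_type {};

template <typename T>
struct is_insertable<T, std::void_t<decltype(std::declval<mini_ostream&>()
                                             << std::declval<T>())>>
  : std::true_type {};

// Uses the accepted overload and returns what was written.
inline std::string write_ok()
{
  mini_ostream os;
  os << 'o' << 'k';
  assert(os.buffer.size() == 2);
  return os.buffer;
}
```

Without the deleted overloads, a char16_t argument would convert implicitly to char and be accepted by this stand-in type.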


Additionally, new tests are added to validate invocations of u8path with 
sequences of char8_t for both the C++17 and filesystem TS implementations.


libstdc++-v3/ChangeLog:

2019-09-15  Tom Honermann  

 * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc:
   New test to validate deleted overloads of character and string
   inserters for narrow ostreams.
 * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc:
   New test to validate deleted overloads of character and string
   inserters for wide ostreams.
 * libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc:
   New test to validate u8path invocations with sequences of
   char8_t.
 * libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc:
   New test to validate u8path invocations with sequences of
   char8_t.

Tom.
diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc
new file mode 100644
index 000..87afb295086
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc
@@ -0,0 +1,43 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// 29.7.2 Header  synopsys; deleted character inserters.
+
+// Test character inserters defined as deleted by P1423.
+
+// { dg-options "-std=gnu++17 -fchar8_t" }
+// { dg-do compile { target c++17 } }
+
+#include 
+
+void test_character_inserters(std::ostream &os)
+{
+  os << 'x';   // ok.
+  os << L'x';  // { dg-error "use of deleted function" }
+  os << u8'x'; // { dg-error "use of deleted function" }
+  os << u'x';  // { dg-error "use of deleted function" }
+  os << U'x';  // { dg-error "use of deleted function" }
+}
+
+void test_string_inserters(std::ostream &os)
+{
+  os << "text";  // ok.
+  os << L"text";  // { dg-error "use of deleted function" }
+  os << u8"text"; // { dg-error "use of deleted function" }
+  os << u"text";  // { dg-error "use of deleted function" }
+  os << U"text";  // { dg-error "use of deleted function" }
+}
diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc
new file mode 100644
index 000..701de16822b
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc
@@ -0,0 +1,43 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// 29.7.2 Header  synopsys; deleted character inserters.
+
+// Test wide character inserters defined as deleted by P1423.
+
+// { dg-options "-std=gnu++17 -fchar8_t" }
+// { dg-do compile { target c++17 } }
+
+#include 
+
+void test_character_inserters(std::wostream &os)
+{
+  os << 'x';   // ok.
+  os << L'x';  // ok.
+  os << u8'x'; // { dg-error "use of deleted function" }
+  os << u'x';  // { dg-error "use of d

Re: [PATCH 2/4]: C++ P1423R3 char8_t remediation: Update feature test macro, add deleted operators, update u8path

2019-09-15 Thread Tom Honermann
A revised patch is attached that adds proper preprocessor conditionals 
around the deleted ostream inserters.  Apparently I had previously 
implemented a quick hack for testing purposes, neglected to add a FIXME 
comment, and then forgot about the hack.  Shame on me.


Tom.

On 9/15/19 3:39 PM, Tom Honermann wrote:
This patch increments the __cpp_lib_char8_t feature test macro, adds 
deleted operator<< overloads for basic_ostream, and modifies u8path to 
accept sequences of char8_t for both the C++17 implementation of 
std::filesystem, and the filesystem TS implementation.


The implementation mechanism used for u8path differs between the C++17 
and filesystem TS implementations.  The changes to the former take 
advantage of C++17 'if constexpr'.  The changes to the latter retain 
C++11 compatibility and rely on tag dispatching.


libstdc++-v3/ChangeLog:

2019-09-15  Tom Honermann  

  * libstdc++-v3/include/bits/c++config: Bumped the value of the
    __cpp_lib_char8_t feature test macro.
  * libstdc++-v3/include/bits/fs_path.h (u8path): Modified u8path to
    accept sequences of char8_t.
  * libstdc++-v3/include/experimental/bits/fs_path.h (u8path):
    Modified u8path to accept sequences of char8_t.
  * libstdc++-v3/include/std/ostream: Added deleted overloads of
    wchar_t, char8_t, char16_t, and char32_t for ordinary and wide
    formatted character and string inserters.

Tom.


diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index c8e099aaadd..5bcf32d95ef 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -620,7 +620,7 @@ namespace std
 # endif
 #endif
 #ifdef _GLIBCXX_USE_CHAR8_T
-# define __cpp_lib_char8_t 201811L
+# define __cpp_lib_char8_t 201907L
 #endif
 
 /* Define if __float128 is supported on this host. */
diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index 71354515403..f3f539412fc 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -153,9 +153,24 @@ namespace __detail
 
   template())),
-	   typename _Val = typename std::iterator_traits<_Iter>::value_type>
+	   typename _Val = typename std::iterator_traits<_Iter>::value_type,
+	   typename _UnqualVal = std::remove_const_t<_Val>>
 using __value_type_is_char
-  = std::enable_if_t, char>>;
+  = std::enable_if_t,
+			 _UnqualVal>;
+
+  template())),
+	   typename _Val = typename std::iterator_traits<_Iter>::value_type,
+	   typename _UnqualVal = std::remove_const_t<_Val>>
+using __value_type_is_char_or_char8_t
+  = std::enable_if_t<__or_v<
+			   std::is_same<_UnqualVal, char>
+#ifdef _GLIBCXX_USE_CHAR8_T
+			   ,std::is_same<_UnqualVal, char8_t>
+#endif
+			   >,
+			 _UnqualVal>;
 
   // @} group filesystem-detail
 } // namespace __detail
@@ -639,29 +654,41 @@ namespace __detail
   /// Create a path from a UTF-8-encoded sequence of char
   template,
-	   typename _Require2 = __detail::__value_type_is_char<_InputIterator>>
+	   typename _CharT =
+	 __detail::__value_type_is_char_or_char8_t<_InputIterator>>
 inline path
 u8path(_InputIterator __first, _InputIterator __last)
 {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-  // XXX This assumes native wide encoding is UTF-16.
-  std::codecvt_utf8_utf16 __cvt;
-  path::string_type __tmp;
-  if constexpr (is_pointer_v<_InputIterator>)
+#ifdef _GLIBCXX_USE_CHAR8_T
+  if constexpr (is_same_v<_CharT, char8_t>)
 	{
-	  if (__str_codecvt_in_all(__first, __last, __tmp, __cvt))
-	return path{ __tmp };
+	  return path{ __first, __last };
 	}
   else
 	{
-	  const std::string __u8str{__first, __last};
-	  const char* const __ptr = __u8str.data();
-	  if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt))
-	return path{ __tmp };
+#endif
+	  // XXX This assumes native wide encoding is UTF-16.
+	  std::codecvt_utf8_utf16 __cvt;
+	  path::string_type __tmp;
+	  if constexpr (is_pointer_v<_InputIterator>)
+	{
+	  if (__str_codecvt_in_all(__first, __last, __tmp, __cvt))
+		return path{ __tmp };
+	}
+	  else
+	{
+	  const std::string __u8str{__first, __last};
+	  const char* const __ptr = __u8str.data();
+	  if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt))
+		return path{ __tmp };
+	}
+	  _GLIBCXX_THROW_OR_ABORT(filesystem_error(
+	  "Cannot convert character sequence",
+	  std::make_error_code(errc::illegal_byte_sequence)));
+#ifdef _GLIBCXX_USE_CHAR8_T
 	}
-  _GLIBCXX_THROW_OR_ABORT(filesystem_error(
-	"Cannot convert character sequence",
-	std::make_error_code(errc::illegal_byte_sequence)));
+#endif
 #else
   // This assumes native normal encoding is UTF-8.
   return path{ __first, __last };
@@ 

Re: [PATCH 4/4]: C++ P1423R3 char8_t remediation: New tests

2019-09-15 Thread Tom Honermann
A revised patch is attached that modifies the tests for deleted ostream 
inserters to require C++2a.  This is required by the revision of patch 
2/4 that adds proper preprocessor conditionals to the definitions.


Tom.

On 9/15/19 3:40 PM, Tom Honermann wrote:
This patch adds new tests to validate new deleted overloads of wchar_t, 
char8_t, char16_t, and char32_t for ordinary and wide formatted 
character and string ostream inserters.


Additionally, new tests are added to validate invocations of u8path with 
sequences of char8_t for both the C++17 and filesystem TS implementations.


libstdc++-v3/ChangeLog:

2019-09-15  Tom Honermann  

  * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc:
    New test to validate deleted overloads of character and string
    inserters for narrow ostreams.
  * libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc:
    New test to validate deleted overloads of character and string
    inserters for wide ostreams.
  * libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc:
    New test to validate u8path invocations with sequences of
    char8_t.
  * libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc:
    New test to validate u8path invocations with sequences of
    char8_t.

Tom.


commit b7eb4714cc2c999ce0491358fcbcebf4a8723185
Author: Tom Honermann 
Date:   Sun Sep 15 22:25:28 2019 -0400

P1423R3 char8_t remediation: New tests

This patch adds new tests to validate new deleted overloads of wchar_t,
char8_t, char16_t, and char32_t for ordinary and wide formatted
character and string ostream inserters.

Additionally, new tests are added to validate invocations of u8path with
sequences of char8_t for both the C++17 and filesystem TS implementations.

diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc
new file mode 100644
index 000..f2eb538f42e
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc
@@ -0,0 +1,43 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// 29.7.2 Header  synopsys; deleted character inserters.
+
+// Test character inserters defined as deleted by P1423.
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include 
+
+void test_character_inserters(std::ostream &os)
+{
+  os << 'x';   // ok.
+  os << L'x';  // { dg-error "use of deleted function" }
+  os << u8'x'; // { dg-error "use of deleted function" }
+  os << u'x';  // { dg-error "use of deleted function" }
+  os << U'x';  // { dg-error "use of deleted function" }
+}
+
+void test_string_inserters(std::ostream &os)
+{
+  os << "text";  // ok.
+  os << L"text";  // { dg-error "use of deleted function" }
+  os << u8"text"; // { dg-error "use of deleted function" }
+  os << u"text";  // { dg-error "use of deleted function" }
+  os << U"text";  // { dg-error "use of deleted function" }
+}
diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc
new file mode 100644
index 000..1422a01aab3
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc
@@ -0,0 +1,43 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY

[PATCH 0/9]: C++ P0482R5 char8_t implementation

2018-11-05 Thread Tom Honermann
This series of patches provides an implementation of the core language 
and library changes for C++ proposal P0482R5 [1].  These changes are 
believed to be complete with the exception of the proposed mbrtoc8() and 
c8rtomb() functions (the expectation is that the C library will provide 
mbrtoc8() and c8rtomb(); future patches will address that support and 
integration).


These changes do not impact default gcc behavior.  A new -fchar8_t 
option is provided to enable the P0482R5 changes, and -fno-char8_t is 
provided to explicitly disable them.
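
Code that must build both with and without the option can test the __cpp_char8_t feature test macro.  The following is a minimal sketch; the names char8_enabled, u8_element and check_u8_literal are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// True when the compiler reports char8_t support.
#ifdef __cpp_char8_t
constexpr bool char8_enabled = true;
#else
constexpr bool char8_enabled = false;
#endif

// Element type of a u8 string literal, with the reference, the array
// extent and the const qualifier removed.
using u8_element = std::remove_cv_t<
    std::remove_extent_t<std::remove_reference_t<decltype(u8"x")>>>;

// Without char8_t support the element type is plain char.
constexpr bool u8_literal_is_plain_char =
    std::is_same<u8_element, char>::value;

static_assert(u8_literal_is_plain_char != char8_enabled,
              "u8 literals use char exactly when char8_t is disabled");

// The literal occupies one byte per code unit plus the terminator
// in either mode.
inline void check_u8_literal()
{
  assert(sizeof(u8"x") == 2);
}
```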


Patch 1: Documentation updates
Patch 2: Core language support
Patch 3: New core language tests
Patch 4: Updates to existing core language tests
Patch 5: Standard library support
Patch 6: A small correction to a common testsuite header file
Patch 7: New standard library tests
Patch 8: Updates to existing standard library tests
Patch 9: Updates to gdb pretty printing support

Tom.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0482r5.html


[PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates

2018-11-05 Thread Tom Honermann

This patch adds documentation for the new -fchar8_t and -fno-char8_t options.

gcc/ChangeLog:

2018-11-04  Tom Honermann  
 * doc/invoke.texi (-fchar8_t): Document new option.

Tom.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 57491f1033c..cd3a2a715db 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -206,7 +206,7 @@ in the following sections.
 @item C++ Language Options
 @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
 @gccoptlist{-fabi-version=@var{n}  -fno-access-control @gol
--faligned-new=@var{n}  -fargs-in-order=@var{n}  -fcheck-new @gol
+-faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t -fcheck-new @gol
 -fconstexpr-depth=@var{n}  -fconstexpr-loop-limit=@var{n} @gol
 -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
@@ -2432,6 +2432,53 @@ but few users will need to override the default of
 
 This flag is enabled by default for @option{-std=c++17}.
 
+@item -fchar8_t
+@itemx -fno-char8_t
+@opindex fchar8_t
+@opindex fno-char8_t
+Enable support for the P0482 proposal including the addition of a
+new @code{char8_t} fundamental type, changes to the types of UTF-8
+string and character literals, new signatures for user defined
+literals, and new specializations of standard library class templates
+@code{std::numeric_limits}, @code{std::char_traits},
+and @code{std::hash}.
+
+This option enables functions to be overloaded for ordinary and UTF-8
+strings:
+
+@smallexample
+int f(const char *);// #1
+int f(const char8_t *); // #2
+int v1 = f("text"); // Calls #1
+int v2 = f(u8"text");   // Calls #2
+@end smallexample
+
+and introduces new signatures for user defined literals:
+
+@smallexample
+int operator""_udl1(char8_t);
+int v3 = u8'x'_udl1;
+int operator""_udl2(const char8_t*, std::size_t);
+int v4 = u8"text"_udl2;
+template int operator""_udl3();
+int v5 = u8"text"_udl3;
+@end smallexample
+
+The change to the types of UTF-8 string and character literals introduces
+incompatibilities with ISO C++11 and later standards.  For example, the
+following code is well-formed under ISO C++11, but is ill-formed when
+@option{-fchar8_t} is specified.
+
+@smallexample
+char ca[] = u8"text";   // error: char-array initialized from wide string
+const char *cp = u8"text";  // error: invalid conversion from 'const char8_t*' to 'const char*'
+int f(const char*);
+auto v = f(u8"text");   // error: invalid conversion from 'const char8_t*' to 'const char*'
+std::string s1@{u8"text"@};   // error: no matching function for call to 'std::basic_string::basic_string()'
+using namespace std::literals;
+std::string s2 = u8"text"s; // error: conversion from 'basic_string' to non-scalar type 'basic_string' requested
+@end smallexample
+
 @item -fcheck-new
 @opindex fcheck-new
 Check that the pointer returned by @code{operator new} is non-null



[PATCH 2/9]: C++ P0482R5 char8_t: Core language support

2018-11-05 Thread Tom Honermann
This patch adds support for the P0482R5 core language changes.  This 
includes:

- The -fchar8_t and -fno-char8_t command line options.
- char8_t as a keyword.
- The char8_t builtin type as a non-aliasing unsigned integral
  character type of size 1.
- Use of char8_t as a simple type specifier.
- u8 character literals with type char8_t.
- u8 string literals with type array of const char8_t.
- User defined literal operators that accept char8_t and char8_t pointer
  types.
- New __cpp_char8_t predefined feature test macro.
- New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined
  macros.
- Name mangling and demangling for char8_t (using Du).
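
Some of the type properties listed above can be checked at compile time.  This is a sketch only: utf8_unit is a hypothetical alias, and the unsigned char fallback exists solely so the code also compiles when char8_t support is not enabled.

```cpp
#include <cassert>
#include <type_traits>

#ifdef __cpp_char8_t
// char8_t is a distinct type, not an alias of an existing character type.
static_assert(!std::is_same<char8_t, char>::value,
              "char8_t is distinct from char");
static_assert(!std::is_same<char8_t, unsigned char>::value,
              "char8_t is distinct from unsigned char");
using utf8_unit = char8_t;
#else
// Hypothetical fallback so the sketch compiles without char8_t support.
using utf8_unit = unsigned char;
#endif

// Size 1, as stated in the list above.
static_assert(sizeof(utf8_unit) == 1, "a UTF-8 code unit occupies one byte");

// Unsigned: converting -1 yields the largest value, which is positive.
constexpr bool utf8_unit_is_unsigned =
    static_cast<utf8_unit>(-1) > static_cast<utf8_unit>(0);

inline void check_utf8_unit()
{
  assert(utf8_unit_is_unsigned);
}
```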

gcc/ChangeLog:

2018-11-04  Tom Honermann  

 * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann  
 * c-family/c-common.c (c_common_reswords): Add char8_t.
 (fix_string_type): Use char8_t for the type of u8 string literals.
 (c_common_get_alias_set): char8_t doesn't alias.
 (c_common_nodes_and_builtins): Define char8_t as a builtin type in
 C++.
 (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
 (keyword_begins_type_specifier): Add RID_CHAR8.
 * gcc/c-family/c-common.h (rid): Add RID_CHAR8.
 (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
 Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
 Define char8_type_node and char8_array_type_node.
 * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine
 __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
 (c_cpp_builtins): Predefine __cpp_char8_t.
 * c-family/c-lex.c (lex_string): Use char8_array_type_node as the
 type of CPP_UTF8STRING.
 (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
 * c-family/c.opt: Add the -fchar8_t command line option.

gcc/c/ChangeLog:

2018-11-04  Tom Honermann  

 * c/c-typeck.c (char_type_p): Add char8_type_node.
 (digest_init): Handle initialization by a u8 string literal of
 char8_t type.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  

 * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
 * cp/decl.c (grokdeclarator): Handle invalid type specifier
 combinations involving char8_t.
 * cp/lex.c (init_reswords): Add char8_t as a reserved word.
 * cp/mangle.c (write_builtin_type): Add name mangling for char8_t
 (Du).
 * cp/parser.c (cp_keyword_starts_decl_specifier_p,
 cp_parser_simple_type_specifier): Recognize char8_t as a simple
 type specifier.
 (cp_parser_string_literal): Use char8_array_type_node for the type
 of CPP_UTF8STRING.
 (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system
 headers.
 * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
 * cp/tree.c (char_type_p): Recognize char8_t as a character type.
 * cp/typeck.c (string_conv_p): Handle conversions of u8 string
 literals of char8_t type.
 (check_literal_operator_args): Handle UDLs with u8 string literals
 of char8_t type.
 * cp/typeck2.c (digest_init_r): Disallow initializing a char array
 with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann  
 * cp-demangle.c (cplus_demangle_builtin_types,
 cplus_demangle_type): Add name demangling for char8_t (Du).
 * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the
 new char8_t type.

Tom.
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index f10cf89c3a7..c7d88eb9a22 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -79,6 +79,7 @@ machine_mode c_default_pointer_mode = VOIDmode;
 	tree signed_char_type_node;
 	tree wchar_type_node;
 
+	tree char8_type_node;
 	tree char16_type_node;
 	tree char32_type_node;
 
@@ -128,6 +129,11 @@ machine_mode c_default_pointer_mode = VOIDmode;
 
 	tree wchar_array_type_node;
 
+   Type `char8_t[SOMENUMBER]' or something like it.
+   Used when a UTF-8 string literal is created.
+
+	tree char8_array_type_node;
+
Type `char16_t[SOMENUMBER]' or something like it.
Used when a UTF-16 string literal is created.
 
@@ -450,6 +456,7 @@ const struct c_common_resword c_common_reswords[] =
   { "case",		RID_CASE,	0 },
   { "catch",		RID_CATCH,	D_CXX_OBJC | D_CXXWARN },
   { "char",		RID_CHAR,	0 },
+  { "char8_t",		RID_CHAR8,	D_CXX_CHAR8_T_FLAGS | D_CXXWARN },
   { "char16_t",		RID_CHAR16,	D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "char32_t",		RID_CHAR32,	D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "class",		RID_CLASS,	D_CXX_OBJC | D_CXXWARN },
@@ -746,6 +753,11 @@ fix_string_type (tree value)
   nchars = length;
   e_type = char_type_node;
 }
+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+{
+  nchars = length / (TYPE_PRECISION (char8_type_node) / BITS_PER_UNIT);
+  e_type = char8_type_node;
+}
   else if (TREE_TYPE (value) == char16_array_type_node)
 {
   nchars = length / (TYPE_P

[PATCH 3/9]: C++ P0482R5 char8_t: New core language tests

2018-11-05 Thread Tom Honermann
This patch adds new tests that exercise the new behavior when char8_t 
support is enabled and that guard against unintended behavioral changes 
when it is not enabled.  In some cases, existing tests already cover the 
existing behavior, and those tests have been cloned to validate behavior 
when char8_t is enabled.  In other cases, tests are added to validate 
behavior both with and without char8_t support enabled.


gcc/testsuite/ChangeLog:

2018-11-04  Tom Honermann  
 * g++.dg/cpp0x/udlit-implicit-conv-neg-char8_t.C: New test cloned
 from udlit-implicit-conv-neg.C.  Validates handling of ill-formed
 uses of char8_t based user defined literals.
 * g++.dg/cpp0x/udlit-resolve-char8_t.C: New test cloned from
 udlit-resolve.C.  Validates handling of well-formed uses of char8_t
 based user defined literals.
 * g++.dg/ext/char8_t-aliasing-1.C: New test; validates warnings
 for type punning with char8_t types.  Illustrates that char8_t does
 not alias.
 * g++.dg/ext/char8_t-char-literal-1.C: New test; validates u8
 character literals have type char if char8_t support is not
 enabled.
 * g++.dg/ext/char8_t-char-literal-2.C: New test; validates u8
 character literals have type char8_t if char8_t support is
 enabled.
 * g++.dg/ext/char8_t-deduction-1.C: New test; validates char is
 deduced for u8 character and string literals if char8_t support is
 not enabled.
 * g++.dg/ext/char8_t-deduction-2.C: New test; validates char8_t is
 deduced for u8 character and string literals if char8_t support is
 enabled.
 * g++.dg/ext/char8_t-feature-test-macro-1.C: New test; validates
 that the __cpp_char8_t feature test macro is not defined if char8_t
 support is not enabled.
 * g++.dg/ext/char8_t-feature-test-macro-2.C: New test; validates
 that the __cpp_char8_t feature test macro is defined with the
 correct value if char8_t support is enabled.
 * g++.dg/ext/char8_t-init-1.C: New test; validates initialization
 by u8 character and string literals when support for char8_t is not
 enabled.
 * g++.dg/ext/char8_t-init-2.C: New test; validates initialization
 by u8 character and string literals when support for char8_t is
 enabled.
 * g++.dg/ext/char8_t-keyword-1.C: New test; validates that char8_t
 is not a keyword if support for char8_t is not enabled.
 * g++.dg/ext/char8_t-keyword-2.C: New test; validates that char8_t
 is a keyword if support for char8_t is enabled.
 * g++.dg/ext/char8_t-limits-1.C: New test; validates that char8_t
 is unsigned and sufficiently large to store the required range of
 char8_t values.
 * g++.dg/ext/char8_t-overload-1.C: New test; validates overload
 resolution for u8 character and string literal arguments when
 support for char8_t is not enabled.
 * g++.dg/ext/char8_t-overload-2.C: New test; validates overload
 resolution for u8 character and string literal arguments when
 support for char8_t is enabled.
 * g++.dg/ext/char8_t-predefined-macros-1.C: New test; validates
 that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE
 predefined macros are not defined when support for char8_t is not
 enabled.
 * g++.dg/ext/char8_t-predefined-macros-2.C: New test; validates
 that the __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE
 predefined macros are defined when support for char8_t is enabled.
 * g++.dg/ext/char8_t-sizeof-1.C: New test; validates that the size
 of char8_t and u8 character literals is 1.
 * g++.dg/ext/char8_t-specialization-1.C: New test; validates
 template specialization for u8 character literal template
 arguments when support for char8_t is not enabled.
 * g++.dg/ext/char8_t-specialization-2.C: New test; validates
 template specialization for char8_t and u8 character literal
 template arguments when support for char8_t is enabled.
 * g++.dg/ext/char8_t-string-literal-1.C: New test; validates the
 type of u8 string literals when support for char8_t is not enabled.
 * g++.dg/ext/char8_t-string-literal-2.C: New test; validates the
 type of u8 string literals when support for char8_t is enabled.
 * g++.dg/ext/char8_t-type-specifier-1.C: New test; validates that
 char8_t is not recognized as a type specifier when support for
 char8_t is not enabled.
 * g++.dg/ext/char8_t-type-specifier-2.C: New test; validates that
 char8_t is recognized as a type specifier when support for char8_t
 is enabled.
 * g++.dg/ext/char8_t-typedef-1.C: New test; validates that
 declarations of char8_t as a typedef are accepted when support for
 char8_t is not enabled.
 * g++.dg/ext/char8_t-typedef-2.C: New test; validates that
 declarations of char8_t as a typedef are not accepted when support
 for char8_t is enabled.
 * g++.dg/ext/char8_t-udl-1.C: New test; validates overloading for
 u8
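
For readers who want a concrete picture of what the literal-type and deduction tests above check, here is a minimal sketch. It is not taken from the patch; the names expected_u8_unit and deduces_expected_unit are made up for illustration. It keys off the __cpp_char8_t feature test macro, so it should compile whether or not char8_t support is enabled.

```cpp
#include <cstddef>
#include <type_traits>

// Element type a u8 string literal is expected to have in this build.
#ifdef __cpp_char8_t
using expected_u8_unit = char8_t;   // char8_t support enabled
#else
using expected_u8_unit = char;      // char8_t support not enabled
#endif

// A u8 string literal is an lvalue array of const code units; "ab" plus
// the terminating null gives three elements.
static_assert(std::is_same<decltype(u8"ab"),
                           const expected_u8_unit (&)[3]>::value,
              "unexpected type for a u8 string literal");

// Either way, one code unit occupies exactly one byte.
static_assert(sizeof(expected_u8_unit) == 1, "a u8 code unit has size 1");

// Template argument deduction sees the same element type.
template <typename T, std::size_t N>
constexpr bool deduces_expected_unit(const T (&)[N]) {
    return std::is_same<T, expected_u8_unit>::value;
}

static_assert(deduces_expected_unit(u8"ab"),
              "deduction should yield the u8 code unit type");
```

Compiling the same file once with and once without -fchar8_t exercises both branches.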

[PATCH 6/9]: C++ P0482R5 char8_t: A small correction to a common testsuite header file

2018-11-05 Thread Tom Honermann
This patch corrects ambiguous partial specializations of 
typelist::detail::append_.  Previously, neither append_, 
Typelist_Chain> nor append_ was a better 
match for append_, null_type>.


libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann  

 * include/ext/typelist.h: Constrained a partial specialization of
   typelist::detail::append_ to only match chain.

Tom.
diff --git a/libstdc++-v3/include/ext/typelist.h b/libstdc++-v3/include/ext/typelist.h
index b21f01ffb43..2cdbc3efafa 100644
--- a/libstdc++-v3/include/ext/typelist.h
+++ b/libstdc++-v3/include/ext/typelist.h
@@ -215,10 +215,10 @@ namespace detail
   typedef Typelist_Chain 			  		type;
 };
 
-  template
-struct append_
+  template
+struct append_, null_type>
 {
-  typedef Typelist_Chain 	type;
+  typedef chain  	type;
 };
 
   template<>


[PATCH 5/9]: C++ P0482R5 char8_t: Standard library support

2018-11-05 Thread Tom Honermann
This patch adds support to libstdc++ for the P0482R5 standard library 
changes.  This includes:

- New char8_t based specializations:
  - std::numeric_limits
  - std::char_traits
  - std::hash
  - std::hash
  - std::hash
  - std::codecvt
  - std::codecvt
  - std::codecvt_byname
  - std::codecvt_byname
- New char8_t overloads:
  - u8string operator "" s(const char8_t* str, size_t len);
  - u8string_view operator""sv(const char8_t* str, size_t len);
- New type aliases:
  - std::u8string
  - std::u8string_view
  - std::atomic_char8_t
- Changed function signatures:
  - filesystem::path::u8string() returns u8string.
  - filesystem::path::generic_u8string() returns u8string.
- typeinfo for char8_t.
- New macros:
  - __cpp_lib_char8_t
  - ATOMIC_CHAR8_T_LOCK_FREE
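
As a rough illustration of how user code might consume these additions (a sketch only, not part of the patch; utf8_string and make_sample are hypothetical names), the library feature test macro can select between std::u8string and std::string:

```cpp
#include <string>
#include <type_traits>

// Pick the string type that u8 literals map to in this build.
// Assumption: the library macro is visible after including <string>.
#if defined(__cpp_char8_t) && defined(__cpp_lib_char8_t)
using utf8_string = std::u8string;
#else
using utf8_string = std::string;
#endif

using namespace std::literals;

// operator""s applied to a u8 literal yields the matching string type.
static_assert(std::is_same<decltype(u8"xx"s), utf8_string>::value,
              "u8 literal with the s suffix should produce utf8_string");

// U+00E9 takes two UTF-8 code units, so "caf\u00e9" is 5 units plus the null.
static_assert(sizeof(u8"caf\u00e9") == 6, "expected 6 code units");

// Hypothetical helper returning a UTF-8 encoded sample string.
inline utf8_string make_sample() { return u8"caf\u00e9"s; }
```

The sketch assumes the compiler and the library agree on char8_t support; a toolchain that enables the type in the compiler but lacks the library pieces would need a different fallback.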

For types and templates that existed in an experimental form prior to 
standardization, both the experimental and standardized variants have 
been updated.  The updates to the experimental versions are optional.


I'm not very familiar with how ABI versioning is done and I'm not 
confident that the changes in the .ver files are correct.  In 
particular, I'm unsure as to whether a CXXABI_3.0 section may be needed 
in gnu-versioned-namespace.ver and whether I'm correct in adding a new 
CXXABI_1.3.12 section in gnu.ver.  If I'm not mistaken, CXXABI has not 
yet been bumped for GCC 9 and so needs to be, whereas GLIBCXX has 
already been bumped and therefore does not.


gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  

 * name-lookup.c (get_std_name_hint): Added u8string as a name hint.

libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann  

 * config/abi/pre/gnu-versioned-namespace.ver (CXXABI_2.0): Add
 typeinfo symbols for char8_t.
 * config/abi/pre/gnu.ver: Add CXXABI_1.3.12.
 (GLIBCXX_3.4.26): Add symbols for specializations of
 numeric_limits and codecvt that involve char8_t.
 (CXXABI_1.3.12): Add typeinfo symbols for char8_t.
 * include/bits/atomic_base.h: Add atomic_char8_t.
 * include/bits/basic_string.h: Add std::hash and
 operator""s(const char8_t*, size_t).
 * include/bits/c++config: Define _GLIBCXX_USE_CHAR8_T and
 __cpp_lib_char8_t.
 * include/bits/char_traits.h: Add char_traits.
 * include/bits/codecvt.h: Add
 codecvt,
 codecvt,
 codecvt_byname, and
 codecvt_byname.
 * include/bits/cpp_type_traits.h: Add __is_integer to
 recognize char8_t as an integral type.
 * include/bits/fs_path.h: (path::__is_encoded_char): Recognize
 char8_t.
 (path::u8string): Return std::u8string when char8_t support is
 enabled.
 (path::generic_u8string): Likewise.
 (path::_S_convert): Handle conversion from char8_t input.
 (path::_S_str_convert): Likewise.
 * include/bits/functional_hash.h: Add hash.
 * include/bits/locale_conv.h (__str_codecvt_out): Add overloads for
 char8_t.
 * include/bits/locale_facets.h (_GLIBCXX_NUM_UNICODE_FACETS): Bump
 for new char8_t specializations.
 * include/bits/localefwd.h: Add missing declarations of
 codecvt and
 codecvt.  Add char8_t declarations
 codecvt and
 codecvt.
 * include/bits/postypes.h: Add u8streampos.
 * include/bits/stringfwd.h: Add declarations of
 char_traits and u8string.
 * include/c_global/cstddef: Add __byte_operand.
 * include/experimental/bits/fs_path.h (path::__is_encoded_char):
 Recognize char8_t.
 (path::u8string): Return std::u8string when char8_t support is
 enabled.
 (path::generic_u8string): Likewise.
 (path::_S_convert): Handle conversion from char8_t input.
 (path::_S_str_convert): Likewise.
 * include/experimental/string: Add u8string.
 * include/experimental/string_view: Add u8string_view,
 hash, and
 operator""sv(const char8_t*, size_t).
 * include/std/atomic: Add atomic and atomic_char8_t.
 * include/std/charconv (__is_int_to_chars_type): Recognize char8_t
 as a character type.
 * include/std/limits: Add numeric_limits.
 * include/std/string_view: Add u8string_view,
 hash, and
 operator""sv(const char8_t*, size_t).
 * include/std/type_traits: Add __is_integral_helper,
 __make_unsigned, and __make_signed.
 * libsupc++/atomic_lockfree_defines.h: Define
 ATOMIC_CHAR8_T_LOCK_FREE.
 * src/c++11/Makefile.am: Compile with -fchar8_t when compiling
 codecvt.cc and limits.cc so that char8_t specializations of
 numeric_limits and codecvt are emitted.
 * src/c++11/Makefile.in: Likewise.
 * src/c++11/codecvt.cc: Define members of
 codecvt,
 codecvt,
 codecvt_byname, and
 codecvt_byname.
 * src/c++11/limits.cc: Define members of
 numeric_limits.
 * src/c++98/Makefile.am: Compile with -fchar8_t when compiling
 locale_init.cc and localename.cc.
 * src/c++98/Makefile.in: Likewise.
 * src/c++98/locale_init.cc: Add initialization f

[PATCH 4/9]: C++ P0482R5 char8_t: Updates to existing core language tests

2018-11-05 Thread Tom Honermann
This patch closes existing testing gaps related to support for u8 
character and string literals.  None of these changes exercise new 
char8_t functionality; they are intended to guard against regressions in 
behavior of u8 literals when support for char8_t is not enabled.
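
To show the kind of equivalence the raw-string tests rely on, here is a small sketch (not from the patch; same_units is a hypothetical helper) that compares a u8 raw string literal against its escaped spelling at compile time. It deliberately avoids trigraphs, which the real tests also cover.

```cpp
#include <cstddef>

// Compare two string literals code unit by code unit, including the
// terminating null.  Literals of different lengths compare unequal.
template <typename T, std::size_t N, std::size_t M>
constexpr bool same_units(const T (&a)[N], const T (&b)[M], std::size_t i = 0) {
    return N == M && (i == N || (a[i] == b[i] && same_units(a, b, i + 1)));
}

// Inside a raw string a backslash is an ordinary character, so the raw
// spelling a\nb matches the escaped spelling "a\\nb".
static_assert(same_units(u8R"x(a\nb)x", u8"a\\nb"),
              "raw and escaped spellings should match");

// Double quotes need no escaping inside a raw string.
static_assert(same_units(u8R"q(say "hi")q", u8"say \"hi\""),
              "quotes inside a raw string are literal");
```

Because both arguments must share one element type T, the check works unchanged whether u8 literals are arrays of char or of char8_t.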


gcc/testsuite/ChangeLog:

2018-11-04  Tom Honermann  

 * c-c++-common/raw-string-13.c: Added test cases for u8 raw string
 literals.
 * c-c++-common/raw-string-15.c: Likewise.
 * g++.dg/cpp0x/constexpr-wstring2.C: Added test cases for u8
 literals.
 * g++.dg/ext/utf-array-short-wchar.C: Likewise.
 * g++.dg/ext/utf-array.C: Likewise.
 * g++.dg/ext/utf-cxx98.C: Likewise.
 * g++.dg/ext/utf-dflt.C: Likewise.
 * g++.dg/ext/utf-gnuxx98.C: Likewise.
 * gcc.dg/utf-array-short-wchar.c: Likewise.
 * gcc.dg/utf-array.c: Likewise.

Tom.
diff --git a/gcc/testsuite/c-c++-common/raw-string-13.c b/gcc/testsuite/c-c++-common/raw-string-13.c
index 1b37405cee9..fa11edaa7aa 100644
--- a/gcc/testsuite/c-c++-common/raw-string-13.c
+++ b/gcc/testsuite/c-c++-common/raw-string-13.c
@@ -62,6 +62,47 @@ const char s16[] = R"??(??)??";
 const char s17[] = R"?(?)??)?";
 const char s18[] = R"??(??)??)??)??";
 
+const char u800[] = u8R"??=??()??'??!??-\
+(a)#[{}]^|~";
+)??=??";
+const char u801[] = u8R"a(
+)\
+a"
+)a";
+const char u802[] = u8R"a(
+)a\
+"
+)a";
+const char u803[] = u8R"ab(
+)a\
+b"
+)ab";
+const char u804[] = u8R"a??/(x)a??/";
+const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??";
+const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/";
+const char u807[] = u8R"abc(??)\
+abc";)abc";
+const char u808[] = u8R"def(de)\
+def";)def";
+const char u809[] = u8R"a(??)\
+a"
+)a";
+const char u810[] = u8R"a(??)a\
+"
+)a";
+const char u811[] = u8R"ab(??)a\
+b"
+)ab";
+const char u812[] = u8R"a#(a#)a??=)a#";
+const char u813[] = u8R"a#(??)a??=??)a#";
+const char u814[] = u8R"??/(x)??/
+";)??/";
+const char u815[] = u8R"??/(??)??/
+";)??/";
+const char u816[] = u8R"??(??)??";
+const char u817[] = u8R"?(?)??)?";
+const char u818[] = u8R"??(??)??)??)??";
+
 const char16_t u00[] = uR"??=??()??'??!??-\
 (a)#[{}]^|~";
 )??=??";
@@ -211,6 +252,25 @@ main (void)
   TEST (s16, "??");
   TEST (s17, "?)??");
   TEST (s18, "??"")??"")??");
+  TEST (u800, u8"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n");
+  TEST (u801, u8"\n)\\\na\"\n");
+  TEST (u802, u8"\n)a\\\n\"\n");
+  TEST (u803, u8"\n)a\\\nb\"\n");
+  TEST (u804, u8"x");
+  TEST (u805, u8"abc");
+  TEST (u806, u8"abc");
+  TEST (u807, u8"??"")\\\nabc\";");
+  TEST (u808, u8"de)\\\ndef\";");
+  TEST (u809, u8"??"")\\\na\"\n");
+  TEST (u810, u8"??"")a\\\n\"\n");
+  TEST (u811, u8"??"")a\\\nb\"\n");
+  TEST (u812, u8"a#)a??""=");
+  TEST (u813, u8"??"")a??""=??");
+  TEST (u814, u8"x)??""/\n\";");
+  TEST (u815, u8"??"")??""/\n\";");
+  TEST (u816, u8"??");
+  TEST (u817, u8"?)??");
+  TEST (u818, u8"??"")??"")??");
   TEST (u00, u"??""??"")??""'??""!??""-\\\n(a)#[{}]^|~\";\n");
   TEST (u01, u"\n)\\\na\"\n");
   TEST (u02, u"\n)a\\\n\"\n");
diff --git a/gcc/testsuite/c-c++-common/raw-string-15.c b/gcc/testsuite/c-c++-common/raw-string-15.c
index 9dfdaabd87d..1d101dc8393 100644
--- a/gcc/testsuite/c-c++-common/raw-string-15.c
+++ b/gcc/testsuite/c-c++-common/raw-string-15.c
@@ -62,6 +62,47 @@ const char s16[] = R"??(??)??";
 const char s17[] = R"?(?)??)?";
 const char s18[] = R"??(??)??)??)??";
 
+const char u800[] = u8R"??=??()??'??!??-\
+(a)#[{}]^|~";
+)??=??";
+const char u801[] = u8R"a(
+)\
+a"
+)a";
+const char u802[] = u8R"a(
+)a\
+"
+)a";
+const char u803[] = u8R"ab(
+)a\
+b"
+)ab";
+const char u804[] = u8R"a??/(x)a??/";
+const char u805[] = u8R"abcdefghijklmn??(abc)abcdefghijklmn??";
+const char u806[] = u8R"abcdefghijklm??/(abc)abcdefghijklm??/";
+const char u807[] = u8R"abc(??)\
+abc";)abc";
+const char u808[] = u8R"def(de)\
+def";)def";
+const char u809[] = u8R"a(??)\
+a"
+)a";
+const char u810[] = u8R"

[PATCH 8/9]: C++ P0482R5 char8_t: Updates to existing standard library tests

2018-11-05 Thread Tom Honermann
This patch augments existing tests to validate behavior for char8_t.  In 
all cases, added test cases are cloned from existing tests for wchar_t 
or char16_t.


A few tests required updates to line numbers for diagnostic messages.

libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann  

 * testsuite/18_support/byte/ops.cc: Validate
 std::to_integer, std::to_integer, and
 std::to_integer.
 * testsuite/18_support/numeric_limits/dr559.cc: Validate
 std::numeric_limits.
 * testsuite/18_support/numeric_limits/lowest.cc: Validate
 std::numeric_limits::lowest().
 * testsuite/18_support/numeric_limits/max_digits10.cc: Validate
 std::numeric_limits::max_digits10.
 * testsuite/18_support/type_info/fundamental.cc: Validate
 typeinfo for char8_t.
 * testsuite/20_util/from_chars/1_neg.cc: Validate std::from_chars
 with char8_t.
 * testsuite/20_util/hash/requirements/explicit_instantiation.cc:
 Validate explicit instantiation of std::hash.
 * testsuite/20_util/is_integral/value.cc: Validate
 std::is_integral.
 * testsuite/20_util/make_signed/requirements/typedefs-4.cc:
 Validate std::make_signed.
 * testsuite/21_strings/basic_string/cons/char/deduction.cc:
 Validate u8string construction from char8_t sources.
 * testsuite/21_strings/basic_string_view/operations/compare/
 char/70483.cc: Validate substr operations on u8string_view.
 * testsuite/21_strings/basic_string_view/typedefs.cc: Validate that
 the u8string_view typedef is defined.
 * testsuite/21_strings/char_traits/requirements/
 constexpr_functions.cc: Validate char_traits constexpr
 member functions.
 * testsuite/21_strings/char_traits/requirements/
 constexpr_functions_c++17.cc: Validate char_traits C++17
 constexpr member functions.
 * testsuite/21_strings/headers/string/types_std_c++0x.cc: Validate
 that the u8string typedef is defined.
 * testsuite/22_locale/locale/cons/unicode.cc: Validate the presence
 of the std::codecvt and
 std::codecvt facets.
 * testsuite/29_atomics/atomic/cons/assign_neg.cc: Update line
 numbers.
 * testsuite/29_atomics/atomic/cons/copy_neg.cc: Likewise.
 * testsuite/29_atomics/atomic_integral/cons/assign_neg.cc:
 Likewise.
 * testsuite/29_atomics/atomic_integral/cons/copy_neg.cc: Likewise.
 * testsuite/29_atomics/atomic_integral/is_always_lock_free.cc:
 Validate std::atomic::is_always_lock_free.
 * testsuite/29_atomics/atomic_integral/operators/bitwise_neg.cc:
 Update line numbers.
 * testsuite/29_atomics/atomic_integral/operators/decrement_neg.cc:
 Likewise.
 * testsuite/29_atomics/atomic_integral/operators/increment_neg.cc:
 Likewise.
 * testsuite/29_atomics/headers/atomic/macros.cc: Validate
 ATOMIC_CHAR8_T_LOCK_FREE and add a missing error message for
 ATOMIC_CHAR16_T_LOCK_FREE.
 * testsuite/29_atomics/headers/atomic/types_std_c++0x.cc: Validate
 std::atomic_char8_t.
 * testsuite/29_atomics/headers/atomic/types_std_c++0x_neg.cc:
 Validate atomic_char8_t.
 * testsuite/experimental/string_view/typedefs.cc: Validate that
 the u8string_view typedef is defined.
 * testsuite/util/testsuite_common_types.h (integral_types,
 integral_types_gnu, atomic_integrals_no_bool, atomic_integrals):
 Add char8_t to the typelist chains of integral types.

Tom.
diff --git a/libstdc++-v3/testsuite/18_support/byte/ops.cc b/libstdc++-v3/testsuite/18_support/byte/ops.cc
index 6f2755eb0a5..dfbaa8b2efa 100644
--- a/libstdc++-v3/testsuite/18_support/byte/ops.cc
+++ b/libstdc++-v3/testsuite/18_support/byte/ops.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-options "-std=gnu++17" }
+// { dg-options "-std=gnu++17 -fchar8_t" }
 // { dg-do compile { target c++17 } }
 
 #include 
@@ -218,7 +218,13 @@ constexpr bool test_to_integer(unsigned char c)
 
 static_assert( test_to_integer(0) );
 static_assert( test_to_integer(255) );
+static_assert( test_to_integer(0) );
 static_assert( test_to_integer(255) );
 static_assert( test_to_integer(0) );
 static_assert( test_to_integer(255) );
-
+static_assert( test_to_integer(0) );
+static_assert( test_to_integer(255) );
+static_assert( test_to_integer(0) );
+static_assert( test_to_integer(255) );
+static_assert( test_to_integer(0) );
+static_assert( test_to_integer(255) );
diff --git a/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc b/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc
index 150db958807..f72b265dc77 100644
--- a/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc
+++ b/libstdc++-v3/testsuite/18_support/numeric_limits/dr559.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++11 } }
+// { dg-options "-fchar8_t" }
 
 // 2010-02-17  Paolo Carlini  
 //
@@ -84,6 +85,9 @@ int main()
   do_test();
   do_test();
   do_test();
+#ifdef _GLIBC

[PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests

2018-11-05 Thread Tom Honermann
This patch adds new tests for char8_t standard library features.  Most 
of these tests were cloned from existing tests that exercise char16_t 
and adapted for char8_t.  Only testsuite/experimental/feat-char8_t.cc 
and testsuite/ext/char8_t/atomic-1.cc are net new tests.
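
As a hedged sketch of the properties a numeric_limits test for char8_t would be expected to check (this is not the test from the patch; u8_unit is a name introduced here), the following compiles with or without char8_t support:

```cpp
#include <limits>
#include <type_traits>

// Code unit type of u8 literals in this build: char8_t or plain char.
using u8_unit = std::decay<decltype(u8"x"[0])>::type;

static_assert(std::numeric_limits<u8_unit>::is_specialized,
              "numeric_limits should be specialized for the code unit type");
static_assert(std::numeric_limits<u8_unit>::is_integer,
              "the code unit type is an integer type");

#if defined(__cpp_char8_t) && defined(__cpp_lib_char8_t)
// char8_t is unsigned and has the same range as unsigned char.
static_assert(!std::numeric_limits<char8_t>::is_signed, "char8_t is unsigned");
static_assert(std::numeric_limits<char8_t>::min() == 0, "minimum is zero");
static_assert(std::numeric_limits<char8_t>::max()
                  == std::numeric_limits<unsigned char>::max(),
              "same maximum as unsigned char");
static_assert(std::numeric_limits<char8_t>::digits
                  == std::numeric_limits<unsigned char>::digits,
              "same number of value bits as unsigned char");
#endif
```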


libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann  

 * testsuite/18_support/numeric_limits/char8_t.cc: New test cloned
 from char16_32_t.cc; validates numeric_limits.
 * testsuite/21_strings/basic_string/literals/types-char8_t.cc: New
 test cloned from types.cc; validates operator""s for char8_t
 returns u8string.
 * testsuite/21_strings/basic_string/literals/values-char8_t.cc: New
 test cloned from values.cc; validates construction and comparison
 of u8string values.
 * testsuite/21_strings/basic_string/requirements/
 explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 basic_string.
 * testsuite/21_strings/basic_string_view/literals/types-char8_t.cc:
 New test cloned from types.cc; validates operator""sv for char8_t
 returns u8string_view.
 * testsuite/21_strings/basic_string_view/literals/
 values-char8_t.cc: New test cloned from values.cc; validates
 construction and comparison of u8string_view values.
 * testsuite/21_strings/basic_string_view/requirements/
 explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 basic_string_view.
 * testsuite/21_strings/char_traits/requirements/char8_t/65049.cc:
 New test cloned from char16_t/65049.cc; validates that
 char_traits is not vulnerable to the concerns in PR65049.
 * testsuite/21_strings/char_traits/requirements/char8_t/
 typedefs.cc: New test cloned from char16_t/typedefs.cc; validates
 that char_traits member typedefs are present and correct.
 * testsuite/21_strings/char_traits/requirements/
 explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 char_traits.
 * testsuite/22_locale/codecvt/char16_t-char8_t.cc: New test cloned
 from char16_t.cc; validates
 codecvt.
 * testsuite/22_locale/codecvt/char32_t-char8_t.cc: New test cloned
 from char32_t.cc; validates
 codecvt.
 * testsuite/22_locale/codecvt/utf8-char8_t.cc: New test cloned from
 utf8.cc; validates codecvt and
 codecvt.
 * testsuite/27_io/filesystem/path/native/string-char8_t.cc: New
 test cloned from string.cc; validates filesystem::path construction
 from char8_t input.
 * testsuite/experimental/feat-char8_t.cc: New test; validates that
 the __cpp_lib_char8_t feature test macro is defined with the
 correct value.
 * testsuite/experimental/filesystem/path/native/string-char8_t.cc:
 New test cloned from string.cc; validates filesystem::path
 construction from char8_t input.
 * testsuite/experimental/string_view/literals/types-char8_t.cc: New
 test cloned from types.cc; validates operator""sv for char8_t
 returns u8string_view.
 * testsuite/experimental/string_view/literals/values-char8_t.cc:
 New test cloned from values.cc; validates construction and
 comparison of u8string_view values.
 * testsuite/experimental/string_view/requirements/
 explicit_instantiation/char8_t/1.cc: New test cloned from
 char16_t/1.cc; validates explicit instantiation of
 basic_string_view.
 * testsuite/ext/char8_t/atomic-1.cc: New test; validates that
 ATOMIC_CHAR8_T_LOCK_FREE is not defined if char8_t support is not
 enabled.

Tom.
diff --git a/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc
new file mode 100644
index 000..346463d7244
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/numeric_limits/char8_t.cc
@@ -0,0 +1,71 @@
+// { dg-do run { target c++11 } }
+// { dg-require-cstdint "" }
+// { dg-options "-fchar8_t" }
+
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include 
+#include 
+#include 
+
+// Test specializations for char8_t.
+template
+  void
+  do_test()
+  {
+   

[PATCH 9/9]: C++ P0482R5 char8_t: Updates to gdb pretty printing support

2018-11-05 Thread Tom Honermann
This patch adds recognition of the u8string and u8string_view type 
aliases to the gdb pretty printer extension.


libstdc++-v3/ChangeLog:

2018-11-04  Tom Honermann  

 * python/libstdcxx/v6/printers.py (register_type_printers): Add
 type printers for u8string and u8string_view.
 * testsuite/libstdc++-prettyprinters/whatis.cc: Validate
 recognition of u8string.

Tom.
diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 827c87b70ea..f9e638e210d 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1554,7 +1554,7 @@ def register_type_printers(obj):
 return
 
 # Add type printers for typedefs std::string, std::wstring etc.
-for ch in ('', 'w', 'u16', 'u32'):
+for ch in ('', 'w', 'u8', 'u16', 'u32'):
 add_one_type_printer(obj, 'basic_string', ch + 'string')
 add_one_type_printer(obj, '__cxx11::basic_string', ch + 'string')
 # Typedefs for __cxx11::basic_string used to be in namespace __cxx11:
@@ -1604,7 +1604,7 @@ def register_type_printers(obj):
 
 # Add type printers for experimental::basic_string_view typedefs.
 ns = 'experimental::fundamentals_v1::'
-for ch in ('', 'w', 'u16', 'u32'):
+for ch in ('', 'w', 'u8', 'u16', 'u32'):
 add_one_type_printer(obj, ns + 'basic_string_view',
  ns + ch + 'string_view')
 
diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc
index 90f3994314b..d74bf7c5e9b 100644
--- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc
+++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc
@@ -1,5 +1,5 @@
 // { dg-do run { target c++11 } }
-// { dg-options "-g -O0" }
+// { dg-options "-g -O0 -fchar8_t" }
 // { dg-skip-if "" { *-*-* } { "-D_GLIBCXX_PROFILE" } }
 
 // Copyright (C) 2011-2018 Free Software Foundation, Inc.
@@ -130,6 +130,9 @@ holder cregex_token_iterator_holder;
 std::sregex_token_iterator *sregex_token_iterator_ptr;
 holder sregex_token_iterator_holder;
 // { dg-final { whatis-test sregex_token_iterator_holder "holder" } }
+std::u8string *u8string_ptr;
+holder u8string_holder;
+// { dg-final { whatis-test u8string_holder "holder" } }
 std::u16string *u16string_ptr;
 holder u16string_holder;
 // { dg-final { whatis-test u16string_holder "holder" } }
@@ -240,6 +243,8 @@ main()
   placeholder(&cregex_token_iterator_holder);
   placeholder(&sregex_token_iterator_ptr);
   placeholder(&sregex_token_iterator_holder);
+  placeholder(&u8string_ptr);
+  placeholder(&u8string_holder);
   placeholder(&u16string_ptr);
   placeholder(&u16string_holder);
   placeholder(&u32string_ptr);


Re: [REVISED PATCH 2/9]: C++ P0482R5 char8_t: Core language support

2019-01-14 Thread Tom Honermann

On 1/14/19 2:58 PM, Jason Merrill wrote:

On 12/23/18 9:27 PM, Tom Honermann wrote:
Attached is a revised patch that addresses changes in P0482R6 as well 
as feedback provided by Jason. Changes from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811
   per P0482R6.
- Enable char8_t support with -std=c++2a per adoption of P0482R6 in
   San Diego.
- Reverted the unnecessary changes to gcc/gcc/c/c-typeck.c as requested
   by Jason.
- Removed unnecessary checks of 'flag_char8_t' within the C++ front
   end as requested by Jason.
- Corrected the regression spotted by Jason regarding initialization of
   signed char and unsigned char arrays with string literals.
- Made minor changes to the error message emitted for ill-formed
   initialization of char arrays with UTF-8 string literals. These
   changes do not yet implement Jason's suggestion; I'll follow up
   with a separate patch for that due to additional test impact.

Tested on x86_64-linux.
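
A minimal sketch of how code might react to the items in the list above (illustrative only; char8_t_macro_value and the array names are invented here). It reads the feature test macro and shows that ordinary string literals still initialize arrays of all three narrow character types:

```cpp
// Value of the core-language feature test macro, or 0 when char8_t
// support is not enabled.
#ifdef __cpp_char8_t
constexpr long char8_t_macro_value = __cpp_char8_t;
#else
constexpr long char8_t_macro_value = 0;
#endif

// True when the compiler reports at least the P0482R6 value.
constexpr bool has_p0482r6_char8_t = char8_t_macro_value >= 201811L;

// Ordinary string literals may initialize arrays of char, signed char,
// and unsigned char, regardless of whether char8_t is enabled.
const char          plain_array[]    = "xx";
const signed char   signed_array[]   = "xx";
const unsigned char unsigned_array[] = "xx";

static_assert(sizeof(plain_array) == 3 && sizeof(signed_array) == 3
                  && sizeof(unsigned_array) == 3,
              "two characters plus the terminating null");
```

Later standards may report a larger macro value, which is why the check uses >= rather than ==.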


I just applied the compiler changes with small modifications, as 
follows; thank you very much for the patches.  Jonathan should check 
in the library portion before long.


Excellent, thank you, Jason!

Tom.



Jason





Re: PATCH: Updated error messages for ill-formed cases of array initialization by string literal

2019-01-14 Thread Tom Honermann

On 1/4/19 7:25 PM, Martin Sebor wrote:

On 12/27/18 1:49 PM, Tom Honermann wrote:
As requested by Jason in the review of the P0482 (char8_t) core 
language changes, this patch includes updates to the error messages 
emitted for ill-formed cases of array initialization with a string 
literal.  With these changes, error messages that previously looked 
something like these:


- "char-array initialized from wide string"
- "wide character array initialized from non-wide string"
- "wide character array initialized from incompatible wide string"

now look like:

- "cannot initialize array of type 'char' from a string literal with 
type array of 'short unsigned int'"


The first word "type" doesn't quite work here.  The type of every
array is "array of T" where T is the type of the element, so for
instance, "array of char."  Saying "array of type X" makes it sound
like X is the type of the whole array, which is of course not
the case when X is char.  I think you want to use the same wording
as for the second type:

  "cannot initialize array of 'char' from a string literal with
  type array of 'short unsigned int'"

or perhaps even better

  "cannot initialize array of 'char' from a string literal with
  type 'char16_t[N]'"

(i.e., show the actual type of the string, including its bound).
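
To make the "actual type, including its bound" concrete, here is a short sketch (not part of the patch; literal_bound is a hypothetical helper) showing that the type of a string literal already carries the bound N:

```cpp
#include <cstddef>
#include <type_traits>

// The type of u"ab" is "array of 3 const char16_t": two characters plus
// the terminating null.  decltype adds a reference because a string
// literal is an lvalue.
static_assert(std::is_same<decltype(u"ab"), const char16_t (&)[3]>::value,
              "u\"ab\" should be an lvalue of type const char16_t[3]");
static_assert(std::is_same<std::remove_reference<decltype(u"ab")>::type,
                           const char16_t[3]>::value,
              "without the reference the type is const char16_t[3]");

// Recover the bound N of any string literal.
template <typename T, std::size_t N>
constexpr std::size_t literal_bound(const T (&)[N]) { return N; }

static_assert(literal_bound(u"ab") == 3, "bound counts the null");
```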


Thank you for the feedback, Martin; sorry for the delayed response.  
I'll follow up with a revised patch within the next week or two.


Tom.



Martin





Re: [REVISED PATCH 1/9]: C++ P0482R5 char8_t: Documentation updates

2019-01-14 Thread Tom Honermann

On 1/4/19 7:40 PM, Martin Sebor wrote:

On 12/23/18 7:27 PM, Tom Honermann wrote:
Attached is a revised patch that addresses feedback provided by Jason 
and Sandra.  Changes from the prior patch include:

- Updates to the -fchar8_t option documentation as requested by Jason.
- Corrections for indentation, spacing, hyphenation, and wrapping as
   requested by Sandra.



Just a minor nit that backticks in code examples should be avoided
(per the TexInfo manual, they can cause trouble when copying code
from PDF readers):

+@smallexample
+char ca[] = u8"xx"; // error: char-array initialized from wide
+    //    string
+const char *cp = u8"xx";// error: invalid conversion from
+    //    `const char8_t*' to `const char*'


Thanks for catching that, Martin.  Patch relative to trunk (r267930) 
attached to correct this (Jason already committed the original change).
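
For code that must build both with and without char8_t, one possible way to avoid the conversion errors quoted in that documentation excerpt is an overloaded helper. This is only a sketch under that assumption; as_char_ptr is a made-up name and not something GCC or libstdc++ provides.

```cpp
#include <cassert>
#include <cstring>

// Without char8_t support, u8 literals already decay to const char*.
inline const char* as_char_ptr(const char* s) { return s; }

#ifdef __cpp_char8_t
// With char8_t support, reinterpret the code units as chars.  Reading
// any object through a char pointer is permitted by the aliasing rules.
inline const char* as_char_ptr(const char8_t* s) {
    return reinterpret_cast<const char*>(s);
}
#endif

// Small runtime check of the helper.
inline void check_as_char_ptr() {
    assert(std::strcmp(as_char_ptr(u8"xx"), "xx") == 0);
}
```

A call such as f(as_char_ptr(u8"xx")) then compiles in either mode for a function declared as int f(const char*).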


Tom.



Martin



Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 267930)
+++ gcc/doc/invoke.texi	(working copy)
@@ -2468,16 +2468,16 @@
 char ca[] = u8"xx"; // error: char-array initialized from wide
 //string
 const char *cp = u8"xx";// error: invalid conversion from
-//`const char8_t*' to `const char*'
+//'const char8_t*' to 'const char*'
 int f(const char*);
 auto v = f(u8"xx"); // error: invalid conversion from
-//`const char8_t*' to `const char*'
+//'const char8_t*' to 'const char*'
 std::string s@{u8"xx"@};  // error: no matching function for call to
-//`std::basic_string::basic_string()'
+//'std::basic_string::basic_string()'
 using namespace std::literals;
 s = u8"xx"s;// error: conversion from
-//`basic_string' to non-scalar
-//type `basic_string' requested
+//'basic_string' to non-scalar
+//type 'basic_string' requested
 @end smallexample
 
 @item -fcheck-new


Re: [REVISED PATCH 2/9]: C++ P0482R5 char8_t: Core language support

2019-01-15 Thread Tom Honermann

On 1/15/19 1:51 AM, Christophe Lyon wrote:

On Mon, 14 Jan 2019 at 20:59, Jason Merrill wrote:

On 12/23/18 9:27 PM, Tom Honermann wrote:

Attached is a revised patch that addresses changes in P0482R6 as well as
feedback provided by Jason.  Changes from the prior patch include:
- Updated the value of the __cpp_char8_t feature test macro to 201811
per P0482R6.
- Enable char8_t support with -std=c++2a per adoption of P0482R6 in
San Diego.
- Reverted the unnecessary changes to gcc/gcc/c/c-typeck.c as requested
by Jason.
- Removed unnecessary checks of 'flag_char8_t' within the C++ front
end as requested by Jason.
- Corrected the regression spotted by Jason regarding initialization of
signed char and unsigned char arrays with string literals.
- Made minor changes to the error message emitted for ill-formed
initialization of char arrays with UTF-8 string literals.  These
changes do not yet implement Jason's suggestion; I'll follow up with a
separate patch for that due to additional test impact.

Tested on x86_64-linux.

I just applied the compiler changes with small modifications, as
follows; thank you very much for the patches.  Jonathan should check in
the library portion before long.

Jason

Hi,

The new testcase g++.dg/ext/utf-cvt-char8_t.C fails at least on arm and aarch64:

g++.dg/ext/utf-cvt-char8_t.C  -std=gnu++14  (test for warnings, line 24)
g++.dg/ext/utf-cvt-char8_t.C  -std=gnu++17  (test for warnings, line 24)


Arm and aarch64 have unsigned char by default, so the warning 
("conversion to 'char' from 'char8_t' may change the sign of the 
result") isn't emitted on those platforms.  I presume adding 
'-fsigned-char' to the options for the test would be a sufficient fix?  
If so, a patch is attached.
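
The platform difference described above can be observed directly. The following is a sketch, not part of the attached patch, and the variable names are invented for illustration:

```cpp
#include <climits>
#include <type_traits>

// Whether plain char is signed on this target (or under the current
// -fsigned-char / -funsigned-char setting).
constexpr bool plain_char_is_signed = CHAR_MIN < 0;
static_assert(plain_char_is_signed == std::is_signed<char>::value,
              "CHAR_MIN and is_signed should agree");

// 0xC3 is a typical UTF-8 lead byte.  Stored in a signed char it becomes
// negative on the usual two's-complement targets; stored in an unsigned
// char it stays 195.  Only the signed case can "change the sign".
constexpr int lead_byte_as_char = static_cast<char>(0xC3);
static_assert((lead_byte_as_char < 0) == plain_char_is_signed,
              "the sign changes only when plain char is signed");

#ifdef __cpp_char8_t
static_assert(std::is_unsigned<char8_t>::value, "char8_t is always unsigned");
#endif
```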


Tom.



Christophe



Index: gcc/testsuite/g++.dg/ext/utf-cvt-char8_t.C
===
--- gcc/testsuite/g++.dg/ext/utf-cvt-char8_t.C	(revision 267930)
+++ gcc/testsuite/g++.dg/ext/utf-cvt-char8_t.C	(working copy)
@@ -1,7 +1,7 @@
 /* Contributed by Kris Van Hees  */
 /* Test the char8_t promotion rules. */
 /* { dg-do compile { target c++11 } } */
-/* { dg-options "-fchar8_t -Wall -Wconversion -Wsign-conversion -Wsign-promo" } */
+/* { dg-options "-fchar8_t -fsigned-char -Wall -Wconversion -Wsign-conversion -Wsign-promo" } */
 
 extern void f_c (char);
 extern void fsc (signed char);


Re: [REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support

2019-02-07 Thread Tom Honermann

On 2/7/19 4:44 AM, Jonathan Wakely wrote:

On 23/12/18 21:27 -0500, Tom Honermann wrote:
Attached is a revised patch that addresses changes in P0482R6.  
Changes from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811.

Tested on x86_64-linux.


Thanks, Tom, this is great work!

The front-end changes for char8_t went in recently, and I'm finally
ready to commit the library parts.

Great!

There's one big problem I found in
this patch, which is that the new numeric_limits<char8_t>
specialization uses constexpr unconditionally. That fails if <limits>
is compiled using options like -std=c++98 -fno-char8_t because the
specialization will be used, but the constexpr keyword isn't allowed.
That's easily fixed by replacing the keyword with _GLIBCXX_CONSTEXPR.

Hmm, the code for the char8_t specialization was copied from the 
char16_t specialization which also uses constexpr unconditionally (but 
is guarded by a C++11+ requirement).  The char8_t specialization must be 
elided when the compiler is invoked with -std=c++98 -fno-char8_t (since 
the char8_t type doesn't exist then).  Doesn't the _GLIBCXX_USE_CHAR8_T 
guard suffice for this?  _GLIBCXX_USE_CHAR8_T should only be defined if 
__cpp_char8_t is defined, and that should only be defined if -fchar8_t 
or -std=c++2a is specified.  Or perhaps you intended -std=c++98 
-fchar8_t?  I agree that in that case use of _GLIBCXX_CONSTEXPR is 
necessary.


The other way to solve that problem would be for the compiler to give
an error if -fchar8_t is used with C++98, but I see no fundamental
reason that combination of options shouldn't be allowed. We can
support it in the library by using the macro.

Agreed.
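
To illustrate the macro approach discussed above, a minimal sketch follows. SKETCH_CONSTEXPR and char8_limits_sketch are hypothetical stand-ins for _GLIBCXX_CONSTEXPR and the numeric_limits specialization; this is not the library's actual code, and it uses unsigned char so that it compiles whether or not char8_t exists.

```cpp
// Hypothetical stand-in for the library's _GLIBCXX_CONSTEXPR: expands to
// constexpr only where the keyword exists (C++11 and later), and to nothing
// in C++98 mode.
#if __cplusplus >= 201103L
# define SKETCH_CONSTEXPR constexpr
#else
# define SKETCH_CONSTEXPR
#endif

// Hypothetical traits class in the style of a numeric_limits specialization
// for an unsigned 8-bit character type.
struct char8_limits_sketch
{
  static SKETCH_CONSTEXPR unsigned char min() { return 0; }
  static SKETCH_CONSTEXPR unsigned char max() { return 255; }
};
```

With this shape the same declarations are accepted with -std=c++98 and remain usable in constant expressions from C++11 onwards.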


As discussed in San Diego, the other change needed is to add the
abi_tag attribute to the new versions of path::u8string and
path::generic_u8string, so that the mangling is different when its
return type is different:

#ifdef _GLIBCXX_USE_CHAR8_T
   __attribute__((__abi_tag__("__u8")))
   std::u8string  u8string() const;
#else
   std::string    u8string() const;
#endif // _GLIBCXX_USE_CHAR8_T

Otherwise we get ODR violations when linking objects compiled
with -fchar8_t enabled to objects with it disabled (e.g. linking
-std=c++17 objects to -std=c++2a objects, which needs to work).
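
A self-contained sketch of the idea, assuming a GNU-compatible compiler for the attribute. path_sketch, SKETCH_USE_CHAR8_T and SKETCH_ABI_TAG_U8 are invented names standing in for std::filesystem::path and the library's internal macros; the member bodies are placeholders.

```cpp
#include <string>

// Hypothetical stand-in for the library's _GLIBCXX_USE_CHAR8_T guard.
#ifdef __cpp_char8_t
# define SKETCH_USE_CHAR8_T 1
#endif

// The abi_tag attribute is a GNU extension; expand to nothing elsewhere.
#if defined(__GNUC__)
# define SKETCH_ABI_TAG_U8 __attribute__((__abi_tag__("__u8")))
#else
# define SKETCH_ABI_TAG_U8
#endif

struct path_sketch
{
#ifdef SKETCH_USE_CHAR8_T
  // The tag becomes part of the mangled name, so this version and the
  // std::string-returning version below get distinct symbols.
  SKETCH_ABI_TAG_U8
  std::u8string u8string() const { return u8"p"; }
#else
  std::string u8string() const { return u8"p"; }
#endif
};
```

Comparing the symbol for path_sketch::u8string in objects built with and without -fchar8_t (for example with nm) should show whether the tag is present.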


Are ODR violations bad? :)



I suggest "__u8" as the name of the ABI tag, but I'm open to other
suggestions. "__char8_t" is a bit long and verbose. "__cxx20" would be
consistent with "__cxx11" used for the new ABI introduced in GCC 5 but
it regularly confuses people who think it is coupled to the -std=c++11
option (and so don't understand why they still see it for -std=c++14).

I have no preference or alternative suggestions here.  Had I recognized 
the issue, I would have asked you what to do about it :)


Also, I see that you've made changes to <experimental/string_view> (to
add the experimental::u8string_view typedef) and to
std::experimental::path (to change the return type of u8string and
generic_u8string).

The former change is fairly harmless; it only adds a typedef, albeit
one which is not a reserved name in C++14/C++17 and so should be
available for users to define as a macro. Maybe prior to C++2a we
should only define it when GNU extensions are enabled (i.e. when using
-std=gnu++14 not -std=c++14):

#if defined _GLIBCXX_USE_CHAR8_T \
 && (__cplusplus > 201703L || !defined __STRICT_ANSI__)
 using u8string_view = basic_string_view<char8_t>;
#endif

That makes sense.


Changing the return type of experimental::path members concerns me
more. That's a published TS which is not going to be revised, and it's
not obvious to me that users would want the change in semantics. If
somebody is still using the Filesystem TS in C++2a code, they're
probably not expecting it to change. If they need to update their code
for C++2a they might as well just use std::filesystem, and so having
char8_t support in std::experimental::filesystem isn't clearly useful.

I agree.  I added the support to the experimental implementations more 
out of a desire to be complete and to remove any potential barriers to 
use of -fchar8_t than because I felt the changes were really necessary.  
I would be perfectly fine with skipping the updates to the experimental 
libraries completely.


Tom.



Re: [REVISED PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests

2019-02-07 Thread Tom Honermann

On 2/7/19 4:54 AM, Jonathan Wakely wrote:

On 23/12/18 21:27 -0500, Tom Honermann wrote:
Attached is a revised patch that addresses changes in P0482R6.  
Changes from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811.

Tested on x86_64-linux.


There are quite a few additional changes needed to make the testsuite
pass cleanly with non-default options, e.g. when running it with
RUNTESTFLAGS=--target_board=unix/-fchar8_t/-fno-inline I see these
failures:

I remember thinking that I had to deal with this at one point.  It seems 
I then forgot about it.


FAIL: 21_strings/basic_string/literals/types.cc (test for excess errors)
FAIL: 21_strings/basic_string/literals/values.cc (test for excess errors)
UNRESOLVED: 21_strings/basic_string/literals/values.cc compilation 
failed to produce executable
FAIL: 21_strings/basic_string_view/literals/types.cc (test for excess 
errors)
FAIL: 21_strings/basic_string_view/literals/values.cc (test for excess 
errors)
UNRESOLVED: 21_strings/basic_string_view/literals/values.cc 
compilation failed to produce executable

FAIL: 22_locale/codecvt/char16_t.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/char16_t.cc compilation failed to 
produce executable

FAIL: 22_locale/codecvt/char32_t.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/char32_t.cc compilation failed to 
produce executable

FAIL: 22_locale/codecvt/codecvt_utf8/79980.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/codecvt_utf8/79980.cc compilation failed 
to produce executable
FAIL: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc (test for excess 
errors)
UNRESOLVED: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc compilation 
failed to produce executable

FAIL: 22_locale/codecvt/utf8.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/utf8.cc compilation failed to produce 
executable

FAIL: 22_locale/conversions/string/2.cc (test for excess errors)
UNRESOLVED: 22_locale/conversions/string/2.cc compilation failed to 
produce executable

FAIL: 22_locale/conversions/string/3.cc (test for excess errors)
UNRESOLVED: 22_locale/conversions/string/3.cc compilation failed to 
produce executable

FAIL: experimental/string_view/literals/types.cc (test for excess errors)
FAIL: experimental/string_view/literals/values.cc (test for excess 
errors)
UNRESOLVED: experimental/string_view/literals/values.cc compilation 
failed to produce executable


There would be similar errors running all the tests with -std=c++2a,
which is definitely something I do often and so want the tests to be
clean.

Absolutely, agreed.

We can either disable those tests when char8_t is enabled
(because we already have alternative tests checking the char8_t
versions of string_view etc.) or make them work either way, which the
attached patch begins doing (more changes are needed).

Since most of these tests exercise functionality that is not u8/char8_t 
specific, I think we should make them work.


I expect a different set of failures for -fno-char8_t (which is
probably a less important case to support than enabling char8_t in
older standards, but maybe still worth testing now and then).

I'm not sure it is less important.  -fno-char8_t may be an important 
tool for some code bases during their initial testing of, and migration 
to, C++20.
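
As an illustration of the kind of migration aid a code base might use while switching between the two modes, here is a sketch under stated assumptions. as_char_ptr and check_ascii_literals_match are hypothetical helpers, not part of any patch in this series. The idea is that call sites passing u8 string literals compile unchanged whether the literal is an array of char (-fno-char8_t) or an array of char8_t (-fchar8_t or C++20).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: view string data as const char*, whatever the
// literal's element type is in the current mode.
inline const char* as_char_ptr(const char* s) { return s; }

#ifdef __cpp_char8_t
// Only present when char8_t is a distinct builtin type.  Reading char8_t
// storage through const char* is permitted by the aliasing rules.
inline const char* as_char_ptr(const char8_t* s)
{
  return reinterpret_cast<const char*>(s);
}
#endif

// Small self-check: for ASCII text, ordinary and UTF-8 literals hold the
// same bytes (assuming an ASCII-compatible execution character set).
inline void check_ascii_literals_match()
{
  assert(std::strcmp(as_char_ptr(u8"abc"), as_char_ptr("abc")) == 0);
}
```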


Tom.



[PATCH 0/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics

2022-07-23 Thread Tom Honermann via Gcc-patches
This change addresses the following issue raised on the libc-alpha mailing list:
  https://sourceware.org/pipermail/libc-alpha/2022-July/140825.html
Glibc 2.36 adds a char8_t typedef in C++ modes that do not enable the char8_t
builtin type (C++17 and earlier by default; subject to _GNU_SOURCE and use of
the -f[no-]char8_t option).  When -Wc++20-compat diagnostics are enabled, the
following warning is issued from the glibc uchar.h header.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]
Such diagnostics are not desired from system headers, so glibc would like to
suppress the diagnostic using '#pragma GCC diagnostic ignored "-Wc++20-compat"',
but attempting to do so currently fails.  This patch corrects that.
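
For reference, the suppression pattern in question looks roughly like the sketch below. This is an assumed shape of such a header, not the actual glibc uchar.h text; the __cpp_char8_t guard keeps the typedef out of modes where char8_t is already a builtin type.

```cpp
// Sketch of a header that provides a char8_t typedef for modes where it is
// not a builtin type.  In C++17 and earlier with -Wc++20-compat, the
// identifier char8_t is what triggers the keyword warning.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
#ifndef __cpp_char8_t
typedef unsigned char char8_t;
#endif
#pragma GCC diagnostic pop
```

With a compiler that includes the fix, building this with -std=c++17 -Wc++20-compat should produce no warning for the typedef.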

Tom Honermann (1):
  c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

 gcc/c-family/c-opts.cc |  7 +++
 gcc/c-family/c.opt |  2 +-
 gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
 libcpp/include/cpplib.h|  4 
 libcpp/init.cc |  1 +
 6 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

-- 
2.32.0



[PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-23 Thread Tom Honermann via Gcc-patches
Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
 gcc/c-family/c-opts.cc |  7 +++
 gcc/c-family/c.opt |  2 +-
 gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
 libcpp/include/cpplib.h|  4 
 libcpp/init.cc |  1 +
 6 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
 warn_narrowing = 0;
 
+  if (cxx_dialect >= cxx20)
+{
+  /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+  warn_cxx20_compat = 0;
+  cpp_opts->cpp_warn_cxx20_compat = 0;
+}
+
   /* C++17 has stricter evaluation order requirements; let's use some of them
  for earlier C++ as well, so chaining works as expected.  */
   if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
 C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
 
 Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
 
 Wc++11-extensions
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+// Validate suppression of -Wc++11-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++11-compat"
+int alignof;
+int alignas;
+int constexpr;
+int decltype;
+int noexcept;
+int nullptr;
+int static_assert;
+int thread_local;
+int _Alignas;
+int _Alignof;
+int _Thread_local;
diff --git a/gcc/testsuite/g++.dg/cpp2a/keywords2.C b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
new file mode 100644
index 000..8714a7b26b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++17_down } }
+// { dg-options "-Wc++20-compat" }
+
+// Validate suppression of -Wc++20-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++20-compat"
+int constinit;
+int consteval;
+int requires;
+int concept;
+int co_await;
+int co_yield;
+int co_return;
+int char8_t;
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..9d90c18e4f2 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -547,6 +547,9 @@ struct cpp_options
   /* True if warn about differences between C++98 and C++11.  */
   bool cpp_warn_cxx11_compat;
 
+  /* True if warn about differences between C++17 and C++20.  */
+  bool cpp_warn_cxx20_compat;
+
   /* Nonzero if bidirectional control characters checking is on.  See enum
  cpp_bidirectional_level.  */
   unsigned char cpp_warn_bidirectional;
@@ -655,6 +658,7 @@ enum cpp_warning_reason {
   CPP_W_C90_C99_COMPAT,
   CPP_W_C11_C2X_COMPAT,
   CPP_W_CXX11_COMPAT,
+  

[PATCH 0/3] Implement C2X N2653 (char8_t) and correct UTF-8 character literal type in preprocessor directives for C++

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch series provides an implementation and tests for the WG14 N2653
paper as adopted for C2X.

Additionally, a fix is included for the C++ preprocessor to treat UTF-8
character literals in preprocessor directives as an unsigned type in char8_t
enabled modes (in C++17 and earlier with -fchar8_t or in C++20 or later
without -fno-char8_t).
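
A minimal probe of the behaviour described above, assuming C++17 or later for the interesting case (u8 character literals are not available before that). The macro and variable names are made up for illustration. In #if arithmetic an unsigned operand makes the subtraction wrap to a large positive value, so the comparison distinguishes the two cases.

```cpp
// u8 character literals need C++17 or later; earlier modes report -1.
#if __cplusplus >= 201703L
# if u8'\0' - 1 > 0
#  define U8_CHAR_PP_UNSIGNED 1    /* literal is unsigned in #if arithmetic */
# else
#  define U8_CHAR_PP_UNSIGNED 0    /* literal is signed in #if arithmetic */
# endif
#else
# define U8_CHAR_PP_UNSIGNED (-1)  /* not applicable in this mode */
#endif

// Expose the preprocessor's answer to ordinary code.
const int u8_char_pp_unsigned = U8_CHAR_PP_UNSIGNED;
```

With the fix applied, one would expect 1 in char8_t enabled modes, and otherwise whatever plain char gives for the target and the -fsigned-char or -funsigned-char options.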

Tom Honermann (3):
  C: Implement C2X N2653 char8_t and UTF-8 string literal changes
  testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal
changes
  c++/106426: Treat u8 character literals as unsigned in char8_t modes.

 gcc/c-family/c-lex.cc | 13 --
 gcc/c-family/c-opts.cc|  5 ++-
 gcc/c/c-parser.cc | 16 ++-
 gcc/c/c-typeck.cc |  2 +-
 gcc/ginclude/stdatomic.h  |  8 
 .../g++.dg/ext/char8_t-char-literal-1.C   |  6 ++-
 .../g++.dg/ext/char8_t-char-literal-2.C   |  4 ++
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c2x-predefined-macros.c  | 11 +
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 .../gcc.dg/gnu2x-predefined-macros.c  |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 libcpp/charset.cc |  4 +-
 libcpp/include/cpplib.h   |  4 +-
 libcpp/init.cc|  1 +
 18 files changed, 191 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

-- 
2.32.0



[PATCH 3/3] c++/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).

Fixes https://gcc.gnu.org/PR106426.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.

gcc/testsuite/ChangeLog:
* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.

libcpp/ChangeLog:
* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
literals based on unsigned_utf8char.
* include/cpplib.h (cpp_options): Add unsigned_utf8char.
* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
---
 gcc/c-family/c-opts.cc| 1 +
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +-
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 
 libcpp/charset.cc | 4 ++--
 libcpp/include/cpplib.h   | 4 ++--
 libcpp/init.cc| 1 +
 6 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 108adc5caf8..02ce1e86cdb 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename)
   /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
 flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
+  cpp_opts->unsigned_utf8char = flag_char8_t ? 1 : cpp_opts->unsigned_char;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
index 8ed85ccfdcd..2994dd38516 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
@@ -1,6 +1,6 @@
 // Test that UTF-8 character literals have type char if -fchar8_t is not enabled.
 // { dg-do compile }
-// { dg-options "-std=c++17 -fno-char8_t" }
+// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" }
 
 template
   struct is_same
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 > 0
+#error "UTF-8 character literals not signed in preprocessor"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
index 7861736689c..db4fe70046d 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 < 0
+#error "UTF-8 character literals not unsigned in preprocessor"
+#endif
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index ca8b7cf7aa5..12e31632228 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str,
   /* Multichar constants are of type int and therefore signed.  */
   if (i > 1)
 unsigned_p = 0;
-  else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus))
-unsigned_p = 1;
+  else if (type == CPP_UTF8CHAR)
+unsigned_p = CPP_OPTION (pfile, unsigned_utf8char);
   else
 unsigned_p = CPP_OPTION (pfile, unsigned_char);
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..f9c042db034 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -581,8 +581,8 @@ struct cpp_options
  ints and target wide characters, respectively.  */
   size_t precision, char_precision, int_precision, wchar_precision;
 
-  /* True means chars (wide chars) are unsigned.  */
-  bool unsigned_char, unsigned_wchar;
+  /* True means chars (wide chars, UTF-8 chars) are unsigned.  */
+  bool unsigned_char, unsigned_wchar, unsigned_utf8char;
 
   /* True if the most significant byte in a word has the lowest
  address in memory.  */
diff --git a/libcpp/init.cc b/libcpp/init.cc
index f4ab83d2145..0242da5f55c 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int);
   CPP_OPTION (pfile, unsigned_char) = 0;
   CPP_OPTION (pfile, unsigned_wchar) = 1;
+  CPP_OPTION (pfile, unsigned_utf8char) = 1;
   CPP_OPTION (pfile, bytes_big_endian) = 1;  /* does not matter */
 
   /* Default to no charset conversion.  */
-- 
2.32.0



[PATCH 1/3] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library
changes adopted for C2X via WG14 N2653.  The changes include:
- Change of type for UTF-8 string literals from array of const char to
  array of const char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.
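
The string literal type change has a C++ counterpart that can be checked at compile time. The sketch below is a C++ analogue for illustration only, not one of the C tests in this series; note that in C++ char8_t is a distinct builtin type, whereas the C2X change described above makes it unsigned char. The typedef name u8_element_type is invented.

```cpp
#include <type_traits>

// Element type of a UTF-8 string literal, with the reference and const
// qualifier stripped (decltype of the subscript yields a const lvalue
// reference).
typedef std::remove_cv<
  std::remove_reference<decltype(u8"x"[0])>::type>::type u8_element_type;

#ifdef __cpp_char8_t
// char8_t enabled: -fchar8_t, or C++20 and later without -fno-char8_t.
static_assert(std::is_same<u8_element_type, char8_t>::value,
              "u8 string literal elements should be char8_t");
#else
// char8_t disabled: UTF-8 string literals are arrays of const char.
static_assert(std::is_same<u8_element_type, char>::value,
              "u8 string literal elements should be char");
#endif
```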

gcc/ChangeLog:

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

* c-parser.c (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.c (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

* c-lex.c (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.c (c_common_post_options): Set flag_char8_t if
targeting C2x.
---
 gcc/c-family/c-lex.cc| 13 +
 gcc/c-family/c-opts.cc   |  4 ++--
 gcc/c/c-parser.cc| 16 ++--
 gcc/c/c-typeck.cc|  2 +-
 gcc/ginclude/stdatomic.h |  8 
 5 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index 8bfa4f4024f..0b6f94e18a8 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-   type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
   else
 type = char_type_node;
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..108adc5caf8 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 92049d1a101..fa9395986de 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -7472,9 +7479,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 {
 default:
 case CPP_STRING:
-case CPP_UTF8STRING:
   TREE_TYPE (value) = char_array_type_node;
   break;
+case CPP_UTF8STRING:
+  if (flag_char8_t)
+   TREE_TYPE (value) = char8_array_type_node;
+  else
+   TREE_TYPE (value) = char_array_type_node;
+  break;
 case CPP_STRING16:
   TREE_TYPE (value) = char16_array_type_node;
   break;
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index fd0a7f81a7a..231f4e980b6 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8045,7 +8045,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
 
  if (char_array)
{
- if (typ2 != char_type_node)
+ if (typ2 != char_type_node && typ2 != char8_type_node)
incompat_string_cst = true;
}
  else if (!comptypes (typ1, typ2))
diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index bfcfdf664c7..75ed7965689 100644

[PATCH 2/3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-07-25 Thread Tom Honermann via Gcc-patches
This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c2x-predefined-macros.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c2x-predefined-macros.c  | 11 +
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 .../gcc.dg/gnu2x-predefined-macros.c  |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 8 files changed, 142 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */
+
+#include 
+#include 
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)  \
+  do   \
+{  \
+  int r1 = MACRO;  \
+  int r2 = atomic_is_lock_free (&V1);  \
+  int r3 = atomic_is_lock_free (&V2);  \
+  if (r1 != 0 && r1 != 1 && r1 != 2)   \
+   abort ();   \
+  if (r2 != 0 && r2 != 1)  \
+   abort ();   \
+  if (r3 != 0 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r2 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 0 && r2 != 0)  \
+   abort ();   \
+  if (r1 == 0 && r3 != 0)  \
+   abort ();   \
+}  \
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
new file mode 100644
index 000..3456105563a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
@@ -0,0 +1,11 @@
+/* Test C2X predefined macros.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str-type.c b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
new file mode 100644
index 000..1ae86955516
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C2X UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, unsigned char*: 2) == 2, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, unsigned char:  2) == 2, "UTF-8 string literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str.c 
b/gcc/testsuite/gcc.dg/c2x-utf

Re: [PATCH 3/3 v2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).

PR preprocessor/106426

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.

gcc/testsuite/ChangeLog:
* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.

libcpp/ChangeLog:
* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
literals based on unsigned_utf8char.
* include/cpplib.h (cpp_options): Add unsigned_utf8char.
* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
---
 gcc/c-family/c-opts.cc| 1 +
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +-
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 
 libcpp/charset.cc | 4 ++--
 libcpp/include/cpplib.h   | 4 ++--
 libcpp/init.cc| 1 +
 6 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 108adc5caf8..02ce1e86cdb 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename)
   /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
 flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
+  cpp_opts->unsigned_utf8char = flag_char8_t ? 1 : cpp_opts->unsigned_char;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
index 8ed85ccfdcd..2994dd38516 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
@@ -1,6 +1,6 @@
 // Test that UTF-8 character literals have type char if -fchar8_t is not enabled.
 // { dg-do compile }
-// { dg-options "-std=c++17 -fno-char8_t" }
+// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" }
 
 template
   struct is_same
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 > 0
+#error "UTF-8 character literals not signed in preprocessor"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
index 7861736689c..db4fe70046d 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 < 0
+#error "UTF-8 character literals not unsigned in preprocessor"
+#endif
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index ca8b7cf7aa5..12e31632228 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str,
   /* Multichar constants are of type int and therefore signed.  */
   if (i > 1)
 unsigned_p = 0;
-  else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus))
-unsigned_p = 1;
+  else if (type == CPP_UTF8CHAR)
+unsigned_p = CPP_OPTION (pfile, unsigned_utf8char);
   else
 unsigned_p = CPP_OPTION (pfile, unsigned_char);
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..f9c042db034 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -581,8 +581,8 @@ struct cpp_options
  ints and target wide characters, respectively.  */
   size_t precision, char_precision, int_precision, wchar_precision;
 
-  /* True means chars (wide chars) are unsigned.  */
-  bool unsigned_char, unsigned_wchar;
+  /* True means chars (wide chars, UTF-8 chars) are unsigned.  */
+  bool unsigned_char, unsigned_wchar, unsigned_utf8char;
 
   /* True if the most significant byte in a word has the lowest
  address in memory.  */
diff --git a/libcpp/init.cc b/libcpp/init.cc
index f4ab83d2145..0242da5f55c 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int);
   CPP_OPTION (pfile, unsigned_char) = 0;
   CPP_OPTION (pfile, unsigned_wchar) = 1;
+  CPP_OPTION (pfile, unsigned_utf8char) = 1;
   CPP_OPTION (pfile, bytes_big_endian) = 1;  /* does not matter */
 
   /* Default to no charset conversion.  */
-- 
2.32.0



Re: [PATCH 3/3] c++/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-07-25 Thread Tom Honermann via Gcc-patches

On 7/25/22 2:05 PM, Andrew Pinski wrote:

On Mon, Jul 25, 2022 at 11:01 AM Tom Honermann via Gcc-patches
 wrote:

This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).
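
As a rough illustration (not part of the patch), the two views of a UTF-8
character literal can be compared side by side. The sketch below assumes
C++17 or later (so that u8 character literals exist); every constant name
in it is hypothetical and chosen only for this example.

```cpp
#include <type_traits>

// Signedness of a UTF-8 character literal as seen by the preprocessor.
// If the literal is unsigned, 0 - 1 wraps to a large positive value.
#if u8'\0' - 1 < 0
constexpr bool pp_u8_char_is_signed = true;
#else
constexpr bool pp_u8_char_is_signed = false;
#endif

// Signedness of the same literal's type as seen by the language proper.
constexpr bool lang_u8_char_is_signed
  = static_cast<decltype(u8'\0')>(-1) < 0;

// Whether char8_t support is enabled in this translation unit.
#ifdef __cpp_char8_t
constexpr bool char8_enabled = true;
#else
constexpr bool char8_enabled = false;
#endif

// With the fix described above, this is expected to be true in all modes;
// a compiler without the fix may yield false in char8_t enabled modes.
constexpr bool u8_char_signedness_consistent
  = pp_u8_char_is_signed == lang_u8_char_is_signed;
```

When char8_t support is disabled, both views should simply follow plain
char, so -fsigned-char and -funsigned-char flip both constants together.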

Fixes https://gcc.gnu.org/PR106426.

The above mention of the PR # should just be:
preprocessor/106426

And then when this patch gets committed, it will be recorded in bugzilla also.


Thank you. I re-sent the patch with a revised subject line and commit 
message to reflect the component change in Bugzilla.


Tom.



Thanks,
Andrew Pinski



Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-30 Thread Tom Honermann via Gcc-patches

On 7/27/22 7:09 PM, Joseph Myers wrote:

On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote:


Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also

There are lots of C++ warning options, all of which should support pragma
suppression regardless of whether they are relevant to the preprocessor or
not.  Do they all need this kind of handling, or is it only -Wc++20-compat
that has some kind of problem?


I had only checked -Wc++20-compat when working on the patch.

I did some spot checking now and confirmed that suppression works as 
expected for C++ for at least the following warnings:

  -Wuninitialized
  -Warray-compare
  -Wbool-compare
  -Wtautological-compare
  -Wterminate

I don't know the diagnostic framework well. As best I can tell, this 
issue is specific to the -Wc++20-compat option and when the particular 
diagnostic is issued (e.g., during lexing as opposed to during parsing). 
The following call chains appear to be relevant.
  cp_lexer_new_main -> cp_lexer_handle_early_pragma -> 
c_invoke_early_pragma_handler

  cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler
  (where * might be "declaration", "toplevel_declaration", 
"class_head", "objc_interstitial_code", ...)


The -Wc++20-compat enabled warning regarding new keywords in C++20 is 
issued from cp_lexer_get_preprocessor_token.


Tom.



Re: [PATCH 1/3] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-07-30 Thread Tom Honermann via Gcc-patches

On 7/27/22 7:20 PM, Joseph Myers wrote:

On Mon, 25 Jul 2022, Tom Honermann via Gcc-patches wrote:


diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index bfcfdf664c7..75ed7965689 100644
--- a/gcc/ginclude/stdatomic.h
+++ b/gcc/ginclude/stdatomic.h
@@ -49,6 +49,10 @@ typedef _Atomic long atomic_long;
  typedef _Atomic unsigned long atomic_ulong;
  typedef _Atomic long long atomic_llong;
  typedef _Atomic unsigned long long atomic_ullong;
+#if (defined(__CHAR8_TYPE__) \
+ && (defined(_GNU_SOURCE) || defined(_ISOC2X_SOURCE)))
+typedef _Atomic __CHAR8_TYPE__ atomic_char8_t;
+#endif
  typedef _Atomic __CHAR16_TYPE__ atomic_char16_t;
  typedef _Atomic __CHAR32_TYPE__ atomic_char32_t;
  typedef _Atomic __WCHAR_TYPE__ atomic_wchar_t;

GCC headers don't test glibc feature test macros such as _GNU_SOURCE and
_ISOC2X_SOURCE; they base things only on the standard version (whether
directly, or indirectly as via __CHAR8_TYPE__) and standard-defined
feature test macros.


Ok, thank you, that makes sense. I'll follow up with a revised patch 
that removes the additional conditions.
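
A minimal sketch of the guard style described above, keyed only on the
compiler-predefined __CHAR8_TYPE__ macro and not on any glibc feature test
macro. The real header is C and uses _Atomic; this is a C++ analogue, and
the sketch_* names are hypothetical.

```cpp
#include <atomic>

// Depend only on what the compiler itself predefines.
#ifdef __CHAR8_TYPE__
typedef __CHAR8_TYPE__ sketch_char8_type;
#else
// Fallback for modes where the macro is not predefined (an assumption
// made for this sketch so that it compiles everywhere).
typedef unsigned char sketch_char8_type;
#endif

// C++ stand-in for "typedef _Atomic __CHAR8_TYPE__ atomic_char8_t;".
typedef std::atomic<sketch_char8_type> sketch_atomic_char8;
```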


Tom.



(There's one exception in glimits.h - testing __USE_GNU, the macro defined
internally by glibc's headers - but I don't think that's something we want
to emulate in new code.)



Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-31 Thread Tom Honermann via Gcc-patches

On 7/31/22 11:05 AM, Lewis Hyatt wrote:

On Sat, Jul 30, 2022 at 7:06 PM Tom Honermann via Gcc-patches
  wrote:

On 7/27/22 7:09 PM, Joseph Myers wrote:

On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote:


Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also

There are lots of C++ warning options, all of which should support pragma
suppression regardless of whether they are relevant to the preprocessor or
not.  Do they all need this kind of handling, or is it only -Wc++20-compat
that has some kind of problem?

I had only checked -Wc++20-compat when working on the patch.

I did some spot checking now and confirmed that suppression works as
expected for C++ for at least the following warnings:
-Wuninitialized
-Warray-compare
-Wbool-compare
-Wtautological-compare
-Wterminate

I don't know the diagnostic framework well. As best I can tell, this
issue is specific to the -Wc++20-compat option and when the particular
diagnostic is issued (e.g., during lexing as opposed to during parsing).
The following call chains appear to be relevant.
cp_lexer_new_main -> cp_lexer_handle_early_pragma ->
c_invoke_early_pragma_handler
cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler
(where * might be "declaration", "toplevel_declaration",
"class_head", "objc_interstitial_code", ...)

The -Wc++20-compat enabled warning regarding new keywords in C++20 is
issued from cp_lexer_get_preprocessor_token.

Tom.


I have been working on improving the handling of "#pragma GCC
diagnostic" lately. The behavior for C++ changed since r13-1544
(https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e46f4d7430c5210465791603735ab219ef263c51).
I have some more comments about the patch's approach on the PR
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c44).

"#pragma GCC diagnostic" formerly did not work in C++ at all, for
diagnostics generated by libcpp, because C++ obtains all the tokens
from libcpp first (including deferred pragmas), and then processes
them afterward, too late to take effect for diagnostics that libcpp
has already emitted. r13-1544 fixed this up by adding an early pragma
handler, which runs as soon as a deferred pragma token is seen and
handles diagnostic pragmas if they pertain to libcpp-controlled
diagnostics. Non-libcpp diagnostics still need to be handled later,
during parsing, or else they get processed too early and it leads to
other problems. Basically, now each diagnostic pragma is handled as
close in time as possible to the time the associated diagnostics might
be generated.

The early pragma handler determines that an option comes from libcpp,
and so should be subject to early processing, if it was marked as such
in the options definition file. Tom's patch points out that
-Wc++20-compat needs to be handled early, and so marking it as a
libcpp diagnostic in c-family/c.opt arranges for that to work as
intended. Now one potential objection here is that -Wc++20-compat
warnings are not technically generated by libcpp. They are generated
by the C++ frontend immediately after lexing an identifier token from
libcpp (cp_lexer_get_preprocessor_token()). But the distinction
between these two steps is rather blurry, and it seems logical to me
to denote this as a libcpp-related option. Also, the same is already
done for -Wc++11-compat. Otherwise, we would need to add some new
option property to indicate which ones need to be handled for pragmas
at lexing time rather than parsing time.
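
The split described above can be modeled with a toy sketch. This is not
GCC code: Phase, Option, from_libcpp and handled_during are hypothetical
names, with from_libcpp standing in for the libcpp annotation in the
options definition file.

```cpp
// Toy model of when a "#pragma GCC diagnostic" for a given option is
// applied; it only mirrors the rule described in the text above.
enum class Phase { Lexing, Parsing };

struct Option
{
  const char *name;
  bool from_libcpp;  // hypothetical stand-in for the c.opt annotation
};

// Options marked as libcpp-related are handled by the early pragma
// handler (at lexing time); all others wait until parsing.
constexpr bool
handled_during (Phase phase, const Option &opt)
{
  return opt.from_libcpp ? phase == Phase::Lexing : phase == Phase::Parsing;
}

constexpr Option wcxx20_compat = { "-Wc++20-compat", true };
constexpr Option wuninitialized = { "-Wuninitialized", false };
```

In this model, marking -Wc++20-compat as libcpp-related is what moves its
pragma handling ahead of the keyword warning issued while lexing.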

At the moment I don't see any other diagnostics issued from
cp_lexer_get_preprocessor_token() that would need similar adjustments.
Assuming the approach is OK, it might be nice to add a comment to that
function, indicating that any diagnostics emitted there should be
annotated as libcpp options in the .opt file?


Thank you for those details; I wasn't aware of that history.

If I'm interpreting your response correctly, it sounds like you agree 
with the direction of the patch.


If you like, I can add a comment as you suggested and re-post the patch. 
Perhaps:


diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
/* Store the next token from the preprocessor in *TOKEN.  Return true
   if we reach EOF.  If LEXER is NULL, assume we are handling an
   initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strin

Re: [PATCH 2/3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-07-31 Thread Tom Honermann via Gcc-patches

On 7/27/22 7:23 PM, Joseph Myers wrote:

On Mon, 25 Jul 2022, Tom Honermann via Gcc-patches wrote:


This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

I'd expect this patch also to add tests verifying that u8"" strings have
the old type for C11 (unless there are existing such tests, but I don't
see them).

Agreed, good catch. Thank you.



diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c 
b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */

I don't think _ISOC2X_SOURCE belongs in any GCC tests.

That was necessary because the first patch in this series omitted the 
atomic_char8_t and ATOMIC_CHAR8_T_LOCK_FREE definitions unless one of 
_GNU_SOURCE or _ISOC2X_SOURCE was defined. Per review of that first 
patch, those conditions will be removed, so there will be no need to 
define them here.



diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c 
b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */

Nor does _GNU_SOURCE (unless the test depends on glibc functionality
that's only available with _GNU_SOURCE, but in that case you also need
some effective-target conditionals to restrict it to appropriate glibc
targets).


Ditto.

I'll post new patches shortly.

Tom.



[PATCH 1/3 v2] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library
changes adopted for C2X via WG14 N2653.  The changes include:
- Change of type for UTF-8 string literals from array of const char to
  array of const char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.
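
The patch itself targets C, where the tests use _Generic. As a hedged
C++ analogue of the same type distinction, the sketch below reports which
element type a u8 string literal has in the current mode. The names
u8_elem, u8_elem_is_char8 and u8_elem_is_char are hypothetical.

```cpp
#include <type_traits>

// Element type of a UTF-8 string literal, with const and the reference
// produced by the subscript expression stripped off.
typedef std::remove_cv<
  std::remove_reference<decltype(u8"x"[0])>::type>::type u8_elem;

// char8_t can only be named when char8_t support is enabled.
#ifdef __cpp_char8_t
constexpr bool u8_elem_is_char8 = std::is_same<u8_elem, char8_t>::value;
#else
constexpr bool u8_elem_is_char8 = false;
#endif

constexpr bool u8_elem_is_char = std::is_same<u8_elem, char>::value;
```

Exactly one of the two constants should be true: the char8_t one in
char8_t enabled modes, the char one otherwise.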

gcc/ChangeLog:

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

* c-parser.c (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.c (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

* c-lex.c (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.c (c_common_post_options): Set flag_char8_t if
targeting C2x.
---
 gcc/c-family/c-lex.cc| 13 +
 gcc/c-family/c-opts.cc   |  4 ++--
 gcc/c/c-parser.cc| 16 ++--
 gcc/c/c-typeck.cc|  2 +-
 gcc/ginclude/stdatomic.h |  6 ++
 5 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index 8bfa4f4024f..0b6f94e18a8 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool 
objc_string, bool translate)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-   type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
   else
 type = char_type_node;
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..108adc5caf8 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 92049d1a101..fa9395986de 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool 
translate, bool wide_ok)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -7472,9 +7479,14 @@ c_parser_string_literal (c_parser *parser, bool 
translate, bool wide_ok)
 {
 default:
 case CPP_STRING:
-case CPP_UTF8STRING:
   TREE_TYPE (value) = char_array_type_node;
   break;
+case CPP_UTF8STRING:
+  if (flag_char8_t)
+   TREE_TYPE (value) = char8_array_type_node;
+  else
+   TREE_TYPE (value) = char_array_type_node;
+  break;
 case CPP_STRING16:
   TREE_TYPE (value) = char16_array_type_node;
   break;
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index fd0a7f81a7a..231f4e980b6 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8045,7 +8045,7 @@ digest_init (location_t init_loc, tree type, tree init, 
tree origtype,
 
  if (char_array)
{
- if (typ2 != char_type_node)
+ if (typ2 != char_type_node && typ2 != char8_type_node)
incompat_string_cst = true;
}
  else if (!comptypes (typ1, typ2))
diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index bfcfdf664c7..9f2475b739d 100644
--

[PATCH 2/3 v2] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches
This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c11-utf8str-type.c: New test.
* gcc.dg/c17-utf8str-type.c: New test.
* gcc.dg/c2x-predefined-macros.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-predefined-macros.c  | 11 +
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 .../gcc.dg/gnu2x-predefined-macros.c  |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 10 files changed, 154 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 
gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c 
b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..1b692f55ed0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+#include 
+#include 
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)  \
+  do   \
+{  \
+  int r1 = MACRO;  \
+  int r2 = atomic_is_lock_free (&V1);  \
+  int r3 = atomic_is_lock_free (&V2);  \
+  if (r1 != 0 && r1 != 1 && r1 != 2)   \
+   abort ();   \
+  if (r2 != 0 && r2 != 1)  \
+   abort ();   \
+  if (r3 != 0 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r2 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 0 && r2 != 0)  \
+   abort ();   \
+  if (r1 == 0 && r3 != 0)  \
+   abort ();   \
+}  \
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c 
b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..27a3cfe3552
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c11-utf8str-type.c 
b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
new file mode 100644
index 000..8be9abb9686
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C11 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string 
literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, default: 2) == 1, "UTF-8 string 
literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c17-utf8str-type.c 
b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
new file mode 100644
index 000..515c6db3970
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C17 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c17" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string 
literals 

[PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-08-01 Thread Tom Honermann via Gcc-patches
Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]
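
For illustration only, this is the kind of suppression the change is
meant to make effective. The typedef below imitates what a header might
declare for pre-C++20 modes; it is an assumption for this sketch and not
a copy of the glibc header.

```cpp
// Suppress -Wc++20-compat only around the declaration that needs it.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
#ifndef __cpp_char8_t
// char8_t is an ordinary identifier here (C++17 and earlier, or
// -fno-char8_t), so a typedef is allowed but triggers the warning
// when -Wc++20-compat is enabled and not suppressed.
typedef unsigned char char8_t;
#endif
#pragma GCC diagnostic pop

// Example use that compiles whether char8_t is the typedef or the keyword.
constexpr unsigned
first_code_unit (const char8_t *s)
{
  return s[0];
}
```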

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
 gcc/c-family/c-opts.cc |  7 +++
 gcc/c-family/c.opt |  2 +-
 gcc/cp/parser.cc   |  5 -
 gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
 libcpp/include/cpplib.h|  4 
 libcpp/init.cc |  1 +
 7 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
 warn_narrowing = 0;
 
+  if (cxx_dialect >= cxx20)
+{
+  /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+  warn_cxx20_compat = 0;
+  cpp_opts->cpp_warn_cxx20_compat = 0;
+}
+
   /* C++17 has stricter evaluation order requirements; let's use some of them
  for earlier C++ as well, so chaining works as expected.  */
   if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
 C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
 
 Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) 
Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO 
C++ 2020.
 
 Wc++11-extensions
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
 /* Store the next token from the preprocessor in *TOKEN.  Return true
if we reach EOF.  If LEXER is NULL, assume we are handling an
initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option via the CppReason property.  */
 
 static void
 cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C 
b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+// Validate suppression of -Wc++11-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++11-compat"
+int alignof;
+int alignas;
+int constexpr;
+int decltype;
+int noexcept;
+int nullptr;
+int static_assert;
+int thread_local;
+int _Alignas;
+int _Alignof;
+int _Thread_local;
diff --git a/gcc/testsuite/g++.dg/cpp2a/keywords2.C 
b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
new file mode 100644
index 000..8714a7b26b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++17_down } }
+// { dg-options "-Wc++20-compat" }
+
+// Validate suppression of -Wc++20-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++20-compat"
+int constinit;
+int consteval;
+int re

Re: [PATCH 2/3 v2] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches

On 8/1/22 3:13 PM, Joseph Myers wrote:

On Mon, 1 Aug 2022, Tom Honermann via Gcc-patches wrote:


diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c 
b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
new file mode 100644
index 000..3456105563a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
@@ -0,0 +1,11 @@
+/* Test C2X predefined macros.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif

These aren't macros defined by C2X.  You could argue that they are part of
the stable interface provided by GCC for e.g. libc implementations to use,
and so should be tested as such, but any such test shouldn't suggest it's
testing a standard feature (and should have a better name to describe what
it's actually testing rather than suggesting it's about predefined macros
in general).

Fair point. This test is redundant anyway; these macros are directly or 
indirectly exercised by the other tests. I'll just remove it.


Tom.



[PATCH 2/3 v3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches
This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c11-utf8str-type.c: New test.
* gcc.dg/c17-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 .../gcc.dg/gnu2x-predefined-macros.c  |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 9 files changed, 143 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 
gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c 
b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..1b692f55ed0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+#include 
+#include 
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)  \
+  do   \
+{  \
+  int r1 = MACRO;  \
+  int r2 = atomic_is_lock_free (&V1);  \
+  int r3 = atomic_is_lock_free (&V2);  \
+  if (r1 != 0 && r1 != 1 && r1 != 2)   \
+   abort ();   \
+  if (r2 != 0 && r2 != 1)  \
+   abort ();   \
+  if (r3 != 0 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r2 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 0 && r2 != 0)  \
+   abort ();   \
+  if (r1 == 0 && r3 != 0)  \
+   abort ();   \
+}  \
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c 
b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..27a3cfe3552
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c11-utf8str-type.c 
b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
new file mode 100644
index 000..8be9abb9686
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C11 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string 
literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, default: 2) == 1, "UTF-8 string 
literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c17-utf8str-type.c 
b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
new file mode 100644
index 000..515c6db3970
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C17 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c17" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, default: 2) == 1, "UTF-8 string literal elements 

Re: [PATCH 2/3 v3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-02 Thread Tom Honermann via Gcc-patches

On 8/2/22 12:53 PM, Joseph Myers wrote:

On Mon, 1 Aug 2022, Tom Honermann via Gcc-patches wrote:


This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

Could you please send a complete patch series?  I'm not sure what the
matching patches 1 and 3 are.  Also, I don't generally find it helpful for
tests to be separated from the patch making the changes they test, since
tests are necessary to review of that code.


Absolutely. I'll merge the implementation and test commits, so the next 
series (v4) will have just two commits; one for the C2X N2653 
implementation and the other for the C++ u8 preprocessor string type 
fix. Coming right up.


Tom.



[PATCH v4 0/2] Implement C2X N2653 (char8_t) and correct UTF-8 character literal type in preprocessor directives for C++

2022-08-02 Thread Tom Honermann via Gcc-patches
This patch series provides an implementation and tests for the WG14 N2653
paper as adopted for C2X.

Additionally, a fix is included for the C++ preprocessor to treat UTF-8
character literals in preprocessor directives as an unsigned type in char8_t
enabled modes (in C++17 and earlier with -fchar8_t or in C++20 or later
without -fno-char8_t).
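
The mode selection described above can be sketched in plain C. This is only an illustration of the defaulting logic as it appears in the c-opts.cc hunks of this series; the struct and function names (demo_opts, demo_char8_t_enabled, demo_unsigned_utf8char) are hypothetical and are not GCC identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler's option state.  */
struct demo_opts
{
  int flag_char8_t;     /* -1: not given, 0: -fno-char8_t, 1: -fchar8_t */
  bool cxx20_or_later;  /* models cxx_dialect >= cxx20 */
  bool isoc2x;          /* models flag_isoc2x */
  bool unsigned_char;   /* models the plain char signedness in effect */
};

/* Effective char8_t setting once defaults are applied.  */
static bool
demo_char8_t_enabled (const struct demo_opts *o)
{
  if (o->flag_char8_t == -1)
    return o->cxx20_or_later || o->isoc2x;
  return o->flag_char8_t != 0;
}

/* Signedness used for u8 character literals in #if expressions.  */
static bool
demo_unsigned_utf8char (const struct demo_opts *o)
{
  return demo_char8_t_enabled (o) ? true : o->unsigned_char;
}

/* Usage: C++17 with signed plain char and no char8_t option stays
   signed; adding -fchar8_t makes the literal unsigned.  */
static void
demo_opts_usage (void)
{
  struct demo_opts cxx17 = { -1, false, false, false };
  assert (!demo_unsigned_utf8char (&cxx17));
  cxx17.flag_char8_t = 1;
  assert (demo_unsigned_utf8char (&cxx17));
}
```

An explicit option always wins over the dialect default, which is why -fno-char8_t in C++20 falls back to the signedness of plain char.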

Tom Honermann (2):
  C: Implement C2X N2653 char8_t and UTF-8 string literal changes
  preprocessor/106426: Treat u8 character literals as unsigned in
char8_t modes.

 gcc/c-family/c-lex.cc | 13 --
 gcc/c-family/c-opts.cc|  5 ++-
 gcc/c/c-parser.cc | 16 ++-
 gcc/c/c-typeck.cc |  2 +-
 gcc/ginclude/stdatomic.h  |  6 +++
 .../g++.dg/ext/char8_t-char-literal-1.C   |  6 ++-
 .../g++.dg/ext/char8_t-char-literal-2.C   |  4 ++
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 libcpp/charset.cc |  4 +-
 libcpp/include/cpplib.h   |  4 +-
 libcpp/init.cc|  1 +
 18 files changed, 185 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

-- 
2.32.0



[PATCH v4 1/2] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-08-02 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library
changes adopted for C2X via WG14 N2653.  The changes include:
- Change of type for UTF-8 string literals from array of char to
  array of char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.
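
A minimal sketch of how the string literal type change can be observed from user code, using the same _Generic technique as the new *-utf8str-type.c tests.  The macro and function names are made up for illustration.  The result depends on the language mode: char in C11/C17, and unsigned char once a compiler implements N2653 for C2X.

```c
#include <assert.h>

/* 0: char, 1: unsigned char, 2: anything else.  */
#define DEMO_ELEM_KIND(expr) \
  _Generic ((expr), char: 0, unsigned char: 1, default: 2)

/* Element type of a UTF-8 string literal in the current mode.  */
static int
demo_u8_element_kind (void)
{
  return DEMO_ELEM_KIND (u8"x"[0]);
}

/* Element type of an ordinary string literal, for comparison.  */
static int
demo_plain_element_kind (void)
{
  return DEMO_ELEM_KIND ("x"[0]);
}

/* Usage: ordinary literals are always char; u8 literals are char or
   unsigned char depending on the mode.  */
static void
demo_kind_usage (void)
{
  assert (demo_plain_element_kind () == 0);
  assert (demo_u8_element_kind () == 0 || demo_u8_element_kind () == 1);
}
```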

gcc/ChangeLog:

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.cc (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

* c-lex.cc (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.cc (c_common_post_options): Set flag_char8_t if
targeting C2x.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c11-utf8str-type.c: New test.
* gcc.dg/c17-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 gcc/c-family/c-lex.cc | 13 --
 gcc/c-family/c-opts.cc|  4 +-
 gcc/c/c-parser.cc | 16 ++-
 gcc/c/c-typeck.cc |  2 +-
 gcc/ginclude/stdatomic.h  |  6 +++
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 13 files changed, 170 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index 8bfa4f4024f..0b6f94e18a8 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-   type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
   else
 type = char_type_node;
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..108adc5caf8 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 92049d1a101..fa9395986de 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+

[PATCH v4 2/2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-08-02 Thread Tom Honermann via Gcc-patches
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).
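
The tests added by this patch detect signedness with the expression "literal - 1" inside #if.  The sketch below shows the same probe with an ordinary character literal, so that it compiles in any C mode (u8 character literals need C2X or a char8_t enabled C++ mode).  The names are illustrative only, and the final check assumes GCC's behavior of giving ordinary character literals in #if the signedness of plain char.

```c
#include <assert.h>

/* In #if arithmetic an unsigned operand makes "0 - 1" wrap around to a
   very large positive value, while a signed operand yields -1.  */
#if '\0' - 1 < 0
# define DEMO_PP_CHAR_IS_SIGNED 1
#else
# define DEMO_PP_CHAR_IS_SIGNED 0
#endif

static int
demo_pp_char_is_signed (void)
{
  return DEMO_PP_CHAR_IS_SIGNED;
}

/* Usage: with GCC the preprocessor's view is expected to match the
   signedness of plain char in the compiled code.  */
static void
demo_pp_usage (void)
{
  assert (demo_pp_char_is_signed () == ((char) -1 < 0));
}
```

Replacing '\0' with u8'\0' in a char8_t enabled mode gives the check used in char8_t-char-literal-2.C.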

PR preprocessor/106426

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.

gcc/testsuite/ChangeLog:
* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.

libcpp/ChangeLog:
* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
literals based on unsigned_utf8char.
* include/cpplib.h (cpp_options): Add unsigned_utf8char.
* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
---
 gcc/c-family/c-opts.cc| 1 +
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +-
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 
 libcpp/charset.cc | 4 ++--
 libcpp/include/cpplib.h   | 4 ++--
 libcpp/init.cc| 1 +
 6 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 108adc5caf8..02ce1e86cdb 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename)
   /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
 flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
+  cpp_opts->unsigned_utf8char = flag_char8_t ? 1 : cpp_opts->unsigned_char;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
index 8ed85ccfdcd..2994dd38516 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
@@ -1,6 +1,6 @@
 // Test that UTF-8 character literals have type char if -fchar8_t is not enabled.
 // { dg-do compile }
-// { dg-options "-std=c++17 -fno-char8_t" }
+// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" }
 
 template
   struct is_same
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 > 0
+#error "UTF-8 character literals not signed in preprocessor"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
index 7861736689c..db4fe70046d 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 < 0
+#error "UTF-8 character literals not unsigned in preprocessor"
+#endif
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index ca8b7cf7aa5..12e31632228 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str,
   /* Multichar constants are of type int and therefore signed.  */
   if (i > 1)
 unsigned_p = 0;
-  else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus))
-unsigned_p = 1;
+  else if (type == CPP_UTF8CHAR)
+unsigned_p = CPP_OPTION (pfile, unsigned_utf8char);
   else
 unsigned_p = CPP_OPTION (pfile, unsigned_char);
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..f9c042db034 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -581,8 +581,8 @@ struct cpp_options
  ints and target wide characters, respectively.  */
   size_t precision, char_precision, int_precision, wchar_precision;
 
-  /* True means chars (wide chars) are unsigned.  */
-  bool unsigned_char, unsigned_wchar;
+  /* True means chars (wide chars, UTF-8 chars) are unsigned.  */
+  bool unsigned_char, unsigned_wchar, unsigned_utf8char;
 
   /* True if the most significant byte in a word has the lowest
  address in memory.  */
diff --git a/libcpp/init.cc b/libcpp/init.cc
index f4ab83d2145..0242da5f55c 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int);
   CPP_OPTION (pfile, unsigned_char) = 0;
   CPP_OPTION (pfile, unsigned_wchar) = 1;
+  CPP_OPTION (pfile, unsigned_utf8char) = 1;
   CPP_OPTION (pfile, bytes_big_endian) = 1;  /* does not matter */
 
   /* Default to no charset conversion.  */
-- 
2.32.0



Re: [PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-08-04 Thread Tom Honermann via Gcc-patches
Are there any further concerns with this patch? If not, I extend my 
gratitude to anyone so kind as to commit this for me as I don't have 
commit access.


I just noticed that I neglected to add a ChangeLog entry for the comment 
addition to gcc/cp/parser.cc. Noted inline below. I can re-send the 
patch with that update if desired.


Tom.

On 8/1/22 2:49 PM, Tom Honermann wrote:

Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
   warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).
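
For readers unfamiliar with the pragma form being fixed, here is a minimal sketch of the push/ignored/pop pattern.  It is written in C, where -Wc++20-compat does not exist, so -Wunused-variable is used as a stand-in; in a C++17 translation unit the same shape would name "-Wc++20-compat" and wrap the affected #include.  The function names are hypothetical.

```c
#include <assert.h>

/* Stand-in option: the pattern, not the option name, is the point.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"
static int
demo_quiet_region (void)
{
  int never_read;  /* would warn under -Wall without the pragma */
  return 42;
}
#pragma GCC diagnostic pop

/* Usage: the pragmas only affect diagnostics, not behavior.  */
static void
demo_pragma_usage (void)
{
  assert (demo_quiet_region () == 42);
}
```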

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.


gcc/cp/ChangeLog:
    * parser.cc (cp_lexer_saving_tokens): Add comment regarding 
diagnostic requirements.




gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
  gcc/c-family/c-opts.cc |  7 +++
  gcc/c-family/c.opt |  2 +-
  gcc/cp/parser.cc   |  5 -
  gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
  gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
  libcpp/include/cpplib.h|  4 
  libcpp/init.cc |  1 +
  7 files changed, 46 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
else if (warn_narrowing == -1)
  warn_narrowing = 0;
  
+  if (cxx_dialect >= cxx20)
+{
+  /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+  warn_cxx20_compat = 0;
+  cpp_opts->cpp_warn_cxx20_compat = 0;
+}
+
/* C++17 has stricter evaluation order requirements; let's use some of them
   for earlier C++ as well, so chaining works as expected.  */
if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
  C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
  
  Wc++20-compat

-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
  Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
  
  Wc++11-extensions

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
  /* Store the next token from the preprocessor in *TOKEN.  Return true
 if we reach EOF.  If LEXER is NULL, assume we are handling an
 initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option via the CppReason property.  */
  
  static void

  cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+// Validate suppression of -Wc++11-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++11-compat"
+int alignof;
+int alignas

Re: [PATCH v4 2/2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-08-08 Thread Tom Honermann via Gcc-patches

On 8/2/22 6:14 PM, Joseph Myers wrote:

On Tue, 2 Aug 2022, Tom Honermann via Gcc-patches wrote:


This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).

OK in the absence of C++ maintainer objections within 72 hours.  (This is
the case where, when I added support for such literals for C (commit
7c5890cc0a0ecea0e88cc39e9fba6385fb579e61), I raised the question of
whether they should be unsigned in the preprocessor for C++ as well.)


Joseph, would you be so kind as to commit this patch series for me? I 
don't have commit access. Thank you in advance!


Tom.



Re: [PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-08-11 Thread Tom Honermann via Gcc-patches
If there are no further concerns, could a C++ or libcpp maintainer 
please commit this for me?


Thank you!

Tom.

On 8/4/22 12:42 PM, Tom Honermann via Gcc-patches wrote:
Are there any further concerns with this patch? If not, I extend my 
gratitude to anyone so kind as to commit this for me as I don't have 
commit access.


I just noticed that I neglected to add a ChangeLog entry for the 
comment addition to gcc/cp/parser.cc. Noted inline below. I can 
re-send the patch with that update if desired.


Tom.

On 8/1/22 2:49 PM, Tom Honermann wrote:

Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
   warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.


gcc/cp/ChangeLog:
    * parser.cc (cp_lexer_saving_tokens): Add comment regarding 
diagnostic requirements.




gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
  gcc/c-family/c-opts.cc |  7 +++
  gcc/c-family/c.opt |  2 +-
  gcc/cp/parser.cc   |  5 -
  gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
  gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
  libcpp/include/cpplib.h    |  4 
  libcpp/init.cc |  1 +
  7 files changed, 46 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
    else if (warn_narrowing == -1)
  warn_narrowing = 0;
  +  if (cxx_dialect >= cxx20)
+    {
+  /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+  warn_cxx20_compat = 0;
+  cpp_opts->cpp_warn_cxx20_compat = 0;
+    }
+
    /* C++17 has stricter evaluation order requirements; let's use 
some of them

   for earlier C++ as well, so chaining works as expected. */
    if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
  C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
    Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
  Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.

    Wc++11-extensions
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
  /* Store the next token from the preprocessor in *TOKEN. Return true
 if we reach EOF.  If LEXER is NULL, assume we are handling an
 initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option via the CppReason property.  */

    static void
  cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-com

[PATCH 0/3]: C N2653 char8_t implementation

2021-06-06 Thread Tom Honermann via Gcc-patches
This series of patches implements the core language features for the 
WG14 N2653 [1] proposal to provide char8_t support in C.  These changes 
are intended to align char8_t support in C with the support provided in 
C++20 via WG21 P0482R6 [2].


These changes do not impact default gcc behavior.  The existing 
-fchar8_t option is extended to C compilation to enable the N2653 
changes, and -fno-char8_t is extended to explicitly disable them.  N2653 
has not yet been accepted by WG14, so no changes are made to handling of 
the C2X language dialect.


Patch 1: Language support
Patch 2: New tests
Patch 3: Documentation updates

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: WG21 P0482R6
 "char8_t: A type for UTF-8 characters and strings (Revision 6)"
 https://wg21.link/p0482r6


[PATCH 1/3]: C N2653 char8_t: Language support

2021-06-06 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library 
changes proposed in WG14 N2653 [1] for C.  The changes include:

- Use of the existing -fchar8_t and -fno-char8_t options to opt-in to
  (or opt-out of) the following changes when compiling C code.
- Change of type for UTF-8 string literals from array of char to array
  of char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of a new
  predefined __GCC_ATOMIC_CHAR8_T_LOCK_FREE macro.
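
As an illustration of what the lock-free macro promises, the sketch below checks that a compile-time ATOMIC_*_LOCK_FREE value agrees with the run-time atomic_is_lock_free query, which is the relationship the accompanying test exercises.  It deliberately uses atomic_uchar and ATOMIC_CHAR_LOCK_FREE as stand-ins so that it builds with any C11 <stdatomic.h>; atomic_char8_t and ATOMIC_CHAR8_T_LOCK_FREE need a compiler carrying this change.  The function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in object with the same size as atomic_char8_t would have.  */
static atomic_uchar demo_obj;

/* Return nonzero if the macro (0 never, 1 sometimes, 2 always
   lock-free) is consistent with the run-time answer.  */
static int
demo_lock_free_consistent (void)
{
  int macro = ATOMIC_CHAR_LOCK_FREE;
  int runtime = atomic_is_lock_free (&demo_obj) ? 1 : 0;

  if (macro == 2)
    return runtime == 1;
  if (macro == 0)
    return runtime == 0;
  return macro == 1;
}

/* Usage: the two views must never contradict each other.  */
static void
demo_lock_free_usage (void)
{
  assert (demo_lock_free_consistent ());
}
```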

When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE 
macro is predefined.  This is the mechanism proposed to glibc to opt-in 
to declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions 
proposed in N2653.  See [2].
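
A sketch of how a C library header might react to the predefined macro.  This is a hypothetical header fragment, not glibc code: demo_char8_t and DEMO_HAVE_CHAR8_T are invented names, and _CHAR8_T_SOURCE was only a proposal when this message was written.

```c
#include <assert.h>

/* Declare the typedef only when the compiler has announced that the
   user asked for char8_t support.  */
#if defined _CHAR8_T_SOURCE
typedef unsigned char demo_char8_t;
# define DEMO_HAVE_CHAR8_T 1
#else
# define DEMO_HAVE_CHAR8_T 0
#endif

static int
demo_have_char8_t (void)
{
  return DEMO_HAVE_CHAR8_T;
}

/* Usage: callers can branch on the macro at compile time.  */
static void
demo_feature_usage (void)
{
#if DEMO_HAVE_CHAR8_T
  assert (sizeof (demo_char8_t) == 1);
#endif
  assert (demo_have_char8_t () == DEMO_HAVE_CHAR8_T);
}
```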


Tested on Linux x86_64.

gcc/ChangeLog:

2021-05-31  Tom Honermann  

 * ginclude/stdatomic.h (atomic_char8_t, ATOMIC_CHAR8_T_LOCK_FREE):
   New typedef and macro.

gcc/c/ChangeLog:

2021-05-31  Tom Honermann  

 * c-parser.c (c_parser_string_literal): Use char8_t as the type of
   CPP_UTF8STRING when char8_t support is enabled.
 * c-typeck.c (digest_init): Handle initialization of an array
   of character type by a string literal with type array of
   unsigned char.

gcc/c-family/ChangeLog:

2021-05-31  Tom Honermann  

 * c-cppbuiltin.c (c_cpp_builtins): Define _CHAR8_T_SOURCE if
   char8_t support is enabled in non-C++ language modes.
 * c-lex.c (lex_string): Use char8_t as the type of
   CPP_UTF8STRING when char8_t support is enabled.
 * c-opts.c (c_common_handle_option): Inform the preprocessor if
   char8_t support is enabled.
 * c.opt (fchar8_t): Enable for C language modes.

libcpp/ChangeLog:

2021-05-31  Tom Honermann  

 * include/cpplib.h (cpp_options): Add char8.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and 
c8rtomb().
 [Patch 0]: 
https://sourceware.org/pipermail/libc-alpha/2021-June/127230.html
 [Patch 1]: 
https://sourceware.org/pipermail/libc-alpha/2021-June/127231.html
 [Patch 2]: 
https://sourceware.org/pipermail/libc-alpha/2021-June/127232.html
 [Patch 3]: 
https://sourceware.org/pipermail/libc-alpha/2021-June/127233.html
commit c4260c7c49822522945377cc2fb93ee9830cefc8
Author: Tom Honermann 
Date:   Sat Feb 13 09:02:34 2021 -0500

N2653 char8_t for C: Language support

This patch implements the core language and compiler dependent library
changes proposed in WG14 N2653 for C.  The changes include:
- Use of the existing -fchar8_t and -fno-char8_t options to opt-in to
  (or opt-out of) the following changes when compiling C code.
- Change of type for UTF-8 string literals from array of char to
  array of char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of a new
  predefined __GCC_ATOMIC_CHAR8_T_LOCK_FREE macro.

When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE
macro is predefined.  This is the mechanism proposed to glibc to opt-in
to declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions
proposed in N2653.

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 42b7604c9ac..3e944ec2b86 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1467,6 +1467,11 @@ c_cpp_builtins (cpp_reader *pfile)
   if (flag_iso)
 cpp_define (pfile, "__STRICT_ANSI__");
 
+  /* Express intent for char8_t support in C (not C++) to the C library if
+ requested.  */
+  if (!c_dialect_cxx () && flag_char8_t)
+cpp_define (pfile, "_CHAR8_T_SOURCE");
+
   if (!flag_signed_char)
 cpp_define (pfile, "__CHAR_UNSIGNED__");
 
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index c44e7a13489..e30e44e9f5c 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1335,7 +1335,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	{
+	  value = build_string (TYPE_PRECISION (char8_type_node)
+/ TYPE_PRECISION (char_type_node),
+"");  /* char8_t is 8 bits */
+	}
+	  else
+	value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 60b5802722c..eefc607dac6 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -718,6 +718,10 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT

[PATCH 2/3]: C N2653 char8_t: New tests

2021-06-06 Thread Tom Honermann via Gcc-patches
This patch provides new tests for the core language and compiler 
dependent library changes proposed in WG14 N2653 [1] for C.


Most of the tests are provided in both a positive (-fchar8_t) and 
negative (-fno-char8_t) form to ensure behaviors are appropriately 
present or absent in each mode.
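
One behavior these tests pin down is that arrays of char, signed char, and unsigned char can still be initialized from UTF-8 string literals regardless of the literal's element type.  A minimal sketch, with illustrative variable names, that also compiles in C11 and C17:

```c
#include <assert.h>

/* All three character types accept a UTF-8 string literal initializer,
   with or without enclosing braces.  */
char demo_cbuf[] = u8"text";
signed char demo_scbuf[] = u8"text";
unsigned char demo_ucbuf[] = { u8"text" };

/* Four code units plus the terminating null in each case.  */
static_assert (sizeof demo_cbuf == 5, "unexpected size for char array");
static_assert (sizeof demo_scbuf == 5, "unexpected size for signed char array");
static_assert (sizeof demo_ucbuf == 5, "unexpected size for unsigned char array");
```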


Tested on Linux x86_64.

gcc/testsuite/ChangeLog:

2021-05-31  Tom Honermann  

* gcc.dg/atomic/stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/char8_t-init-string-literal-1.c: New test.
* gcc.dg/char8_t-predefined-macros-1.c: New test.
* gcc.dg/char8_t-predefined-macros-2.c: New test.
* gcc.dg/char8_t-string-literal-1.c: New test.
* gcc.dg/char8_t-string-literal-2.c: New test.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm


commit 900aa3507defd80339828e5791c215a28efd9fea
Author: Tom Honermann 
Date:   Sat Feb 13 10:02:41 2021 -0500

N2653 char8_t for C: New tests

This change provides new tests for the core language and compiler
dependent library changes proposed in WG14 N2653 for C.

Some of the tests are provided in both a positive (-fchar8_t) and
negative (-fno-char8_t) form to ensure behaviors are appropriately
present or absent in each mode.

diff --git a/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..bb9eae84e83
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c11 -fchar8_t -pedantic-errors" } */
+
+#include <stdatomic.h>
+#include <stdint.h>
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)		\
+  do		\
+{		\
+  int r1 = MACRO;\
+  int r2 = atomic_is_lock_free (&V1);	\
+  int r3 = atomic_is_lock_free (&V2);	\
+  if (r1 != 0 && r1 != 1 && r1 != 2)	\
+	abort ();\
+  if (r2 != 0 && r2 != 1)			\
+	abort ();\
+  if (r3 != 0 && r3 != 1)			\
+	abort ();\
+  if (r1 == 2 && r2 != 1)			\
+	abort ();\
+  if (r1 == 2 && r3 != 1)			\
+	abort ();\
+  if (r1 == 0 && r2 != 0)			\
+	abort ();\
+  if (r1 == 0 && r3 != 0)			\
+	abort ();\
+}		\
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c b/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c
new file mode 100644
index 000..4d587e90a26
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c
@@ -0,0 +1,13 @@
+/* Test that char, signed char, and unsigned char arrays can still be
+   initialized by UTF-8 string literals if -fchar8_t is enabled.  */
+/* { dg-do compile } */
+/* { dg-options "-fchar8_t" } */
+
+char cbuf1[] = u8"text";
+char cbuf2[] = { u8"text" };
+
+signed char scbuf1[] = u8"text";
+signed char scbuf2[] = { u8"text" };
+
+unsigned char ucbuf1[] = u8"text";
+unsigned char ucbuf2[] = { u8"text" };
diff --git a/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c
new file mode 100644
index 000..884c634990d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c
@@ -0,0 +1,16 @@
+// Test that char8_t related predefined macros are not present when -fchar8_t is
+// not enabled.
+// { dg-do compile }
+// { dg-options "-fno-char8_t" }
+
+#if defined(_CHAR8_T_SOURCE)
+# error _CHAR8_T_SOURCE is defined!
+#endif
+
+#if defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is defined!
+#endif
+
+#if defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c
new file mode 100644
index 000..7f425357f57
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c
@@ -0,0 +1,16 @@
+// Test that char8_t related predefined macros are present when -fchar8_t is
+// enabled.
+// { dg-do compile }
+// { dg-options "-fchar8_t" }
+
+#if !defined(_CHAR8_T_SOURCE)
+# error _CHAR8_T_SOURCE is not defined!
+#endif
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c b/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c
new file mode 100644
index 000..df94582ac1d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c
@@ -0,0 +1,6 @@
+// Test tha

[PATCH 3/3]: C N2653 char8_t: Documentation updates

2021-06-06 Thread Tom Honermann via Gcc-patches
This patch updates documentation for the -fchar8_t and -fno-char8_t 
options to describe their effect on C code as proposed in WG14 N2653 [1].


Tested on Linux x86_64.

2021-05-31  Tom Honermann  

* doc/invoke.texi (-fchar8_t): Update for char8_t support for C.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm


commit d3cb3c6648cc15fe1beea6c9799e044cb722148a
Author: Tom Honermann 
Date:   Sun May 30 16:57:09 2021 -0400

N2653 char8_t for C: Documentation updates

This change updates documentation for the -fchar8_t option to describe
its effect on C code as proposed in WG14 N2653.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5cd4e2d993c..ba4c60a6179 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2884,14 +2884,27 @@ This flag is enabled by default for @option{-std=c++17}.
 @itemx -fno-char8_t
 @opindex fchar8_t
 @opindex fno-char8_t
-Enable support for @code{char8_t} as adopted for C++20.  This includes
-the addition of a new @code{char8_t} fundamental type, changes to the
-types of UTF-8 string and character literals, new signatures for
-user-defined literals, associated standard library updates, and new
-@code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature test macros.
+Enable support for @code{char8_t} for C as proposed in N2653, and for
+C++ as adopted for C++20.
+
+For C, this changes the type of UTF-8 string literals from array of
+@code{char} to array of @code{unsigned char} and defines the
+@code{_CHAR8_T_SOURCE} macro to inform the C standard library that the
+@code{char8_t} typedef name and the @code{mbrtoc8} and @code{c8rtomb}
+functions should be declared by @code{<uchar.h>}, and that the
+@code{atomic_char8_t} typedef name and the @code{ATOMIC_CHAR8_T_LOCK_FREE}
+macro should be defined by @code{<stdatomic.h>}.
+
+For C++, this enables the @code{char8_t} fundamental type, changes the
+type of UTF-8 string literals from array of @code{char} to array of
+@code{char8_t}, changes the type of UTF-8 character literals from @code{char}
+to @code{char8_t}, adds additional @code{char8_t}-based signatures for
+user-defined literals, enables associated standard library updates, and
+defines the @code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature
+test macros.
 
 This option enables functions to be overloaded for ordinary and UTF-8
-strings:
+strings in C++:
 
 @smallexample
 int f(const char *);// #1





Re: [PATCH 0/3]: C N2653 char8_t implementation

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/7/21 5:03 PM, Joseph Myers wrote:

On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:


These changes do not impact default gcc behavior.  The existing -fchar8_t
option is extended to C compilation to enable the N2653 changes, and
-fno-char8_t is extended to explicitly disable them.  N2653 has not yet been
accepted by WG14, so no changes are made to handling of the C2X language
dialect.

Why is that option needed?  Normally I'd expect features to be enabled or
disabled based on the selected language version, rather than having
separate options to adjust the configuration for one very specific feature
in a language version.  Adding extra language dialects not corresponding
to any standard version but to some peculiar mix of versions (such as C17
with a changed type for u8"", or C2X with a changed type for u8'') needs a
strong reason for those language dialects to be useful (for example, the
-fgnu89-inline option was justified by widespread use of GNU-style extern
inline in headers).


The option is needed because it impacts core language backward 
compatibility (for both C and C++, the type of u8 string literals; for 
C++, the type of u8 character literals and the new char8_t fundamental 
type).


The ability to opt-in or opt-out of the feature eases migration by 
enabling source code compatibility.  C and C++ standards are not 
published at the same cadence.  A project that targets C++20 and C17 may 
therefore have a need to either opt-out of char8_t support on the C++ 
side (already possible via -fno-char8_t), or to opt-in to char8_t 
support on the C side until such time as the targets change to C++20(+) 
and C23(+); assuming WG14 approval at some point.




I think the whole patch series would best wait until after the proposal
has been considered by a WG14 meeting, in addition to not increasing the
number of language dialects supported.


As an opt-in feature, this is useful to gain implementation and 
deployment experience for WG14.


It would be appropriate to document this as an experimental feature 
pending WG14 approval.  If WG14 declines it or approves it with 
different behavior, the feature can then be removed or changed.


The option could also be introduced as -fexperimental-char8_t if that 
eases concerns, though I do not favor that approach due to misalignment 
with the existing option for C++.


Tom.



Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/7/21 5:11 PM, Joseph Myers wrote:

On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:


When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro
is predefined.  This is the mechanism proposed to glibc to opt-in to
declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed
in N2653.  See [2].

I don't think glibc should have such a feature test macro, and I don't
think GCC should define such feature test macros either - _*_SOURCE macros
are generally for the *user* to define to decide what namespace they want
visible, not for the compiler to define.  Without proliferating new
language dialects, __STDC_VERSION__ ought to be sufficient to communicate
from the compiler to the library (including to GCC's own headers such as
stdatomic.h).

In general I agree, but I think an exception is warranted in this case 
for a few reasons:


1. The feature includes both core language changes (the change of type
   for u8 string literals) and library changes.  The library changes
   are not actually dependent on the core language change, but they are
   intended to be used together.
2. Existing use of the char8_t identifier can be found in existing open
   source projects and likely exists in some closed source projects as
   well.  An opt-in approach avoids conflict and the need to
   conditionalize code based on gcc version.
3. An opt-in approach enables evaluation of the feature prior to any
   WG14 approval.

Tom.



Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/7/21 5:12 PM, Joseph Myers wrote:

Also, it seems odd to add a new field to cpp_options without any code in
libcpp that uses the value of that field.

Ah, thank you.  That appears to be leftover code from prior 
experimentation and I failed to identify it as such when preparing the 
patch.  I'll provide a revised patch.


Tom.



Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/11/21 12:01 PM, Jakub Jelinek wrote:

On Fri, Jun 11, 2021 at 11:52:41AM -0400, Tom Honermann via Gcc-patches wrote:

On 6/7/21 5:11 PM, Joseph Myers wrote:

On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:


When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro
is predefined.  This is the mechanism proposed to glibc to opt-in to
declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed
in N2653.  See [2].

I don't think glibc should have such a feature test macro, and I don't
think GCC should define such feature test macros either - _*_SOURCE macros
are generally for the *user* to define to decide what namespace they want
visible, not for the compiler to define.  Without proliferating new
language dialects, __STDC_VERSION__ ought to be sufficient to communicate
from the compiler to the library (including to GCC's own headers such as
stdatomic.h).


In general I agree, but I think an exception is warranted in this case for a
few reasons:

1. The feature includes both core language changes (the change of type
for u8 string literals) and library changes.  The library changes
are not actually dependent on the core language change, but they are
intended to be used together.
2. Existing use of the char8_t identifier can be found in existing open
source projects and likely exists in some closed source projects as
well.  An opt-in approach avoids conflict and the need to
conditionalize code based on gcc version.
3. An opt-in approach enables evaluation of the feature prior to any
WG14 approval.

But calling it _CHAR8_T_SOURCE is weird and inconsistent with everything
else.
In C++, there is __cpp_char8_t 201811L predefined macro for char8_t.
Using that in C is not right, sure.
Often we use __SIZEOF_type__ macros not just for sizeof(), but also for
presence check of the types, like
#ifdef __SIZEOF_INT128__
__int128 i;
#else
long long i;
#endif
etc., while char8_t has sizeof (char8_t) == 1, perhaps predefining
__SIZEOF_CHAR8_T__ 1
instead of _CHAR8_T_SOURCE would be better?


I'm open to whatever signaling mechanism would be preferred.  It took me 
a while to settle on _CHAR8_T_SOURCE as the mechanism to propose as I 
didn't find much for other precedents.


I agree that having _CHAR8_T_SOURCE be implied by the -fchar8_t option 
is unusual with respect to other feature test macros.  Is that what you 
find to be weird and inconsistent?


Predefining __SIZEOF_CHAR8_T__ would be consistent with 
__SIZEOF_WCHAR_T__, but kind of strange too since the size is always 1.


Perhaps a better approach would be to follow the __CHAR16_TYPE__ and 
__CHAR32_TYPE__ precedent and define __CHAR8_TYPE__ to unsigned char.  
That is likewise a bit strange since the type would always be unsigned 
char, but it does provide a bit more symmetry.  That could potentially 
have some use as well; for C++, it could be defined as char8_t and 
thereby reflect the difference between the two languages.  Perhaps it 
could be useful in the future as well if WG14 were to add distinct 
char8_t, char16_t, and char32_t types as C++ did (I'm not offering any 
prediction regarding the likelihood of that happening).


Tom.



Jakub





Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-13 Thread Tom Honermann via Gcc-patches

On 6/11/21 12:53 PM, Jakub Jelinek wrote:

On Fri, Jun 11, 2021 at 12:20:48PM -0400, Tom Honermann wrote:

I'm open to whatever signaling mechanism would be preferred.  It took me a
while to settle on _CHAR8_T_SOURCE as the mechanism to propose as I didn't
find much for other precedents.

I agree that having _CHAR8_T_SOURCE be implied by the -fchar8_t option is
unusual with respect to other feature test macros.  Is that what you find to
be weird and inconsistent?

Predefining __SIZEOF_CHAR8_T__ would be consistent with __SIZEOF_WCHAR_T__,
but kind of strange too since the size is always 1.

Perhaps a better approach would be to follow the __CHAR16_TYPE__ and
__CHAR32_TYPE__ precedent and define __CHAR8_TYPE__ to unsigned char.  That
is likewise a bit strange since the type would always be unsigned char, but
it does provide a bit more symmetry.  That could potentially have some use
as well; for C++, it could be defined as char8_t and thereby reflect the
difference between the two languages.  Perhaps it could be useful in the
future as well if WG14 were to add distinct char8_t, char16_t, and char32_t
types as C++ did (I'm not offering any prediction regarding the likelihood
of that happening).

C++ already predefines
#define __CHAR8_TYPE__ unsigned char
#define __CHAR16_TYPE__ short unsigned int
#define __CHAR32_TYPE__ unsigned int
for -std={c,gnu}++2{0,a,3,b} or -fchar8_t (unless -fno-char8_t), so I agree
just making sure __CHAR8_TYPE__ is defined to unsigned char even for C
is best.
And you probably don't need to do anything in the C patch for it,
void
c_stddef_cpp_builtins(void)
{
  builtin_define_with_value ("__SIZE_TYPE__", SIZE_TYPE, 0);
...
  if (flag_char8_t)
    builtin_define_with_value ("__CHAR8_TYPE__", CHAR8_TYPE, 0);
  builtin_define_with_value ("__CHAR16_TYPE__", CHAR16_TYPE, 0);
  builtin_define_with_value ("__CHAR32_TYPE__", CHAR32_TYPE, 0);
will do that.


Thank you; I had forgotten that I had already done that work.  I 
confirmed that the proposed changes result in __CHAR8_TYPE__ being 
defined (the tests included with the patch already enforced it).


Tom.



Jakub





Re: [PATCH 0/3]: C N2653 char8_t implementation

2021-06-13 Thread Tom Honermann via Gcc-patches

On 6/11/21 1:27 PM, Joseph Myers wrote:

On Fri, 11 Jun 2021, Tom Honermann via Gcc-patches wrote:


The option is needed because it impacts core language backward compatibility
(for both C and C++, the type of u8 string literals; for C++, the type of u8
character literals and the new char8_t fundamental type).

Lots of new features in new standard versions can affect backward
compatibility.  We generally bundle all of those up into a single -std
option rather than having an explosion of different language variants with
different features enabled or disabled.  I don't think this feature, for
C, reaches the threshold that would justify having a separate option to
control it, especially given that people can use -Wno-pointer-sign or
pointer casts or their own local char8_t typedef as an intermediate step
if they want code using u8"" strings to work for both old and new standard
versions.
Ok, I'm happy to defer to your experience.  My perspective is likely 
biased by the C++20 changes being more disruptive for that language.


I don't think u8"" strings are widely used in C library headers in a way
where the choice of type matters.  (Use of a feature in library headers is
a key thing that can justify options such as -fgnu89-inline, because it
means the choice of language version is no longer fully under control of a
single project.)

That aligns with my expectations.


The only feature proposed for C2x that I think is likely to have
significant compatibility implications in practice for a lot of code is
making bool, true and false into keywords.  I still don't think a separate
option makes sense there.  (If that feature is accepted for C2x, what
would be useful is for people to do distribution rebuilds with -std=gnu2x
as the default to find and fix code that breaks, in advance of the default
actually changing in GCC.  But the workaround for not-yet-fixed code would
be -std=gnu11, not a separate option for that one feature.)

Ok, that comparison is helpful.



I think the whole patch series would best wait until after the proposal
has been considered by a WG14 meeting, in addition to not increasing the
number of language dialects supported.

As an opt-in feature, this is useful to gain implementation and deployment
experience for WG14.

I think this feature is one of the cases where experience in C++ is
sufficiently relevant for C (although there are certainly cases of other
language features where the languages are sufficiently different that
using C++ experience like that can be problematic).

E.g. we didn't need -fdigit-separators for C before digit separators were
added to C2x, and we don't need -fno-digit-separators now they are in C2x
(the feature is just enabled or disabled based on the language version),
although that's one of many features that do affect compatibility in
corner cases.


Got it, thanks again, that comparison is helpful.

Per this and prior messages, I'll revise the gcc patch series as follows 
(I'll likewise revise the glibc changes, but will detail that in the 
corresponding glibc mailing list thread).


1. Remove the proposed use of -fchar8_t and -fno-char8_t for C code.
2. Remove the updated documentation for the -fchar8_t option since it
   won't be applicable to C code.
3. Remove the _CHAR8_T_SOURCE macro.
4. Enable the change of u8 string literal type based on -std=[gnu|c]2x
   (by setting flag_char8_t if flag_isoc2x is set).
5. Condition the declarations of atomic_char8_t and
   __GCC_ATOMIC_CHAR8_T_LOCK_FREE on _GNU_SOURCE or _ISOC2X_SOURCE.
6. Remove the char8 data member from cpp_options that I had added and
   forgot to remove.
7. Revise the tests and rename them for consistency with other C2x tests.

If I've forgotten anything, please let me know.

Thank you for the thorough review!

Tom.



[PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library

2022-01-07 Thread Tom Honermann via Gcc-patches
This patch completes implementation of the C++20 proposal P0482R6 [1] by 
adding declarations of std::c8rtomb() and std::mbrtoc8() in <cuchar> if 
provided by the C library in <uchar.h>.


This patch addresses feedback provided in response to a previous patch 
submission [2].


Autoconf changes determine if the C library declares c8rtomb and mbrtoc8 
at global scope when uchar.h is included and compiled with either 
-fchar8_t or -std=c++20. New _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T 
and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros 
reflect the probe results. The <cuchar> header declares these functions 
in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T 
configuration macro is defined (by default it is defined if the C++20 
__cpp_char8_t feature test macro is defined).


Patches to glibc to implement c8rtomb and mbrtoc8 have been submitted [3].

New tests validate the presence of these declarations. The tests pass 
trivially if the C library does not provide these functions. Otherwise 
they ensure that the functions are declared when <cuchar> is included 
and either -fchar8_t or -std=c++20 is enabled.


Tested on Linux x86_64.

libstdc++-v3/ChangeLog:

2022-01-07  Tom Honermann  

* acinclude.m4: Define config macros if uchar.h provides
c8rtomb() and mbrtoc8().
* config.h.in: Re-generate.
* configure: Re-generate.
* include/c_compatibility/uchar.h: Declare ::c8rtomb and
::mbrtoc8.
* include/c_global/cuchar: Declare std::c8rtomb and
std::mbrtoc8.
* include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
New test.
* testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc:
New test.

Tom.

[1]: WG21 P0482R6
 "char8_t: A type for UTF-8 characters and strings (Revision 6)"
 https://wg21.link/p0482r6

[2]: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 
if provided by the C library

 https://gcc.gnu.org/pipermail/libstdc++/2021-June/052685.html

[3]: "C++20 P0482R6 and C2X N2653"
 [Patch 0/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135061.html
 [Patch 1/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135062.html
 [Patch 2/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135063.html
 [Patch 3/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135064.html


Tom.
commit 3d40bc9bf5c79343ea5a6cc355539542f4b56c9b
Author: Tom Honermann 
Date:   Sat Jan 1 17:26:31 2022 -0500

P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library.

This change completes implementation of the C++20 proposal P0482R6 by
adding declarations of std::c8rtomb() and std::mbrtoc8() if provided
by the C library.

Autoconf changes determine if the C library declares c8rtomb and mbrtoc8
at global scope when uchar.h is included and compiled with either -fchar8_t
or -std=c++20 enabled; new _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T and
_GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros are defined
accordingly. The <cuchar> header declares these functions in the std
namespace only if available and the _GLIBCXX_USE_CHAR8_T configuration
macro is defined (by default it is defined if the C++20 __cpp_char8_t
feature test macro is defined).

New tests validate the presence of these declarations. The tests pass
trivially if the C library does not provide these functions. Otherwise they
ensure that the functions are declared when <cuchar> is included and
either -fchar8_t or -std=c++20 is enabled.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 635168d7e25..85235005c7e 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -2039,6 +2039,50 @@ AC_DEFUN([GLIBCXX_CHECK_UCHAR_H], [
 	  namespace std in <cuchar>.])
   fi
 
+  CXXFLAGS="$CXXFLAGS -fchar8_t"
+  if test x"$ac_has_uchar_h" = x"yes"; then
+AC_MSG_CHECKING([for c8rtomb and mbrtoc8 in <uchar.h> with -fchar8_t])
+AC_TRY_COMPILE([#include <uchar.h>
+		namespace test
+		{
+		  using ::c8rtomb;
+		  using ::mbrtoc8;
+		}
+		   ],
+		   [], [ac_uchar_c8rtomb_mbrtoc8_fchar8_t=yes],
+		   [ac_uchar_c8rtomb_mbrtoc8_fchar8_t=no])
+  else
+ac_uchar_c8rtomb_mbrtoc8_fchar8_t=no
+  fi
+  AC_MSG_RESULT($ac_uchar_c8rtomb_mbrtoc8_fchar8_t)
+  if test x"$ac_uchar_c8rtomb_mbrtoc8_fchar8_t" = x"yes"; then
+AC_DEFINE(_GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T, 1,
+	  [Define if c8rtomb and mbrtoc8 functions in <uchar.h> should be
+	  imported into namespace std in <cuchar> for -fchar8_t.])
+  fi
+
+  CXXFLAGS="$CXXFLAGS -std=c++20"
+  if test x"$ac_has_uchar_h" = x"yes"; then
+AC_MSG_CHECKING([for c8rtomb and mbrtoc8 in <uchar.h> with -std=c++20])
+AC_TRY_

[PATCH 0/2]: C N2653 char8_t implementation

2022-01-07 Thread Tom Honermann via Gcc-patches
This series of patches implements the core language features for the 
WG14 N2653 [1] proposal to provide char8_t support in C. These changes 
are intended to align char8_t support in C with the support provided in 
C++20 via WG21 P0482R6 [2].


These patches address feedback provided in response to a previous 
submission [3][4].


These changes do not impact default gcc behavior. Per prior feedback by 
Joseph Myers, the existing -fchar8_t and -fno-char8_t options used to 
opt-in to or opt-out of char8_t support in C++ are NOT reused for C. 
Instead, the C related core language changes are enabled when targeting 
C2x. Note that N2653 has not yet been accepted by WG14 for C2x, but the 
patches enable these changes for C2x in order to avoid an additional 
language dialect flag (e.g., -fchar8_t).


Patch 1: Language support
Patch 2: New tests

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: WG21 P0482R6
 "char8_t: A type for UTF-8 characters and strings (Revision 6)"
 https://wg21.link/p0482r6

[3]: [PATCH 0/3]: C N2653 char8_t implementation
 https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572022.html

[4]: [PATCH 1/3]: C N2653 char8_t: Language support
 https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572023.html


[PATCH 1/2]: C N2653 char8_t: Language support

2022-01-07 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library 
changes proposed in WG14 N2653 [1] for C2x. The changes include:

- Change of type for UTF-8 string literals from array of char to array
  of char8_t (unsigned char) when targeting C2x.
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

Tested on Linux x86_64.

gcc/ChangeLog:

2022-01-07  Tom Honermann  

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

2022-01-07  Tom Honermann  

* c-parser.c (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.c (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

2022-01-07  Tom Honermann  

* c-lex.c (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.c (c_common_post_options): Set flag_char8_t if
targeting C2x.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm
commit c041cce5d262908349be3f1f2e361c824db15845
Author: Tom Honermann 
Date:   Sat Jan 1 18:10:41 2022 -0500

N2653 char8_t for C: Language support

This patch implements the core language and compiler dependent library
changes proposed in WG14 N2653 for C2X.  The changes include:
- Change of type for UTF-8 string literals from array of char to
  array of char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 2651331e683..0b3debbb9bd 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	{
+	  value = build_string (TYPE_PRECISION (char8_type_node)
+/ TYPE_PRECISION (char_type_node),
+"");  /* char8_t is 8 bits */
+	}
+	  else
+	value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,10 +1432,10 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-	type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
+  else if (!c_dialect_cxx ())
+	type = unsigned_char_type_node;
   else
 type = char_type_node;
 }
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 4c20e44f5b5..bd96e1319ad 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -1060,9 +1060,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2x.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index b09ad307acd..4239633e295 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7439,7 +7439,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	{
+	  value = build_string (TYPE_PRECISION (char8_type_node)
+/ TYPE_PRECISION (char_type_node),
+"");  /* char8_t is 8 bits */
+	}
+	  else
+	value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -7464,9 +7471,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 {
 default:
 case CPP_STRING:
-case CPP_UTF8STRING:
   TREE_TYPE (value) = char_array_type_node;
   break;
+case CPP_UTF8STRING:
+  if (flag_char8_t)
+	TREE_TYPE (value) = char8_array_type_node;
+  else
+	TREE_TYPE (value) = char_array_type_node;
+  break;
 case CPP_STRING16:
   TREE_TYPE (value) = char16_array_type_node;
   break;
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 78a6c68aaa6..b4eeea545a9 100644
--- a/gcc/c/c

[PATCH 2/2]: C N2653 char8_t: New tests

2022-01-07 Thread Tom Honermann via Gcc-patches
This patch provides new tests for the core language and compiler 
dependent library changes proposed in WG14 N2653 [1] for C2x.


Tested on Linux x86_64.

gcc/testsuite/ChangeLog:

2021-05-31  Tom Honermann  

* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c2x-predefined-macros.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm
commit f4eee2bf403b62714d1ccb4542b8c85dc552a411
Author: Tom Honermann 
Date:   Sun Jan 2 00:26:17 2022 -0500

N2653 char8_t for C: New tests

This change provides new tests for the core language and compiler
dependent library changes proposed in WG14 N2653 for C.

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */
+
+#include <stdatomic.h>
+#include 
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)		\
+  do		\
+{		\
+  int r1 = MACRO;\
+  int r2 = atomic_is_lock_free (&V1);	\
+  int r3 = atomic_is_lock_free (&V2);	\
+  if (r1 != 0 && r1 != 1 && r1 != 2)	\
+	abort ();\
+  if (r2 != 0 && r2 != 1)			\
+	abort ();\
+  if (r3 != 0 && r3 != 1)			\
+	abort ();\
+  if (r1 == 2 && r2 != 1)			\
+	abort ();\
+  if (r1 == 2 && r3 != 1)			\
+	abort ();\
+  if (r1 == 0 && r2 != 0)			\
+	abort ();\
+  if (r1 == 0 && r3 != 0)			\
+	abort ();\
+}		\
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
new file mode 100644
index 000..c88e51b54c5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
@@ -0,0 +1,11 @@
+/* Test C2x predefined macros.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str-type.c b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
new file mode 100644
index 000..76559c0b19b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C2x UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, unsigned char*: 2) == 2, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, unsigned char:  2) == 2, "UTF-8 string literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str.c b/gcc/testsuite/gcc.dg/c2x-utf8str.c
new file mode 100644
index 000..712482c6569
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str.c
@@ -0,0 +1,34 @@
+/* Test initialization by UTF-8 string literal in C2x.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=c2x" } */
+
+typedef __CHAR8_TYPE__	char8_t;
+typedef __CHAR16_TYPE__	char16_t;
+typedef __CHAR32_TYPE__ char32_t;
+typedef __WCHAR_TYPE__	wchar_t;
+
+/* Test that char, signed char, unsigned char, and char8_t arrays can be
+   initialized by a UTF-8 string literal.  */
+const char cbuf1[] = u8"text";
+const char cbuf2[] = { u8"text" };
+const signed char scbuf1[] = u8"text";
+const signed char scbuf2[] = { u8"text" };
+const unsigned char ucbuf1[] = u8"text";
+const unsigned char ucbuf2[] = { u8"text" };
+const char8_t

Re: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library

2022-01-10 Thread Tom Honermann via Gcc-patches

On 1/10/22 8:23 AM, Jonathan Wakely wrote:



On Sat, 8 Jan 2022 at 00:42, Tom Honermann via Libstdc++ wrote:


This patch completes implementation of the C++20 proposal P0482R6 [1] by
adding declarations of std::c8rtomb() and std::mbrtoc8() in <cuchar> if
provided by the C library in <uchar.h>.

This patch addresses feedback provided in response to a previous patch
submission [2].

Autoconf changes determine if the C library declares c8rtomb and mbrtoc8
at global scope when uchar.h is included and compiled with either
-fchar8_t or -std=c++20. New _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T
and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros
reflect the probe results. The <cuchar> header declares these functions
in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T
configuration macro is defined (by default it is defined if the C++20
__cpp_char8_t feature test macro is defined).

Patches to glibc to implement c8rtomb and mbrtoc8 have been submitted [3].

New tests validate the presence of these declarations. The tests pass
trivially if the C library does not provide these functions. Otherwise
they ensure that the functions are declared when <cuchar> is included
and either -fchar8_t or -std=c++20 is enabled.

Tested on Linux x86_64.

libstdc++-v3/ChangeLog:

2022-01-07  Tom Honermann

        * acinclude.m4: Define config macros if uchar.h provides
        c8rtomb() and mbrtoc8().
        * config.h.in: Re-generate.
        * configure: Re-generate.
        * include/c_compatibility/uchar.h: Declare ::c8rtomb and
        ::mbrtoc8.
        * include/c_global/cuchar: Declare std::c8rtomb and
        std::mbrtoc8.
        * include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
        * testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
        New test.
        * testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc:
        New test.



Thanks, Tom, this looks good and I'll get it committed for GCC 12.

Thank you!


My only concern is that the new tests depend on an internal macro:

+#if _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20
+  using std::mbrtoc8;
+  using std::c8rtomb;

I prefer if tests are written as "user code" when possible, and not 
using our internal macros. That isn't always possible, and in this 
case would require adding a new effective-target keyword to 
testsuite/lib/libstdc++.exp just for use in these two tests. I don't 
think we should bother with that.

I went with this approach solely due to my unfamiliarity with the test 
system. I knew there should be a way to conditionally make the test 
"pass" as unsupported or as an expected failure, but didn't know how to 
go about implementing that. I don't mind following up with an additional 
patch if such a change is desirable. I took a look at 
testsuite/lib/libstdc++.exp and it looks like it may be pretty 
straightforward to add effective-target support. It would probably be a 
good learning experience for me. I'll prototype and report back.


I suppose strictly speaking we should not define __cpp_lib_char8_t 
unless these two functions are present in libc. But I'm not sure we 
want to change that now either.


All of libstdc++, libc++, and MS STL have been defining 
__cpp_lib_char8_t despite the absence of these functions, so yeah, I 
don't think we want to change that.


Tom.



Re: [PATCH 0/2]: C N2653 char8_t implementation

2022-01-11 Thread Tom Honermann via Gcc-patches

On 1/10/22 9:23 PM, Joseph Myers wrote:

Please repost these patches after GCC 12 branches (updated as appropriate
depending on whether the feature is accepted at the two-week Jan/Feb WG14
meeting, which doesn't yet have an agenda), since we're currently
stabilizing for the release and so not considering new features.


Thank you, Joseph. Will do!

Tom.



Re: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library

2022-01-11 Thread Tom Honermann via Gcc-patches

On 1/10/22 4:38 PM, Jonathan Wakely wrote:

On Mon, 10 Jan 2022 at 21:24, Tom Honermann via Libstdc++ wrote:

On 1/10/22 8:23 AM, Jonathan Wakely wrote:


Thanks, Tom, this looks good and I'll get it committed for GCC 12.

Thank you!

My only concern is that the new tests depend on an internal macro:

+#if _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20
+  using std::mbrtoc8;
+  using std::c8rtomb;

I prefer if tests are written as "user code" when possible, and not
using our internal macros. That isn't always possible, and in this
case would require adding a new effective-target keyword to
testsuite/lib/libstdc++.exp just for use in these two tests. I don't
think we should bother with that.

I went with this approach solely due to my unfamiliarity with the test
system. I knew there should be a way to conditionally make the test
"pass" as unsupported or as an expected failure, but didn't know how to
go about implementing that. I don't mind following up with an additional
patch if such a change is desirable. I took a look at
testsuite/lib/libstdc++.exp and it looks like it may be pretty
straightforward to add effective-target support. It would probably be a
good learning experience for me. I'll prototype and report back.

Yes, it's very easy to do. Take a look at the
check_effective_target_blah procs in that file, especially the later
ones that use v3_check_preprocessor_condition. You can use that to
define an effective target keyword for any preprocessor condition
(such as the new macros you're adding).

Then the test can do:
// { dg-do compile { target blah } }
which will make it UNSUPPORTED if the effective target proc doesn't return true.
See https://gcc.gnu.org/onlinedocs/gccint/Selectors.html#Selectors for
the docs on target selectors.

I'm just not sure it's worth adding a new keyword for just two tests.


Thank you for the implementation direction; this was quite easy!

Patch attached (to be applied after the original one).

libstdc++-v3/ChangeLog:

2022-01-11  Tom Honermann  

        * testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
        Modify to use new c8rtomb_mbrtoc8_cxx20 effective target.
        * testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc:
        Modify to use new c8rtomb_mbrtoc8_fchar8_t effective target.
        * testsuite/lib/libstdc++.exp: Add new effective targets.

If you decide that the new keywords aren't worth adding, no worries; my 
feelings won't be hurt :)


Tom.

commit 0542361fe8cb5da146097f86ca8ea8bca86421e0
Author: Tom Honermann 
Date:   Tue Jan 11 14:57:51 2022 -0500

Add effective target support for tests of C++20 c8rtomb and mbrtoc8.

diff --git a/libstdc++-v3/testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc b/libstdc++-v3