https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77573
David Malcolm <dmalcolm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmalcolm at gcc dot gnu.org

--- Comment #1 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
http://en.cppreference.com/w/cpp/language/escape says:
"Hexadecimal escape sequences have no length limit and terminate at the first
character that is not a valid hexadecimal digit."

These are 4-byte wchars, so the value fits.

emit_numeric_escape is called twice, once with 0x12345678, then with 0 for the
implicit terminator.

(gdb) p tbuf
$45 = {text = 0x23e77f0 "xV4\022", asize = 256, len = 8}

(gdb) p tbuf->text[0]
$37 = 120 'x'
(gdb) p tbuf->text[1]
$38 = 86 'V'
(gdb) p tbuf->text[2]
$39 = 52 '4'
(gdb) p tbuf->text[3]
$40 = 18 '\022'

Note that "xV4\022" is 0x12345678:

(gdb) p /x tbuf->text[0]
$46 = 0x78
(gdb) p /x tbuf->text[1]
$47 = 0x56
(gdb) p /x tbuf->text[2]
$48 = 0x34
(gdb) p /x tbuf->text[3]
$49 = 0x12

...and then the terminator:

(gdb) p tbuf->text[4]
$41 = 0 '\000'
(gdb) p tbuf->text[5]
$42 = 0 '\000'
(gdb) p tbuf->text[6]
$43 = 0 '\000'
(gdb) p tbuf->text[7]
$44 = 0 '\000'

So I think that the sequence that's printed is valid.
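
To make the byte layout above concrete, here is a minimal standalone sketch. It is not GCC code: to_le_bytes and render_narrow are hypothetical helpers, and little-endian byte order is an assumption chosen to match the session above. It lays out 4-byte code units as bytes and prints them the way a one-byte-per-char dump would, with printable ASCII as-is and everything else as a three-digit octal escape.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of GCC): lay out 32-bit code units as bytes
// in little-endian order. The byte order is an assumption that matches the
// gdb session above.
std::vector<unsigned char> to_le_bytes(const std::vector<std::uint32_t>& units) {
    std::vector<unsigned char> out;
    for (std::uint32_t u : units)
        for (int shift = 0; shift < 32; shift += 8)
            out.push_back(static_cast<unsigned char>((u >> shift) & 0xFFu));
    return out;
}

// Hypothetical helper (not part of GCC): render bytes as a narrow string
// dump. Printable ASCII is copied through; any other byte, as well as a
// backslash or double quote, becomes a three-digit octal escape.
std::string render_narrow(const std::vector<unsigned char>& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        const bool printable = b >= 0x20 && b < 0x7F && b != '\\' && b != '"';
        if (printable) {
            out.push_back(static_cast<char>(b));
        } else {
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(b));
            out += buf;
        }
    }
    return out;
}
```

Calling render_narrow(to_le_bytes({0x12345678u, 0u})) produces the text xV4\022\000\000\000\000, the same eight bytes as in tbuf.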

If I'm reading the following right, internally it's stored as a conversion of a
one-byte-per-char array string to a wchar_t:

(gdb) call debug_tree(t)
 <convert_expr 0x7ffff1a2b5c0
    type <integer_type 0x7ffff18d5690 wchar_t type_6 SI
        size <integer_cst 0x7ffff18cd0d8 constant 32>
        unit size <integer_cst 0x7ffff18cd0f0 constant 4>
        align 32 symtab 0 alias set -1 canonical type 0x7ffff18d5690 precision 32 min <integer_cst 0x7ffff18cd468 -2147483648> max <integer_cst 0x7ffff18cd480 2147483647>>
    readonly constant
    arg 0 <nop_expr 0x7ffff1a2b5a0
        type <pointer_type 0x7ffff1a17f18 type <integer_type 0x7ffff1a17bd0 wchar_t>
            unsigned DI
            size <integer_cst 0x7ffff18abe88 constant 64>
            unit size <integer_cst 0x7ffff18abea0 constant 8>
            align 64 symtab 0 alias set -1 canonical type 0x7ffff1a17f18>
        readonly constant
        arg 0 <addr_expr 0x7ffff1a2b580 type <pointer_type 0x7ffff1a17a80>
            readonly constant
            arg 0 <string_cst 0x7ffff1a2b560 type <array_type 0x7ffff1a17e70>
                readonly constant static "xV4\022\000\000\000\000">>>>

(gdb) call debug_tree((tree)0x7ffff1a2b560)
 <string_cst 0x7ffff1a2b560
    type <array_type 0x7ffff1a17e70
        type <integer_type 0x7ffff1a17bd0 wchar_t readonly type_6 SI
            size <integer_cst 0x7ffff18cd0d8 constant 32>
            unit size <integer_cst 0x7ffff18cd0f0 constant 4>
            align 32 symtab 0 alias set -1 canonical type 0x7ffff1a17bd0 precision 32 min <integer_cst 0x7ffff18cd468 -2147483648> max <integer_cst 0x7ffff18cd480 2147483647>
            pointer_to_this <pointer_type 0x7ffff1a17f18>>
        DI
        size <integer_cst 0x7ffff18abe88 constant 64>
        unit size <integer_cst 0x7ffff18abea0 constant 8>
        align 32 symtab 0 alias set -1 canonical type 0x7ffff1a17e70
        domain <integer_type 0x7ffff1a17c78 type <integer_type 0x7ffff18ca000 sizetype>
            type_6 DI size <integer_cst 0x7ffff18abe88 64> unit size <integer_cst 0x7ffff18abea0 8>
            align 64 symtab 0 alias set -1 canonical type 0x7ffff1a17c78 precision 64 min <integer_cst 0x7ffff18abeb8 0> max <integer_cst 0x7ffff18abf90 1>>
        pointer_to_this <pointer_type 0x7ffff1a17a80>>
    readonly constant static "xV4\022\000\000\000\000">

The title of this bug is "bogus wide string literals in diagnostics", but the
diagnostic contains a regular string literal, not a wide string literal.

Perhaps we should be printing it as something like:

  L"\x12345678\x00"

or somesuch, for such cases.

FWIW, compare with this:

z.C:1:23: error: invalid conversion from ‘const wchar_t*’ to ‘wchar_t’ [-fpermissive]
 constexpr wchar_t s = L"pqrstuvw";
                       ^~~~~~~~~~~
z.C:1:23: error: ‘(wchar_t)((const wchar_t*)"p\000\000\000q\000\000\000r\000\000\000s\000\000\000t\000\000\000u\000\000\000v\000\000\000w\000\000\000\000\000\000")’ is not a constant expression
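
For illustration only, a rough sketch of what printing such a constant per code unit, as a wide literal, could look like. render_wide is a hypothetical name, not an existing GCC function, and the escaping policy is an assumption: printable ASCII is copied through, anything else becomes a \x escape, and a hex digit that directly follows a \x escape is escaped too, since hexadecimal escapes have no length limit and would otherwise absorb it.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sketch (not GCC code): format 32-bit code units, including
// the terminator if the caller passes it, as an L"..." literal.
std::string render_wide(const std::vector<std::uint32_t>& units) {
    auto is_hex_digit = [](std::uint32_t u) {
        return (u >= 0x30u && u <= 0x39u)     // '0'..'9'
            || (u >= 0x41u && u <= 0x46u)     // 'A'..'F'
            || (u >= 0x61u && u <= 0x66u);    // 'a'..'f'
    };

    std::string out = "L\"";
    bool prev_was_hex_escape = false;
    for (std::uint32_t u : units) {
        // Printable ASCII other than backslash (0x5C) and double quote (0x22).
        const bool printable = u >= 0x20u && u < 0x7Fu && u != 0x5Cu && u != 0x22u;
        if (printable && !(prev_was_hex_escape && is_hex_digit(u))) {
            out.push_back(static_cast<char>(u));
            prev_was_hex_escape = false;
        } else {
            // A \x escape has no length limit, so one escape covers the unit.
            char buf[16];
            std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(u));
            out += buf;
            prev_was_hex_escape = true;
        }
    }
    out += '"';
    return out;
}
```

With the units {0x12345678, 0} this yields L"\x12345678\x00", and the code units of L"pqrstuvw" followed by the terminator yield L"pqrstuvw\x00".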