https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109162

--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tomasz Kaminski <tkami...@gcc.gnu.org>:

https://gcc.gnu.org/g:3b33d792cf1e4d2ea3d36d3ad403cbb452243cd8

commit r15-9377-g3b33d792cf1e4d2ea3d36d3ad403cbb452243cd8
Author: Tomasz Kamiński <tkami...@redhat.com>
Date:   Wed Apr 2 14:19:26 2025 +0200

    libstdc++: Implement debug format for strings and characters formatters [PR109162]

    This patch implements the part of P2286R8 that specifies the debug
    (escaped) format for strings and characters. This includes both
    handling of the '?' format specifier and the set_debug_format member.

    To indicate partial support we define the __glibcxx_format_ranges macro
    with value 1, without defining __cpp_lib_format_ranges.
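
A minimal sketch of how code might probe for this partial support. The names HAVE_DEBUG_FORMAT and show are hypothetical and not part of the patch. Note that __glibcxx_format_ranges is an internal libstdc++ macro, so relying on it in portable code is an assumption made here purely for illustration.

```cpp
#include <string>
#include <string_view>
#if __has_include(<format>)
# include <format>
#endif

// Assumption: either the internal libstdc++ macro or the standard feature-test
// macro being defined means "{:?}" is accepted for strings.
#if defined(__glibcxx_format_ranges) || defined(__cpp_lib_format_ranges)
# define HAVE_DEBUG_FORMAT 1   // hypothetical helper macro
#else
# define HAVE_DEBUG_FORMAT 0
#endif

// Returns the debug (escaped) representation when available,
// otherwise the string unchanged.
std::string show(std::string_view s)
{
#if HAVE_DEBUG_FORMAT
  return std::format("{:?}", s);
#else
  return std::string(s);
#endif
}
```

With support present, show("a\tb") yields the quoted and escaped form; without it, the sketch falls back to returning the input unchanged.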

    We provide two separate escaping routines depending on the literal
    encoding for the corresponding character types. If the character
    encoding is Unicode, we follow the specification from the standard
    (__format::__write_escaped_unicode).
    For other encodings, we escape only characters in the range [0x00, 0x80),
    interpreting them as ASCII values: [0x00, 0x20), 0x7f and '\t', '\r',
    '\n', '\\', '"', '\'' are escaped. We assume every character outside
    this range is printable (__format::__write_escaped_ascii).
    In particular, we do not yet implement special handling of shift
    sequences.
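
The non-Unicode rules above can be sketched roughly as follows. This is an illustrative stand-alone function, not the libstdc++ routine: escape_ascii_sketch is a hypothetical name, it handles only the string case (so the single quote is left alone, following the P2286 convention that '\'' is escaped in characters and '"' in strings), and it assumes the \u{hex} form with the shortest lower-case hex representation for other control characters.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: escape a narrow string as if its bytes below 0x80
// were ASCII; bytes at or above 0x80 are assumed printable and copied through.
std::string escape_ascii_sketch(std::string_view in)
{
  static const char hex[] = "0123456789abcdef";
  std::string out = "\"";
  for (unsigned char c : in)
    {
      switch (c)
        {
        case '\t': out += "\\t"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\\': out += "\\\\"; break;
        case '"':  out += "\\\""; break;
        default:
          if (c < 0x20 || c == 0x7f)
            {
              // Remaining control characters become \u{...} with no leading zeros.
              out += "\\u{";
              if (c >= 0x10)
                out += hex[c >> 4];
              out += hex[c & 0xf];
              out += '}';
            }
          else
            out += static_cast<char>(c);
        }
    }
  out += '"';
  return out;
}
```

For instance, a tab inside the input comes out as the two characters backslash and t, while a byte such as 0xe9 is copied unchanged because it lies outside [0x00, 0x80).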

    For Unicode escaping a new __unicode::__escape_edges table is introduced,
    which encodes whether a character belongs to a General_Category that is
    escaped by the standard (Control or Other). This table is generated from
    DerivedGeneralCategory.txt provided by Unicode. Only a boolean flag is
    preserved to reduce the number of entries. The additional rules for
    escaping are handled by __format::__should_escape_unicode.
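
To illustrate the idea of an edge table carrying a single boolean, here is a rough sketch. Everything in it is an assumption made for illustration: the entry layout (first code point shifted left by one, flag in the low bit), the names toy_edges and toy_should_escape, and the handful of Latin-1 values, which are not the generated data and ignore everything past U+00AE.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Toy table, sorted ascending. Each entry starts a run of code points that
// share the same flag: (first_code_point << 1) | should_escape.
const std::uint32_t toy_edges[] = {
  (0x00u << 1) | 1u,  // C0 controls
  (0x20u << 1) | 0u,  // space and printable ASCII
  (0x7Fu << 1) | 1u,  // DEL, C1 controls, no-break space
  (0xA1u << 1) | 0u,  // printable Latin-1
  (0xADu << 1) | 1u,  // soft hyphen
  (0xAEu << 1) | 0u,  // everything after (toy simplification)
};

// Look up the run containing cp and return its flag.
inline bool toy_should_escape(char32_t cp)
{
  // Any entry starting at or before cp compares <= key.
  const std::uint32_t key = (static_cast<std::uint32_t>(cp) << 1) | 1u;
  auto it = std::upper_bound(std::begin(toy_edges), std::end(toy_edges), key);
  if (it == std::begin(toy_edges))
    return false;
  return (*(it - 1) & 1u) != 0;
}
```

Storing only the points where the flag flips keeps the table short, and a binary search finds the run that covers a given code point.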

    When width or precision is specified, we emit the escaped string to a
    temporary buffer and format the resulting string according to the format
    spec. For characters we use a fixed-size stack buffer, for which a new
    _Fixedbuf_sink is introduced. For strings, we use _Str_sink and, to avoid
    allocations, we compute the estimated size of the (possibly truncated)
    input, and if it is larger than the width field we print directly.

            PR libstdc++/109162

    contrib/ChangeLog:

            * unicode/README: Mention DerivedGeneralCategory.txt.
            * unicode/gen_libstdcxx_unicode_data.py: Generate __escape_edges
            table from DerivedGeneralCategory.txt. Update file name in
            comments.
            * unicode/DerivedGeneralCategory.txt: Copy of file distributed by
            Unicode Consortium.

    libstdc++-v3/ChangeLog:

            * include/bits/chrono_io.h (__detail::_Widen): Moved to std/format
            file.
            * include/bits/unicode-data.h: Regenerate.
            * include/bits/unicode.h (__unicode::_Utf_iterator::_M_units)
            (__unicode::__should_escape_category): Define.
            * include/std/format (_GLIBCXX_WIDEN_, _GLIBCXX_WIDEN): Copied from
            include/bits/chrono_io.h.
            (__format::_Widen): Moved from include/bits/chrono_io.h.
            (__format::_Term_char, __format::_Escapes, __format::_Separators)
            (__format::__should_escape_ascii)
            (__format::__should_escape_unicode)
            (__format::__write_escape_seq, __format::__write_escaped_char)
            (__format::__write_escaped_ascii, __format::__write_escaped_unicode)
            (__format::__write_escaped): Define.
            (__formatter_str::_S_trunc): Extracted truncation of character
            sequences.
            (__formatter_str::format): Handle _Pres_esc.
            (__formatter_int::_M_do_parse) [__glibcxx_format_ranges]: Parse
            '?'.
            (__formatter_int::_M_format_character_escaped): Define.
            (formatter<_CharT, _CharT>::format)
            (formatter<char, wchar_t>::format): Handle _Pres_esc.
            (__formatter_str::set_debug_format)
            (formatter<...>::set_debug_format): Guard with
            __glibcxx_format_ranges.
            (__format::_Fixedbuf_sink): Define.
            * testsuite/23_containers/vector/bool/format.cc: Use
            __format::_Widen and remove unnecessary <chrono> include.
            * testsuite/std/format/debug.cc: New test.
            * testsuite/std/format/debug_nonunicode.cc: New test.
            * testsuite/std/format/parse_ctx.cc (escaped_strings_supported):
            Define to true if __glibcxx_format_ranges is defined.
            * testsuite/std/format/string.cc (escaped_strings_supported):
            Define to true if __glibcxx_format_ranges is defined.

    Reviewed-by: Jonathan Wakely <jwak...@redhat.com>
    Signed-off-by: Tomasz Kamiński <tkami...@redhat.com>
