Re: [PATCH 3/4] libstdc++: Add floating-point std::to_chars implementation

Jonathan Wakely via Gcc-patches Mon, 20 Jul 2020 07:14:03 -0700

On 20/07/20 08:53 -0400, Patrick Palka via Libstdc++ wrote:

On Mon, 20 Jul 2020, Jonathan Wakely wrote:

On 19/07/20 23:37 -0400, Patrick Palka via Libstdc++ wrote:
> On Fri, 17 Jul 2020, Patrick Palka wrote:
>
> > On Fri, 17 Jul 2020, Patrick Palka wrote:
> >
> > > On Wed, 15 Jul 2020, Patrick Palka wrote:
> > >
> > > > On Tue, 14 Jul 2020, Patrick Palka wrote:
> > > >
> > > > > This implements the floating-point std::to_chars overloads for
> > float,
> > > > > double and long double.  We use the Ryu library to compute the
> > shortest
> > > > > round-trippable fixed and scientific forms of a number for float,
> > double
> > > > > and long double.  We also use Ryu for performing fixed and
> > scientific
> > > > > formatting of float and double. For formatting long double with an
> > > > > explicit precision argument we use a printf fallback.  Hexadecimal
> > > > > formatting for float, double and long double is implemented from
> > > > > scratch.
> > > > >
> > > > > The supported long double binary formats are float64 (same as
> > double),
> > > > > float80 (x86 extended precision), float128 and ibm128.
> > > > >
> > > > > Much of the complexity of the implementation is in computing the
> > exact
> > > > > output length before handing it off to Ryu (which doesn't do bounds
> > > > > checking).  In some cases it's hard to compute the output length
> > before
> > > > > the fact, so in these cases we instead compute an upper bound on the
> > > > > output length and use a sufficiently-sized intermediate buffer (if
> > the
> > > > > output range is smaller than the upper bound).
> > > > >
> > > > > Another source of complexity is in the general-with-precision
> > formatting
> > > > > mode, where we need to do zero-trimming of the string returned by
> > Ryu, and
> > > > > where we also take care to avoid having to format the string a
> > second
> > > > > time when the general formatting mode resolves to fixed.
> > > > >
> > > > > Tested on x86_64-pc-linux-gnu, aarch64-unknown-linux-gnu,
> > > > > s390x-ibm-linux-gnu, and powerpc64-unknown-linux-gnu.
> > > > >
> > > > > libstdc++-v3/ChangeLog:
> > > > >
> > > > >        * acinclude.m4 (libtool_VERSION): Bump to 6:29:0.
> > > > >        * config/abi/pre/gnu.ver: Add new exports.
> > > > >        * configure: Regenerate.
> > > > >        * include/std/charconv (to_chars): Declare the floating-point
> > > > >        overloads for float, double and long double.
> > > > >        * src/c++17/Makefile.am (sources): Add floating_to_chars.cc.
> > > > >        * src/c++17/Makefile.in: Regenerate.
> > > > >        * src/c++17/floating_to_chars.cc: New file.
> > > > >        * testsuite/20_util/to_chars/long_double.cc: New test.
> > > > >        * testsuite/util/testsuite_abi.cc: Add new symbol version.
> > > >
> > > > Here is v2 of this patch, which fixes a build failure on i386 due to
> > > > __int128 being unavailable, by refactoring the long double binary
> > format
> > > > selection to avoid referring to __int128 when it doesn't exist.  The
> > > > patch also makes the hex formatting for 80-bit long double use
> > uint64_t
> > > > instead of __int128 since the mantissa has exactly 64 bits in this
> > case.
> > >
> > > Here's v3 which just makes some minor stylistic adjustments, and most
> > > notably replaces the use of _GLIBCXX_DEBUG with _GLIBCXX_ASSERTIONS
> > > since we just want to enable __glibcxx_assert and not all of debug mode.
> >
> > Here's v4, which should now correctly support using <charconv> with
> > -mlong-double-64 on targets with a large default long double type.
> > This is done by defining the long double to_chars overloads as inline
> > wrappers around the double overloads within <charconv> whenever
> > __DBL_MANT_DIG__ equals __LDBL_MANT_DIG__.
>
> >
> > -- >8 --
> >
> > Subject: [PATCH 3/4] libstdc++: Add floating-point std::to_chars
> >  implementation
> >
> > This implements the floating-point std::to_chars overloads for float,
> > double and long double.  We use the Ryu library to compute the shortest
> > round-trippable fixed and scientific forms of a number for float, double
> > and long double.  We also use Ryu for performing explicit-precision
> > fixed and scientific formatting of float and double. For
> > explicit-precision formatting of long double we fall back to using
> > printf.  Hexadecimal formatting for float, double and long double is
> > implemented from scratch.
> >
> > The supported long double binary formats are binary64, binary80 (x86
> > 80-bit extended precision), binary128 and ibm128.
> >
> > Much of the complexity of the implementation is in computing the exact
> > output length before handing it off to Ryu (which doesn't do bounds
> > checking).  In some cases it's hard to compute the output length
> > beforehand, so in these cases we instead compute an upper bound on the
> > output length and use a sufficiently-sized intermediate buffer if
> > necessary.
> >
> > Another source of complexity is in the general-with-precision formatting
> > mode, where we need to do zero-trimming of the string returned by Ryu,
> > and where we also take care to avoid having to format the string a
> > second time when the general formatting mode resolves to fixed.
> >
> > This implementation is non-conforming in a couple of ways:
> >
> > 1. For the shortest hexadecimal formatting, we currently follow the
> >    Microsoft implementation's approach of being consistent with the
> >    output of printf's '%a' specifier at the expense of sometimes not
> >    printing the shortest representation.  For example, the shortest hex
> >    form of 1.08p+0 is 2.1p-1, but we output the former instead of the
> >    latter, as does printf.
> >
> > 2. The Ryu routines for doing shortest formatting on types larger than
> >    binary64 use the __int128 type, and some targets (e.g. i386) have a
> >    large long double type but lack __int128.  For such targets we make
> >    the long double to_chars overloads go through the double overloads,
> >    which means we lose precision in the output.  (The mantissa of long
> >    double is 64 bits on i386, so I think we could potentially fix this
> >    by writing a specialized version of the generic Ryu formatting
> >    routine which works with uint64_t instead of __int128.)
> >
> > 3. The __ibm128 shortest formatting routines don't guarantee
> >    round-trippability if the difference between the high- and low-order
> >    exponent is too large.  This is because we treat the type as if it
> >    has a contiguous 105-bit mantissa by merging the high- and low-order
> >    mantissas, so we potentially lose precision from the low-order part.
> >    Although this precision-dropping behavior is non-conforming, it seems
> >    consistent with how printf formats __ibm128.
> >
> > libstdc++-v3/ChangeLog:
> >
> >       * acinclude.m4 (libtool_VERSION): Bump to 6:29:0.
> >       * config/abi/pre/gnu.ver: Add new exports.
> >       * configure: Regenerate.
> >       * include/std/charconv (to_chars): Declare the floating-point
> >       overloads for float, double and long double.
> >       * src/c++17/Makefile.am (sources): Add floating_to_chars.cc.
> >       * src/c++17/Makefile.in: Regenerate.
> >       * src/c++17/floating_to_chars.cc: New file.
> >       * testsuite/20_util/to_chars/long_double.cc: New test.
> >       * testsuite/util/testsuite_abi.cc: Add new symbol version.
> > ---
> >  libstdc++-v3/acinclude.m4                     |    2 +-
> >  libstdc++-v3/config/abi/pre/gnu.ver           |   12 +
> >  libstdc++-v3/configure                        |    2 +-
> >  libstdc++-v3/include/std/charconv             |   43 +
> >  libstdc++-v3/src/c++17/Makefile.am            |    1 +
> >  libstdc++-v3/src/c++17/Makefile.in            |    5 +-
> >  libstdc++-v3/src/c++17/floating_to_chars.cc   | 1514 +++++++++++++++++
> >  .../testsuite/20_util/to_chars/long_double.cc |  197 +++
> >  libstdc++-v3/testsuite/util/testsuite_abi.cc  |    3 +-
> >  9 files changed, 1774 insertions(+), 5 deletions(-)
> >  create mode 100644 libstdc++-v3/src/c++17/floating_to_chars.cc
> >  create mode 100644 libstdc++-v3/testsuite/20_util/to_chars/long_double.cc
> >
> > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> > index ee5e0336f2c..e3926e1c9c2 100644
> > --- a/libstdc++-v3/acinclude.m4
> > +++ b/libstdc++-v3/acinclude.m4
> > @@ -3846,7 +3846,7 @@ changequote([,])dnl
> >  fi
> >
> >  # For libtool versioning info, format is CURRENT:REVISION:AGE
> > -libtool_VERSION=6:28:0
> > +libtool_VERSION=6:29:0
> >
> >  # Everything parsed; figure out what files and settings to use.
> >  case $enable_symvers in
> > diff --git a/libstdc++-v3/config/abi/pre/gnu.ver
> > b/libstdc++-v3/config/abi/pre/gnu.ver
> > index edf4485e607..9a1bcfd25d1 100644
> > --- a/libstdc++-v3/config/abi/pre/gnu.ver
> > +++ b/libstdc++-v3/config/abi/pre/gnu.ver
> > @@ -2299,6 +2299,18 @@ GLIBCXX_3.4.28 {
> >
> >  } GLIBCXX_3.4.27;
> >
> > +GLIBCXX_3.4.29 {
> > +    # to_chars(char*, char*, [float|double|long double])
> > +    _ZSt8to_charsPcS_[fdeg];
> > +
> > +    # to_chars(char*, char*, [float|double|long double], chars_format)
> > +    _ZSt8to_charsPcS_[fdeg]St12chars_format;
> > +
> > +    # to_chars(char*, char*, [float|double|long double], chars_format,
> > int)
> > +    _ZSt8to_charsPcS_[fdeg]St12chars_formati;
> > +
> > +} GLIBCXX_3.4.28;
> > +
> >  # Symbols in the support library (libsupc++) have their own tag.
> >  CXXABI_1.3 {
> >
> > diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> > index dd54bd406a9..73f771e7335 100755
> > --- a/libstdc++-v3/configure
> > +++ b/libstdc++-v3/configure
> > @@ -75231,7 +75231,7 @@ $as_echo "$as_me: WARNING: === Symbol versioning
> > will be disabled." >&2;}
> >  fi
> >
> >  # For libtool versioning info, format is CURRENT:REVISION:AGE
> > -libtool_VERSION=6:28:0
> > +libtool_VERSION=6:29:0
> >
> >  # Everything parsed; figure out what files and settings to use.
> >  case $enable_symvers in
> > diff --git a/libstdc++-v3/include/std/charconv
> > b/libstdc++-v3/include/std/charconv
> > index cc7dd0e3758..bd59924f7e7 100644
> > --- a/libstdc++-v3/include/std/charconv
> > +++ b/libstdc++-v3/include/std/charconv
> > @@ -688,6 +688,49 @@ namespace __detail
> >    operator^=(chars_format& __lhs, chars_format __rhs) noexcept
> >    { return __lhs = __lhs ^ __rhs; }
> >
> > +  // Floating-point std::to_chars
> > +
> > +  // Overloads for float.
> > +  to_chars_result to_chars(char* __first, char* __last, float __value)
> > noexcept;
> > +  to_chars_result to_chars(char* __first, char* __last, float __value,
> > +                        chars_format __fmt) noexcept;
> > +  to_chars_result to_chars(char* __first, char* __last, float __value,
> > +                        chars_format __fmt, int __precision) noexcept;
> > +
> > +  // Overloads for double.
> > +  to_chars_result to_chars(char* __first, char* __last, double __value)
> > noexcept;
> > +  to_chars_result to_chars(char* __first, char* __last, double __value,
> > +                        chars_format __fmt) noexcept;
> > +  to_chars_result to_chars(char* __first, char* __last, double __value,
> > +                        chars_format __fmt, int __precision) noexcept;
> > +
> > +  // Overloads for long double.
> > +  to_chars_result to_chars(char* __first, char* __last, long double
> > __value)
> > +    noexcept;
> > +  to_chars_result to_chars(char* __first, char* __last, long double
> > __value,
> > +                        chars_format __fmt) noexcept;
> > +  to_chars_result to_chars(char* __first, char* __last, long double
> > __value,
> > +                        chars_format __fmt, int __precision) noexcept;
> > +
> > +  // If long double has the same binary format as double, then we just
> > define
> > +  // the long double overloads as wrappers around the corresponding
> > double
> > +  // overloads.
> > +#if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
> > +  inline to_chars_result
> > +  to_chars(char* __first, char* __last, long double __value) noexcept
> > +  { return to_chars(__first, __last, double(__value)); }
> > +
> > +  inline to_chars_result
> > +  to_chars(char* __first, char* __last, long double __value,
> > +        chars_format __fmt) noexcept
> > +  { return to_chars(__first, __last, double(__value), __fmt); }
> > +
> > +  inline to_chars_result
> > +  to_chars(char* __first, char* __last, long double __value,
> > +        chars_format __fmt, int __precision) noexcept
> > +  { return to_chars(__first, __last, double(__value), __fmt,
> > __precision); }
> > +#endif
>
> Hmm, I think this approach for supporting -mlong-double-64 might
> introduce an ODR violation because each long double to_chars overload
> could potentially have two different definitions available in a program,
> one out-of-line in floating_to_chars.cc (compiled without
> -mlong-double-64) and another inline in <charconv> (compiled with
> -mlong-double-64)..

But they have different mangled names, so there's no ODR violation.
The 64-bit long double is mangled as 'e' and the 128-bit long double
is mangled as __float128. You *will* get an ODR violation on targets
where there's no -mlong-double-64 switch, where double and long double
are always the same representation.

What I'm doing for std::from_chars is adding this in the new
src/c++17/floating_from_chars.cc file:

#ifdef _GLIBCXX_LONG_DOUBLE_COMPAT
#pragma GCC diagnostic ignored "-Wattribute-alias"
extern "C" from_chars_result _ZSt10from_charsPKcS0_ReSt12chars_format(double)
__attribute__((alias ("_ZSt10from_charsPKcS0_RdSt12chars_format")));
#endif

This just defines the _ZSt10from_charsPKcS0_ReSt12chars_format symbol
(i.e. from_chars for 64-bit long double) as an alias of
_ZSt10from_charsPKcS0_RdSt12chars_format (i.e. from_chars for 64-bit
double).


Aha, that makes sense.  I'll follow suit for std::to_chars.


Actually that should be:

#ifdef _GLIBCXX_LONG_DOUBLE_COMPAT
extern "C" from_chars_result
_ZSt10from_charsPKcS0_ReSt12chars_format(const char* first, const char* last,
                                         long double& value,
                                         chars_format fmt) noexcept
__attribute__((alias ("_ZSt10from_charsPKcS0_RdSt12chars_format")));
#endif

With the right parameter list I don't need to disable the warning.

Re: [PATCH 3/4] libstdc++: Add floating-point std::to_chars implementation

Reply via email to