Re: [PATCH 3/4] libstdc++: Add floating-point std::to_chars implementation

Jonathan Wakely via Gcc-patches Mon, 20 Jul 2020 05:32:02 -0700

On 19/07/20 23:37 -0400, Patrick Palka via Libstdc++ wrote:

On Fri, 17 Jul 2020, Patrick Palka wrote:

On Fri, 17 Jul 2020, Patrick Palka wrote:

> On Wed, 15 Jul 2020, Patrick Palka wrote:
>
> > On Tue, 14 Jul 2020, Patrick Palka wrote:
> >
> > > This implements the floating-point std::to_chars overloads for float,
> > > double and long double.  We use the Ryu library to compute the shortest
> > > round-trippable fixed and scientific forms of a number for float, double
> > > and long double.  We also use Ryu for performing fixed and scientific
> > > formatting of float and double. For formatting long double with an
> > > explicit precision argument we use a printf fallback.  Hexadecimal
> > > formatting for float, double and long double is implemented from
> > > scratch.
> > >
> > > The supported long double binary formats are float64 (same as double),
> > > float80 (x86 extended precision), float128 and ibm128.
> > >
> > > Much of the complexity of the implementation is in computing the exact
> > > output length before handing it off to Ryu (which doesn't do bounds
> > > checking).  In some cases it's hard to compute the output length before
> > > the fact, so in these cases we instead compute an upper bound on the
> > > output length and use a sufficiently-sized intermediate buffer (if the
> > > output range is smaller than the upper bound).
> > >
> > > Another source of complexity is in the general-with-precision formatting
> > > mode, where we need to do zero-trimming of the string returned by Ryu, and
> > > where we also take care to avoid having to format the string a second
> > > time when the general formatting mode resolves to fixed.
> > >
> > > Tested on x86_64-pc-linux-gnu, aarch64-unknown-linux-gnu,
> > > s390x-ibm-linux-gnu, and powerpc64-unknown-linux-gnu.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > >  * acinclude.m4 (libtool_VERSION): Bump to 6:29:0.
> > >  * config/abi/pre/gnu.ver: Add new exports.
> > >  * configure: Regenerate.
> > >  * include/std/charconv (to_chars): Declare the floating-point
> > >  overloads for float, double and long double.
> > >  * src/c++17/Makefile.am (sources): Add floating_to_chars.cc.
> > >  * src/c++17/Makefile.in: Regenerate.
> > >  * src/c++17/floating_to_chars.cc: New file.
> > >  * testsuite/20_util/to_chars/long_double.cc: New test.
> > >  * testsuite/util/testsuite_abi.cc: Add new symbol version.
> >
> > Here is v2 of this patch, which fixes a build failure on i386 due to
> > __int128 being unavailable, by refactoring the long double binary format
> > selection to avoid referring to __int128 when it doesn't exist.  The
> > patch also makes the hex formatting for 80-bit long double use uint64_t
> > instead of __int128 since the mantissa has exactly 64 bits in this case.
>
> Here's v3 which just makes some minor stylistic adjustments, and most
> notably replaces the use of _GLIBCXX_DEBUG with _GLIBCXX_ASSERTIONS
> since we just want to enable __glibcxx_assert and not all of debug mode.

Here's v4, which should now correctly support using <charconv> with
-mlong-double-64 on targets with a large default long double type.
This is done by defining the long double to_chars overloads as inline
wrappers around the double overloads within <charconv> whenever
__DBL_MANT_DIG__ equals __LDBL_MANT_DIG__.


-- >8 --

Subject: [PATCH 3/4] libstdc++: Add floating-point std::to_chars
 implementation

This implements the floating-point std::to_chars overloads for float,
double and long double.  We use the Ryu library to compute the shortest
round-trippable fixed and scientific forms of a number for float, double
and long double.  We also use Ryu for performing explicit-precision
fixed and scientific formatting of float and double. For
explicit-precision formatting of long double we fall back to using
printf.  Hexadecimal formatting for float, double and long double is
implemented from scratch.

The supported long double binary formats are binary64, binary80 (x86
80-bit extended precision), binary128 and ibm128.

Much of the complexity of the implementation is in computing the exact
output length before handing it off to Ryu (which doesn't do bounds
checking).  In some cases it's hard to compute the output length
beforehand, so in these cases we instead compute an upper bound on the
output length and use a sufficiently-sized intermediate buffer if
necessary.
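
A minimal sketch of the upper-bound-plus-intermediate-buffer idea described above. It is not the code in floating_to_chars.cc: the names format_result, unchecked_format and checked_format are made up for illustration, and a trivial integer formatter stands in for Ryu as the routine that does no bounds checking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <system_error>

// Hypothetical result type, modelled loosely on std::to_chars_result.
struct format_result { char* ptr; std::errc ec; };

// Stand-in for an unchecked formatter: writes the decimal digits of v to
// out and returns the length.  The caller must guarantee enough room.
int unchecked_format(char* out, std::uint64_t v)
{
  char tmp[20];
  int n = 0;
  do { tmp[n++] = char('0' + v % 10); v /= 10; } while (v != 0);
  for (int i = 0; i < n; ++i)
    out[i] = tmp[n - 1 - i];
  return n;
}

format_result checked_format(char* first, char* last, std::uint64_t v)
{
  assert(first <= last);
  // Upper bound on the output length: a 64-bit value has at most 20 digits.
  constexpr int upper_bound = 20;
  if (last - first >= upper_bound)
    // The output range is known to be large enough, so write in place.
    return {first + unchecked_format(first, v), std::errc{}};

  // Otherwise format into an intermediate buffer and copy only if it fits.
  char buf[upper_bound];
  const int len = unchecked_format(buf, v);
  if (last - first < len)
    return {last, std::errc::value_too_large};
  std::memcpy(first, buf, std::size_t(len));
  return {first + len, std::errc{}};
}
```

The intermediate buffer is only used when the output range is smaller than the upper bound, so the common case of a generously sized range avoids the extra copy.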

Another source of complexity is in the general-with-precision formatting
mode, where we need to do zero-trimming of the string returned by Ryu,
and where we also take care to avoid having to format the string a
second time when the general formatting mode resolves to fixed.

This implementation is non-conforming in a few ways:

1. For the shortest hexadecimal formatting, we currently follow the
   Microsoft implementation's approach of being consistent with the
   output of printf's '%a' specifier at the expense of sometimes not
   printing the shortest representation.  For example, the shortest hex
   form of 1.08p+0 is 2.1p-1, but we output the former instead of the
   latter, as does printf.

2. The Ryu routines for doing shortest formatting on types larger than
   binary64 use the __int128 type, and some targets (e.g. i386) have a
   large long double type but lack __int128.  For such targets we make
   the long double to_chars overloads go through the double overloads,
   which means we lose precision in the output.  (The mantissa of long
   double is 64 bits on i386, so I think we could potentially fix this
   by writing a specialized version of the generic Ryu formatting
   routine which works with uint64_t instead of __int128.)

3. The __ibm128 shortest formatting routines don't guarantee
   round-trippability if the difference between the high- and low-order
   exponent is too large.  This is because we treat the type as if it
   has a contiguous 105-bit mantissa by merging the high- and low-order
   mantissas, so we potentially lose precision from the low-order part.
   Although this precision-dropping behavior is non-conforming, it seems
   consistent with how printf formats __ibm128.
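
To illustrate point 1, here is a small sketch that only uses printf, not the new to_chars overloads.  The helper name format_hex_a is made up for this example.  It assumes a C library such as glibc that normalizes the leading hex digit of a double to 1; note that '%a' also prints a "0x" prefix.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Both hexadecimal literals denote the same value, 1.03125, so the two
// spellings differ only in length.
static_assert(0x1.08p+0 == 0x2.1p-1, "same value, different hex forms");
static_assert(0x1.08p+0 == 1.03125, "1 + 8/256");

// Hypothetical helper: format value with printf's %a conversion.
std::string format_hex_a(double value)
{
  char buf[64];
  const int n = std::snprintf(buf, sizeof buf, "%a", value);
  assert(n > 0 && n < int(sizeof buf));
  return std::string(buf, std::string::size_type(n));
}
```

With glibc, format_hex_a(1.03125) yields "0x1.08p+0", i.e. the longer of the two forms.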

libstdc++-v3/ChangeLog:

        * acinclude.m4 (libtool_VERSION): Bump to 6:29:0.
        * config/abi/pre/gnu.ver: Add new exports.
        * configure: Regenerate.
        * include/std/charconv (to_chars): Declare the floating-point
        overloads for float, double and long double.
        * src/c++17/Makefile.am (sources): Add floating_to_chars.cc.
        * src/c++17/Makefile.in: Regenerate.
        * src/c++17/floating_to_chars.cc: New file.
        * testsuite/20_util/to_chars/long_double.cc: New test.
        * testsuite/util/testsuite_abi.cc: Add new symbol version.
---
 libstdc++-v3/acinclude.m4                     |    2 +-
 libstdc++-v3/config/abi/pre/gnu.ver           |   12 +
 libstdc++-v3/configure                        |    2 +-
 libstdc++-v3/include/std/charconv             |   43 +
 libstdc++-v3/src/c++17/Makefile.am            |    1 +
 libstdc++-v3/src/c++17/Makefile.in            |    5 +-
 libstdc++-v3/src/c++17/floating_to_chars.cc   | 1514 +++++++++++++++++
 .../testsuite/20_util/to_chars/long_double.cc |  197 +++
 libstdc++-v3/testsuite/util/testsuite_abi.cc  |    3 +-
 9 files changed, 1774 insertions(+), 5 deletions(-)
 create mode 100644 libstdc++-v3/src/c++17/floating_to_chars.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/to_chars/long_double.cc

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index ee5e0336f2c..e3926e1c9c2 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3846,7 +3846,7 @@ changequote([,])dnl
 fi

 # For libtool versioning info, format is CURRENT:REVISION:AGE
-libtool_VERSION=6:28:0
+libtool_VERSION=6:29:0

 # Everything parsed; figure out what files and settings to use.
 case $enable_symvers in
diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index edf4485e607..9a1bcfd25d1 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -2299,6 +2299,18 @@ GLIBCXX_3.4.28 {

 } GLIBCXX_3.4.27;

+GLIBCXX_3.4.29 {
+    # to_chars(char*, char*, [float|double|long double])
+    _ZSt8to_charsPcS_[fdeg];
+
+    # to_chars(char*, char*, [float|double|long double], chars_format)
+    _ZSt8to_charsPcS_[fdeg]St12chars_format;
+
+    # to_chars(char*, char*, [float|double|long double], chars_format, int)
+    _ZSt8to_charsPcS_[fdeg]St12chars_formati;
+
+} GLIBCXX_3.4.28;
+
 # Symbols in the support library (libsupc++) have their own tag.
 CXXABI_1.3 {

diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index dd54bd406a9..73f771e7335 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -75231,7 +75231,7 @@ $as_echo "$as_me: WARNING: === Symbol versioning will be disabled." >&2;}
 fi

 # For libtool versioning info, format is CURRENT:REVISION:AGE
-libtool_VERSION=6:28:0
+libtool_VERSION=6:29:0

 # Everything parsed; figure out what files and settings to use.
 case $enable_symvers in
diff --git a/libstdc++-v3/include/std/charconv b/libstdc++-v3/include/std/charconv
index cc7dd0e3758..bd59924f7e7 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -688,6 +688,49 @@ namespace __detail
   operator^=(chars_format& __lhs, chars_format __rhs) noexcept
   { return __lhs = __lhs ^ __rhs; }

+  // Floating-point std::to_chars
+
+  // Overloads for float.
+  to_chars_result to_chars(char* __first, char* __last, float __value) noexcept;
+  to_chars_result to_chars(char* __first, char* __last, float __value,
+                          chars_format __fmt) noexcept;
+  to_chars_result to_chars(char* __first, char* __last, float __value,
+                          chars_format __fmt, int __precision) noexcept;
+
+  // Overloads for double.
+  to_chars_result to_chars(char* __first, char* __last, double __value) noexcept;
+  to_chars_result to_chars(char* __first, char* __last, double __value,
+                          chars_format __fmt) noexcept;
+  to_chars_result to_chars(char* __first, char* __last, double __value,
+                          chars_format __fmt, int __precision) noexcept;
+
+  // Overloads for long double.
+  to_chars_result to_chars(char* __first, char* __last, long double __value)
+    noexcept;
+  to_chars_result to_chars(char* __first, char* __last, long double __value,
+                          chars_format __fmt) noexcept;
+  to_chars_result to_chars(char* __first, char* __last, long double __value,
+                          chars_format __fmt, int __precision) noexcept;
+
+  // If long double has the same binary format as double, then we just define
+  // the long double overloads as wrappers around the corresponding double
+  // overloads.
+#if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
+  inline to_chars_result
+  to_chars(char* __first, char* __last, long double __value) noexcept
+  { return to_chars(__first, __last, double(__value)); }
+
+  inline to_chars_result
+  to_chars(char* __first, char* __last, long double __value,
+          chars_format __fmt) noexcept
+  { return to_chars(__first, __last, double(__value), __fmt); }
+
+  inline to_chars_result
+  to_chars(char* __first, char* __last, long double __value,
+          chars_format __fmt, int __precision) noexcept
+  { return to_chars(__first, __last, double(__value), __fmt, __precision); }
+#endif


Hmm, I think this approach for supporting -mlong-double-64 might
introduce an ODR violation because each long double to_chars overload
could potentially have two different definitions available in a program,
one out-of-line in floating_to_chars.cc (compiled without
-mlong-double-64) and another inline in <charconv> (compiled with
-mlong-double-64).


But they have different mangled names, so there's no ODR violation.
The 64-bit long double is mangled as 'e' and the 128-bit long double
is mangled as 'g', like __float128. You *will* get an ODR violation on
targets where there's no -mlong-double-64 switch, where double and long
double are always the same representation.

What I'm doing for std::from_chars is adding this in the new
src/c++17/floating_from_chars.cc file:

#ifdef _GLIBCXX_LONG_DOUBLE_COMPAT
#pragma GCC diagnostic ignored "-Wattribute-alias"
extern "C" from_chars_result _ZSt10from_charsPKcS0_ReSt12chars_format(double)
__attribute__((alias ("_ZSt10from_charsPKcS0_RdSt12chars_format")));
#endif

This just defines the _ZSt10from_charsPKcS0_ReSt12chars_format symbol
(i.e. from_chars for 64-bit long double) as an alias of
_ZSt10from_charsPKcS0_RdSt12chars_format (i.e. from_chars for 64-bit
double).
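
A minimal standalone sketch of the same alias technique, using made-up function names (parse_as_double, parse_as_long_double, check_alias) instead of the real mangled symbols.  It assumes GCC or Clang on a target that supports symbol aliases, such as ELF.  Here the two declarations have matching types, so no -Wattribute-alias suppression is needed.

```cpp
#include <cassert>

// Hypothetical stand-in for the symbol that has a real definition.
extern "C" int parse_as_double(int x) { return x + 1; }

// A second symbol name that resolves to the same code.  The alias target
// must be defined in this translation unit.
extern "C" int parse_as_long_double(int x)
  __attribute__((alias("parse_as_double")));

// Calling through either name runs the same function body.
void check_alias()
{ assert(parse_as_long_double(41) == parse_as_double(41)); }
```

Only one function body is emitted; the alias just adds a second entry in the symbol table pointing at it.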
