Re: [PATCH] libstdc++: Update from latest fast_float [PR107468]

Jonathan Wakely via Gcc-patches Mon, 07 Nov 2022 05:38:36 -0800

On Mon, 7 Nov 2022 at 08:19, Jakub Jelinek <[email protected]> wrote:
>
> Hi!
>
> The following patch updates from fast_float trunk.  That way
> it grabs two of the 4 LOCAL_PATCHES, some smaller tweaks, to_extended
> cleanups and, most importantly, the fix for the incorrect rounding case,
> PR107468 aka https://github.com/fastfloat/fast_float/issues/149
> Using std::fegetround proved too slow in benchmarks, so instead of
> doing that the patch limits the fast path, which uses floating
> point multiplication rather than integral arithmetic, to cases where we
> can prove there will be no rounding (the multiplication itself will be
> exact, not merely that the two operands of the multiplication or
> division are exactly representable).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK, thanks.
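
For readers of the archive, the exactness condition described in the patch summary can be sketched as below. This is only an illustration under stated assumptions: the helper names (`max_exact_mantissa`, `float_fast_path_is_exact`, `float_fast_path_old`) are hypothetical and do not exist in fast_float, and the old check is simplified (it ignores `too_many_digits` and the `FLT_EVAL_METHOD` case).

```cpp
#include <cstdint>

// Hypothetical helper (not part of fast_float): the largest integer v such
// that v * 5^power <= 2^bits, computed by repeated floor division.
inline std::uint64_t max_exact_mantissa(int bits, int power) {
  std::uint64_t limit = std::uint64_t(1) << bits;
  for (int i = 0; i < power; ++i)
    limit /= 5;
  return limit;
}

// Hypothetical predicate mirroring the new condition for float.
// mantissa * 10^power == (mantissa * 5^power) * 2^power, so the product is
// exact whenever mantissa * 5^power fits in float's 24-bit significand;
// the factor 2^power only changes the binary exponent.
inline bool float_fast_path_is_exact(std::uint64_t mantissa,
                                     std::int64_t power) {
  return power >= 0 && power <= 10
         && mantissa <= max_exact_mantissa(24, static_cast<int>(power));
}

// Simplified form of the old condition: the mantissa bound did not depend
// on the power, so the multiplication could still round.
inline bool float_fast_path_old(std::uint64_t mantissa, std::int64_t power) {
  return power >= -10 && power <= 10
         && mantissa <= (std::uint64_t(2) << 23);
}
```

The input from the new test, "3.355447e+07", parses to mantissa 3355447 and decimal exponent 1. The old check accepts it, while the stricter bound for power 1 is 0x1000000 / 5 = 3355443, so the new check rejects it and the slower, rounding-mode-independent path is used instead. As a side observation, the literal 0x10000000000000 used in the double table equals 2^52, so that table appears to be somewhat more conservative than its comment states.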


>
> 2022-11-07  Jakub Jelinek  <[email protected]>
>
>         PR libstdc++/107468
>         * src/c++17/fast_float/MERGE: Adjust for merge from upstream.
>         * src/c++17/fast_float/LOCAL_PATCHES: Remove commits that were
>         upstreamed.
>         * src/c++17/fast_float/README.md: Merge from fast_float
>         662497742fea7055f0e0ee27e5a7ddc382c2c38e commit.
>         * src/c++17/fast_float/fast_float.h: Likewise.
>         * testsuite/20_util/from_chars/pr107468.cc: New test.
>
> --- libstdc++-v3/src/c++17/fast_float/MERGE.jj  2022-01-18 11:59:00.306971713 +0100
> +++ libstdc++-v3/src/c++17/fast_float/MERGE     2022-11-05 18:42:50.815892080 +0100
> @@ -1,4 +1,4 @@
> -d35368cae610b4edeec61cd41e4d2367a4d33f58
> +662497742fea7055f0e0ee27e5a7ddc382c2c38e
>
>  The first line of this file holds the git revision number of the
>  last merge done from the master library sources.
> --- libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES.jj  2022-02-04 14:36:56.965577924 +0100
> +++ libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES     2022-11-05 19:02:57.360336939 +0100
> @@ -1,4 +1,2 @@
>  r12-6647
>  r12-6648
> -r12-6664
> -r12-6665
> --- libstdc++-v3/src/c++17/fast_float/README.md.jj      2022-01-18 11:59:00.306971713 +0100
> +++ libstdc++-v3/src/c++17/fast_float/README.md 2022-11-05 18:32:34.668345927 +0100
> @@ -1,12 +1,5 @@
>  ## fast_float number parsing library: 4x faster than strtod
>
> -![Ubuntu 20.04 CI (GCC 
> 9)](https://github.com/lemire/fast_float/workflows/Ubuntu%2020.04%20CI%20(GCC%209)/badge.svg)
> -![Ubuntu 18.04 CI (GCC 
> 7)](https://github.com/lemire/fast_float/workflows/Ubuntu%2018.04%20CI%20(GCC%207)/badge.svg)
> -![Alpine 
> Linux](https://github.com/lemire/fast_float/workflows/Alpine%20Linux/badge.svg)
> -![MSYS2-CI](https://github.com/lemire/fast_float/workflows/MSYS2-CI/badge.svg)
> -![VS16-CLANG-CI](https://github.com/lemire/fast_float/workflows/VS16-CLANG-CI/badge.svg)
> -[![VS16-CI](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml/badge.svg)](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml)
> -
>  The fast_float library provides fast header-only implementations for the C++ 
> from_chars
>  functions for `float` and `double` types.  These functions convert ASCII 
> strings representing
>  decimal values (e.g., `1.3e10`) into binary types. We provide exact rounding 
> (including
> @@ -28,8 +21,8 @@ struct from_chars_result {
>  ```
>
>  It parses the character sequence [first,last) for a number. It parses 
> floating-point numbers expecting
> -a locale-independent format equivalent to the C++17 from_chars function.
> -The resulting floating-point value is the closest floating-point values 
> (using either float or double),
> +a locale-independent format equivalent to the C++17 from_chars function.
> +The resulting floating-point value is the closest floating-point values 
> (using either float or double),
>  using the "round to even" convention for values that would otherwise fall 
> right in-between two values.
>  That is, we provide exact parsing according to the IEEE standard.
>
> @@ -47,7 +40,7 @@ Example:
>  ``` C++
>  #include "fast_float/fast_float.h"
>  #include <iostream>
> -
> +
>  int main() {
>      const std::string input =  "3.1416 xyz ";
>      double result;
> @@ -60,15 +53,15 @@ int main() {
>
>
>  Like the C++17 standard, the `fast_float::from_chars` functions take an 
> optional last argument of
> -the type `fast_float::chars_format`. It is a bitset value: we check whether
> +the type `fast_float::chars_format`. It is a bitset value: we check whether
>  `fmt & fast_float::chars_format::fixed` and `fmt & 
> fast_float::chars_format::scientific` are set
>  to determine whether we allow the fixed point and scientific notation 
> respectively.
>  The default is  `fast_float::chars_format::general` which allows both 
> `fixed` and `scientific`.
>
> -The library seeks to follow the C++17 (see 
> [20.19.3](http://eel.is/c++draft/charconv.from.chars).(7.1))  specification.
> +The library seeks to follow the C++17 (see 
> [20.19.3](http://eel.is/c++draft/charconv.from.chars).(7.1))  specification.
>  * The `from_chars` function does not skip leading white-space characters.
>  * [A leading `+` sign](https://en.cppreference.com/w/cpp/utility/from_chars) 
> is forbidden.
> -* It is generally impossible to represent a decimal value exactly as binary 
> floating-point number (`float` and `double` types). We seek the nearest 
> value. We round to an even mantissa when we are in-between two binary 
> floating-point numbers.
> +* It is generally impossible to represent a decimal value exactly as binary 
> floating-point number (`float` and `double` types). We seek the nearest 
> value. We round to an even mantissa when we are in-between two binary 
> floating-point numbers.
>
>  Furthermore, we have the following restrictions:
>  * We only support `float` and `double` types at this time.
> @@ -77,22 +70,22 @@ Furthermore, we have the following restr
>
>  We support Visual Studio, macOS, Linux, freeBSD. We support big and little 
> endian. We support 32-bit and 64-bit systems.
>
> -
> +We assume that the rounding mode is set to nearest (`std::fegetround() == FE_TONEAREST`).
>
>  ## Using commas as decimal separator
>
>
>  The C++ standard stipulate that `from_chars` has to be locale-independent. In
> -particular, the decimal separator has to be the period (`.`). However,
> -some users still want to use the `fast_float` library with in a 
> locale-dependent
> +particular, the decimal separator has to be the period (`.`). However,
> +some users still want to use the `fast_float` library with in a 
> locale-dependent
>  manner. Using a separate function called `from_chars_advanced`, we allow the 
> users
> -to pass a `parse_options` instance which contains a custom decimal separator 
> (e.g.,
> +to pass a `parse_options` instance which contains a custom decimal separator 
> (e.g.,
>  the comma). You may use it as follows.
>
>  ```C++
>  #include "fast_float/fast_float.h"
>  #include <iostream>
> -
> +
>  int main() {
>      const std::string input =  "3,1416 xyz ";
>      double result;
> @@ -104,25 +97,55 @@ int main() {
>  }
>  ```
>
> +You can parse delimited numbers:
> +```C++
> +  const std::string input =   "234532.3426362,7869234.9823,324562.645";
> +  double result;
> +  auto answer = fast_float::from_chars(input.data(), input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 234532.3426362.
> +  if(answer.ptr[0] != ',') {
> +    // unexpected delimiter
> +  }
> +  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 7869234.9823.
> +  if(answer.ptr[0] != ',') {
> +    // unexpected delimiter
> +  }
> +  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 324562.645.
> +```
>
>  ## Reference
>
> -- Daniel Lemire, [Number Parsing at a Gigabyte per 
> Second](https://arxiv.org/abs/2101.11408), Software: Pratice and Experience 
> 51 (8), 2021.
> +- Daniel Lemire, [Number Parsing at a Gigabyte per 
> Second](https://arxiv.org/abs/2101.11408), Software: Practice and Experience 
> 51 (8), 2021.
>
>  ## Other programming languages
>
>  - [There is an R binding](https://github.com/eddelbuettel/rcppfastfloat) 
> called `rcppfastfloat`.
>  - [There is a Rust port of the fast_float 
> library](https://github.com/aldanor/fast-float-rust/) called 
> `fast-float-rust`.
> -- [There is a Java port of the fast_float 
> library](https://github.com/wrandelshofer/FastDoubleParser) called 
> `FastDoubleParser`.
> +- [There is a Java port of the fast_float 
> library](https://github.com/wrandelshofer/FastDoubleParser) called 
> `FastDoubleParser`. It used for important systems such as 
> [Jackson](https://github.com/FasterXML/jackson-core).
>  - [There is a C# port of the fast_float 
> library](https://github.com/CarlVerret/csFastFloat) called `csFastFloat`.
>
>
>  ## Relation With Other Work
>
> -The fastfloat algorithm is part of the [LLVM standard 
> libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).
> +The fast_float library is part of GCC (as of version 12): the `from_chars` 
> function in GCC relies on fast_float.
> +
> +The fastfloat algorithm is part of the [LLVM standard 
> libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).
>
>  The fast_float library provides a performance similar to that of the 
> [fast_double_parser](https://github.com/lemire/fast_double_parser) library 
> but using an updated algorithm reworked from the ground up, and while 
> offering an API more in line with the expectations of C++ programmers. The 
> fast_double_parser library is part of the [Microsoft LightGBM 
> machine-learning framework](https://github.com/microsoft/LightGBM).
>
> +There is a [derived implementation part of 
> AdaCore](https://github.com/AdaCore/VSS).
> +
>  ## Users
>
>  The fast_float library is used by [Apache 
> Arrow](https://github.com/apache/arrow/pull/8494) where it multiplied the 
> number parsing speed by two or three times. It is also used by [Yandex 
> ClickHouse](https://github.com/ClickHouse/ClickHouse) and by [Google 
> Jsonnet](https://github.com/google/jsonnet).
> @@ -135,14 +158,14 @@ It can parse random floating-point numbe
>  <img src="http://lemire.me/blog/wp-content/uploads/2020/11/fastfloat_speed.png" width="400">
>
>  ```
> -$ ./build/benchmarks/benchmark
> +$ ./build/benchmarks/benchmark
>  # parsing random integers in the range [0,1)
> -volume = 2.09808 MB
> -netlib                                  :   271.18 MB/s (+/- 1.2 %)    12.93 
> Mfloat/s
> -doubleconversion                        :   225.35 MB/s (+/- 1.2 %)    10.74 
> Mfloat/s
> -strtod                                  :   190.94 MB/s (+/- 1.6 %)     9.10 
> Mfloat/s
> -abseil                                  :   430.45 MB/s (+/- 2.2 %)    20.52 
> Mfloat/s
> -fastfloat                               :  1042.38 MB/s (+/- 9.9 %)    49.68 
> Mfloat/s
> +volume = 2.09808 MB
> +netlib                                  :   271.18 MB/s (+/- 1.2 %)    12.93 
> Mfloat/s
> +doubleconversion                        :   225.35 MB/s (+/- 1.2 %)    10.74 
> Mfloat/s
> +strtod                                  :   190.94 MB/s (+/- 1.6 %)     9.10 
> Mfloat/s
> +abseil                                  :   430.45 MB/s (+/- 2.2 %)    20.52 
> Mfloat/s
> +fastfloat                               :  1042.38 MB/s (+/- 9.9 %)    49.68 
> Mfloat/s
>  ```
>
>  See https://github.com/lemire/simple_fastfloat_benchmark for our 
> benchmarking code.
> @@ -183,23 +206,23 @@ You should change the `GIT_TAG` line so
>
>  ## Using as single header
>
> -The script `script/amalgamate.py` may be used to generate a single header
> +The script `script/amalgamate.py` may be used to generate a single header
>  version of the library if so desired.
> -Just run the script from the root directory of this repository.
> +Just run the script from the root directory of this repository.
>  You can customize the license type and output file if desired as described in
>  the command line help.
>
>  You may directly download automatically generated single-header files:
>
> -https://github.com/fastfloat/fast_float/releases/download/v1.1.2/fast_float.h
> +https://github.com/fastfloat/fast_float/releases/download/v3.4.0/fast_float.h
>
>  ## Credit
>
> -Though this work is inspired by many different people, this work benefited 
> especially from exchanges with
> -Michael Eisel, who motivated the original research with his key insights, 
> and with Nigel Tao who provided
> +Though this work is inspired by many different people, this work benefited 
> especially from exchanges with
> +Michael Eisel, who motivated the original research with his key insights, 
> and with Nigel Tao who provided
>  invaluable feedback. Rémy Oudompheng first implemented a fast path we use in 
> the case of long digits.
>
> -The library includes code adapted from Google Wuffs (written by Nigel Tao) 
> which was originally published
> +The library includes code adapted from Google Wuffs (written by Nigel Tao) 
> which was originally published
>  under the Apache 2.0 license.
>
>  ## License
> --- libstdc++-v3/src/c++17/fast_float/fast_float.h.jj   2022-02-04 14:36:56.966577910 +0100
> +++ libstdc++-v3/src/c++17/fast_float/fast_float.h      2022-11-05 18:54:48.096049177 +0100
> @@ -74,7 +74,7 @@ struct parse_options {
>   * Like the C++17 standard, the `fast_float::from_chars` functions take an 
> optional last argument of
>   * the type `fast_float::chars_format`. It is a bitset value: we check 
> whether
>   * `fmt & fast_float::chars_format::fixed` and `fmt & 
> fast_float::chars_format::scientific` are set
> - * to determine whether we allowe the fixed point and scientific notation 
> respectively.
> + * to determine whether we allow the fixed point and scientific notation 
> respectively.
>   * The default is  `fast_float::chars_format::general` which allows both 
> `fixed` and `scientific`.
>   */
>  template<typename T>
> @@ -98,12 +98,11 @@ from_chars_result from_chars_advanced(co
>         || defined(__amd64) || defined(__aarch64__) || defined(_M_ARM64) \
>         || defined(__MINGW64__)                                          \
>         || defined(__s390x__)                                            \
> -       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) \
> -       || defined(__EMSCRIPTEN__))
> +       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) )
>  #define FASTFLOAT_64BIT
>  #elif (defined(__i386) || defined(__i386__) || defined(_M_IX86)   \
>       || defined(__arm__) || defined(_M_ARM)                   \
> -     || defined(__MINGW32__))
> +     || defined(__MINGW32__) || defined(__EMSCRIPTEN__))
>  #define FASTFLOAT_32BIT
>  #else
>    // Need to check incrementally, since SIZE_MAX is a size_t, avoid overflow.
> @@ -128,7 +127,7 @@ from_chars_result from_chars_advanced(co
>  #define FASTFLOAT_VISUAL_STUDIO 1
>  #endif
>
> -#ifdef __BYTE_ORDER__
> +#if defined __BYTE_ORDER__ && defined __ORDER_BIG_ENDIAN__
>  #define FASTFLOAT_IS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
>  #elif defined _WIN32
>  #define FASTFLOAT_IS_BIG_ENDIAN 0
> @@ -271,8 +270,9 @@ fastfloat_really_inline uint64_t _umul12
>  fastfloat_really_inline value128 full_multiplication(uint64_t a,
>                                                       uint64_t b) {
>    value128 answer;
> -#ifdef _M_ARM64
> +#if defined(_M_ARM64) && !defined(__MINGW32__)
>    // ARM64 has native support for 64-bit multiplications, no need to emulate
> +  // But MinGW on ARM64 doesn't have native support for 64-bit multiplications
>    answer.high = __umulh(a, b);
>    answer.low = a * b;
>  #elif defined(FASTFLOAT_32BIT) || (defined(_WIN64) && !defined(__clang__))
> @@ -307,21 +307,69 @@ constexpr static double powers_of_ten_do
>      1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
>  constexpr static float powers_of_ten_float[] = {1e0, 1e1, 1e2, 1e3, 1e4, 1e5,
>                                                  1e6, 1e7, 1e8, 1e9, 1e10};
> +// used for max_mantissa_double and max_mantissa_float
> +constexpr uint64_t constant_55555 = 5 * 5 * 5 * 5 * 5;
> +// Largest integer value v so that (5**index * v) <= 1<<53.
> +// 0x10000000000000 == 1 << 53
> +constexpr static uint64_t max_mantissa_double[] = {
> +      0x10000000000000,
> +      0x10000000000000 / 5,
> +      0x10000000000000 / (5 * 5),
> +      0x10000000000000 / (5 * 5 * 5),
> +      0x10000000000000 / (5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555),
> +      0x10000000000000 / (constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5)};
> +  // Largest integer value v so that (5**index * v) <= 1<<24.
> +  // 0x1000000 == 1<<24
> +  constexpr static uint64_t max_mantissa_float[] = {
> +      0x1000000,
> +      0x1000000 / 5,
> +      0x1000000 / (5 * 5),
> +      0x1000000 / (5 * 5 * 5),
> +      0x1000000 / (5 * 5 * 5 * 5),
> +      0x1000000 / (constant_55555),
> +      0x1000000 / (constant_55555 * 5),
> +      0x1000000 / (constant_55555 * 5 * 5),
> +      0x1000000 / (constant_55555 * 5 * 5 * 5),
> +      0x1000000 / (constant_55555 * 5 * 5 * 5 * 5),
> +      0x1000000 / (constant_55555 * constant_55555),
> +      0x1000000 / (constant_55555 * constant_55555 * 5)};
>
>  template <typename T> struct binary_format {
> +  using equiv_uint = typename std::conditional<sizeof(T) == 4, uint32_t, uint64_t>::type;
> +
>    static inline constexpr int mantissa_explicit_bits();
>    static inline constexpr int minimum_exponent();
>    static inline constexpr int infinite_power();
>    static inline constexpr int sign_index();
> -  static inline constexpr int min_exponent_fast_path();
>    static inline constexpr int max_exponent_fast_path();
>    static inline constexpr int max_exponent_round_to_even();
>    static inline constexpr int min_exponent_round_to_even();
> -  static inline constexpr uint64_t max_mantissa_fast_path();
> +  static inline constexpr uint64_t max_mantissa_fast_path(int64_t power);
>    static inline constexpr int largest_power_of_ten();
>    static inline constexpr int smallest_power_of_ten();
>    static inline constexpr T exact_power_of_ten(int64_t power);
>    static inline constexpr size_t max_digits();
> +  static inline constexpr equiv_uint exponent_mask();
> +  static inline constexpr equiv_uint mantissa_mask();
> +  static inline constexpr equiv_uint hidden_bit_mask();
>  };
>
>  template <> inline constexpr int 
> binary_format<double>::mantissa_explicit_bits() {
> @@ -364,21 +412,6 @@ template <> inline constexpr int binary_
>  template <> inline constexpr int binary_format<double>::sign_index() { 
> return 63; }
>  template <> inline constexpr int binary_format<float>::sign_index() { return 
> 31; }
>
> -template <> inline constexpr int 
> binary_format<double>::min_exponent_fast_path() {
> -#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
> -  return 0;
> -#else
> -  return -22;
> -#endif
> -}
> -template <> inline constexpr int 
> binary_format<float>::min_exponent_fast_path() {
> -#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
> -  return 0;
> -#else
> -  return -10;
> -#endif
> -}
> -
>  template <> inline constexpr int 
> binary_format<double>::max_exponent_fast_path() {
>    return 22;
>  }
> @@ -386,11 +419,17 @@ template <> inline constexpr int binary_
>    return 10;
>  }
>
> -template <> inline constexpr uint64_t binary_format<double>::max_mantissa_fast_path() {
> -  return uint64_t(2) << mantissa_explicit_bits();
> +template <> inline constexpr uint64_t binary_format<double>::max_mantissa_fast_path(int64_t power) {
> +  // caller is responsible to ensure that
> +  // power >= 0 && power <= 22
> +  //
> +  return max_mantissa_double[power];
>  }
> -template <> inline constexpr uint64_t binary_format<float>::max_mantissa_fast_path() {
> -  return uint64_t(2) << mantissa_explicit_bits();
> +template <> inline constexpr uint64_t binary_format<float>::max_mantissa_fast_path(int64_t power) {
> +  // caller is responsible to ensure that
> +  // power >= 0 && power <= 10
> +  //
> +  return max_mantissa_float[power];
>  }
>
>  template <>
> @@ -429,6 +468,33 @@ template <> inline constexpr size_t bina
>    return 114;
>  }
>
> +template <> inline constexpr binary_format<float>::equiv_uint
> +    binary_format<float>::exponent_mask() {
> +  return 0x7F800000;
> +}
> +template <> inline constexpr binary_format<double>::equiv_uint
> +    binary_format<double>::exponent_mask() {
> +  return 0x7FF0000000000000;
> +}
> +
> +template <> inline constexpr binary_format<float>::equiv_uint
> +    binary_format<float>::mantissa_mask() {
> +  return 0x007FFFFF;
> +}
> +template <> inline constexpr binary_format<double>::equiv_uint
> +    binary_format<double>::mantissa_mask() {
> +  return 0x000FFFFFFFFFFFFF;
> +}
> +
> +template <> inline constexpr binary_format<float>::equiv_uint
> +    binary_format<float>::hidden_bit_mask() {
> +  return 0x00800000;
> +}
> +template <> inline constexpr binary_format<double>::equiv_uint
> +    binary_format<double>::hidden_bit_mask() {
> +  return 0x0010000000000000;
> +}
> +
>  template<typename T>
>  fastfloat_really_inline void to_float(bool negative, adjusted_mantissa am, T 
> &value) {
>    uint64_t word = am.mantissa;
> @@ -2410,40 +2476,24 @@ fastfloat_really_inline int32_t scientif
>  // this converts a native floating-point number to an extended-precision 
> float.
>  template <typename T>
>  fastfloat_really_inline adjusted_mantissa to_extended(T value) noexcept {
> +  using equiv_uint = typename binary_format<T>::equiv_uint;
> +  constexpr equiv_uint exponent_mask = binary_format<T>::exponent_mask();
> +  constexpr equiv_uint mantissa_mask = binary_format<T>::mantissa_mask();
> +  constexpr equiv_uint hidden_bit_mask = binary_format<T>::hidden_bit_mask();
> +
>    adjusted_mantissa am;
>    int32_t bias = binary_format<T>::mantissa_explicit_bits() - 
> binary_format<T>::minimum_exponent();
> -  if (std::is_same<T, float>::value) {
> -    constexpr uint32_t exponent_mask = 0x7F800000;
> -    constexpr uint32_t mantissa_mask = 0x007FFFFF;
> -    constexpr uint64_t hidden_bit_mask = 0x00800000;
> -    uint32_t bits;
> -    ::memcpy(&bits, &value, sizeof(T));
> -    if ((bits & exponent_mask) == 0) {
> -      // denormal
> -      am.power2 = 1 - bias;
> -      am.mantissa = bits & mantissa_mask;
> -    } else {
> -      // normal
> -      am.power2 = int32_t((bits & exponent_mask) >> 
> binary_format<T>::mantissa_explicit_bits());
> -      am.power2 -= bias;
> -      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
> -    }
> +  equiv_uint bits;
> +  ::memcpy(&bits, &value, sizeof(T));
> +  if ((bits & exponent_mask) == 0) {
> +    // denormal
> +    am.power2 = 1 - bias;
> +    am.mantissa = bits & mantissa_mask;
>    } else {
> -    constexpr uint64_t exponent_mask = 0x7FF0000000000000;
> -    constexpr uint64_t mantissa_mask = 0x000FFFFFFFFFFFFF;
> -    constexpr uint64_t hidden_bit_mask = 0x0010000000000000;
> -    uint64_t bits;
> -    ::memcpy(&bits, &value, sizeof(T));
> -    if ((bits & exponent_mask) == 0) {
> -      // denormal
> -      am.power2 = 1 - bias;
> -      am.mantissa = bits & mantissa_mask;
> -    } else {
> -      // normal
> -      am.power2 = int32_t((bits & exponent_mask) >> 
> binary_format<T>::mantissa_explicit_bits());
> -      am.power2 -= bias;
> -      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
> -    }
> +    // normal
> +    am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
> +    am.power2 -= bias;
> +    am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
>    }
>
>    return am;
> @@ -2869,11 +2919,10 @@ from_chars_result from_chars_advanced(co
>    }
>    answer.ec = std::errc(); // be optimistic
>    answer.ptr = pns.lastmatch;
> -  // Next is Clinger's fast path.
> -  if (binary_format<T>::min_exponent_fast_path() <= pns.exponent && pns.exponent <= binary_format<T>::max_exponent_fast_path() && pns.mantissa <=binary_format<T>::max_mantissa_fast_path() && !pns.too_many_digits) {
> +  // Next is a modified Clinger's fast path, inspired by Jakub Jelínek's proposal
> +  if (pns.exponent >= 0 && pns.exponent <= binary_format<T>::max_exponent_fast_path() && pns.mantissa <=binary_format<T>::max_mantissa_fast_path(pns.exponent) && !pns.too_many_digits) {
>      value = T(pns.mantissa);
> -    if (pns.exponent < 0) { value = value / binary_format<T>::exact_power_of_ten(-pns.exponent); }
> -    else { value = value * binary_format<T>::exact_power_of_ten(pns.exponent); }
> +    value = value * binary_format<T>::exact_power_of_ten(pns.exponent);
>      if (pns.negative) { value = -value; }
>      return answer;
>    }
> --- libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc.jj    2022-11-05 19:20:28.944898668 +0100
> +++ libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc       2022-11-05 19:26:44.318740322 +0100
> @@ -0,0 +1,42 @@
> +// Copyright (C) 2022 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// You should have received a copy of the GNU General Public License along
> +// with this library; see the file COPYING3.  If not see
> +// <http://www.gnu.org/licenses/>.
> +
> +// { dg-do run { target c++17 } }
> +// { dg-add-options ieee }
> +
> +#include <charconv>
> +#include <string>
> +#include <cfenv>
> +#include <testsuite_hooks.h>
> +
> +int
> +main()
> +{
> +  // FP from_char not available otherwise.
> +#if __cpp_lib_to_chars >= 201611L \
> +    && _GLIBCXX_USE_C99_FENV_TR1 \
> +    && defined(FE_DOWNWARD) \
> +    && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
> +  // PR libstdc++/107468
> +  float f;
> +  char buf[] = "3.355447e+07";
> +  std::fesetround(FE_DOWNWARD);
> +  auto [ptr, ec] = std::from_chars(buf, buf + sizeof(buf) - 1, f, std::chars_format::scientific);
> +  VERIFY( ec == std::errc() && ptr == buf + sizeof(buf) - 1 );
> +  VERIFY( f == 33554472.0f );
> +#endif
> +}
>
>         Jakub
>
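
The to_extended cleanup in the patch replaces two duplicated branches with per-type masks. A minimal standalone sketch of that idea follows, assuming IEEE binary32 and binary64 layouts. The names `ieee_traits`, `decomposed` and `decompose` are hypothetical stand-ins for `binary_format<T>`, `adjusted_mantissa` and `to_extended`; they are not fast_float API, and the constants are chosen to mirror the ones shown in the diff.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical per-type traits, playing the role of binary_format<T>.
template <typename T> struct ieee_traits;

template <> struct ieee_traits<float> {
  using uint_type = std::uint32_t;
  static constexpr int explicit_bits() { return 23; }
  static constexpr int min_exponent() { return -127; }
  static constexpr uint_type exponent_mask() { return 0x7F800000u; }
  static constexpr uint_type mantissa_mask() { return 0x007FFFFFu; }
  static constexpr uint_type hidden_bit_mask() { return 0x00800000u; }
};

template <> struct ieee_traits<double> {
  using uint_type = std::uint64_t;
  static constexpr int explicit_bits() { return 52; }
  static constexpr int min_exponent() { return -1023; }
  static constexpr uint_type exponent_mask() { return 0x7FF0000000000000u; }
  static constexpr uint_type mantissa_mask() { return 0x000FFFFFFFFFFFFFu; }
  static constexpr uint_type hidden_bit_mask() { return 0x0010000000000000u; }
};

// |value| == mantissa * 2^power2 for finite inputs; the sign bit is ignored.
struct decomposed {
  std::uint64_t mantissa;
  std::int32_t power2;
};

template <typename T> decomposed decompose(T value) {
  using traits = ieee_traits<T>;
  using uint_type = typename traits::uint_type;
  static_assert(sizeof(T) == sizeof(uint_type), "unexpected type size");

  const std::int32_t bias = traits::explicit_bits() - traits::min_exponent();
  uint_type bits;
  std::memcpy(&bits, &value, sizeof(T)); // well-defined way to read the bits

  decomposed d;
  if ((bits & traits::exponent_mask()) == 0) {
    // Zero or denormal: no hidden bit, fixed minimum exponent.
    d.power2 = 1 - bias;
    d.mantissa = bits & traits::mantissa_mask();
  } else {
    // Normal: restore the hidden leading bit and unbias the exponent.
    d.power2 = std::int32_t((bits & traits::exponent_mask())
                            >> traits::explicit_bits()) - bias;
    d.mantissa = (bits & traits::mantissa_mask()) | traits::hidden_bit_mask();
  }
  return d;
}
```

With this sketch, `decompose(1.0f)` yields mantissa 0x800000 with power2 -23, and `decompose(1.0)` yields mantissa 0x10000000000000 with power2 -52. Because the unsigned type is selected per floating-point type, a single code path serves both widths, which is the point of the cleanup.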
