On 21/07/20 07:56 +0200, Florian Weimer wrote:
* Jonathan Wakely via Libstdc:
By replacing the use of strtod we could avoid allocation, avoid changing
locale, and use optimised code paths specific to each std::chars_format
case. We would also get more portable behaviour, rather than depending
on the presence of uselocale, and on any bugs or quirks of the target
libc's strtod. Replacing strtod is a project for a later date.
glibc already has strtod_l (since glibc 2.1, undocumented, but declared
in <stdlib.h>).
Yes, I noticed that in the glibc sources. I decided not to bother
using it because we still need the newlocale and freelocale calls,
which can still potentially allocate memory (although in practice
maybe they don't for the "C" locale?) and because what I committed
should work for any POSIX target.
What seems to be missing is a function that takes an explicit buffer
length. A static reference to the C locale object would be helpful as
well, I assume.
How expensive is it to do newlocale(LC_ALL_MASK, "C", (locale_t)0),
uselocale, and freelocale?
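
For reference, the sequence being asked about looks roughly like the
sketch below. It is only an illustration of the POSIX calls, not the code
that was committed to libstdc++; the name strtod_in_c_locale is made up
for this example, and the fallback on newlocale failure is a
simplification.

```cpp
#include <locale.h>   // newlocale, uselocale, freelocale (POSIX.1-2008)
#include <stdlib.h>   // strtod

// Hypothetical helper: run strtod with the calling thread temporarily
// switched to the "C" locale, then restore the previous locale.
double strtod_in_c_locale(const char* str, char** endptr)
{
    // May allocate; this is the cost the question above is about.
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return strtod(str, endptr);  // simplification: use current locale

    locale_t old = uselocale(c_loc); // affects only the calling thread
    double result = strtod(str, endptr);
    uselocale(old);                  // restore before freeing c_loc
    freelocale(c_loc);
    return result;
}
```

A real implementation would also need to preserve errno across the
cleanup calls so that an ERANGE set by strtod is not lost; the sketch
ignores that.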
Maybe this is sufficiently clean that we can export this for libstdc++'s
use? Without repeating the libio mess?
I think we could beat strtod's performance with a handwritten
implementation, so I don't know if it's worth adding glibc extensions
if we would stop using them eventually anyway.
std::from_chars takes an enum that says whether the input is in
hex, scientific or fixed format (or 'general' which is
fixed|scientific). Because strtod determines the format itself, we
need to do some preprocessing before calling strtod, to stop it being
too general.
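
As a rough illustration of what the format flags mean for the
preprocessing step, the sketch below classifies a std::chars_format value
by whether a decimal exponent must or must not appear. The helper names
are invented for this example and are not part of libstdc++.

```cpp
#include <charconv>   // std::chars_format (C++17)

// Hypothetical helper: is the given bit set in fmt?
constexpr bool has(std::chars_format fmt, std::chars_format bit)
{
    return (fmt & bit) != std::chars_format{};
}

// scientific without fixed: an exponent part is required.
constexpr bool exponent_required(std::chars_format fmt)
{
    return has(fmt, std::chars_format::scientific)
        && !has(fmt, std::chars_format::fixed);
}

// fixed without scientific: an exponent part must not be consumed.
constexpr bool exponent_forbidden(std::chars_format fmt)
{
    return has(fmt, std::chars_format::fixed)
        && !has(fmt, std::chars_format::scientific);
}

// 'general' is defined as fixed | scientific.
static_assert((std::chars_format::fixed | std::chars_format::scientific)
              == std::chars_format::general);
```

With general, both helpers return false, which matches strtod's default
behaviour for decimal input; the other formats are the ones that need
the input adjusted before strtod sees it.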
Some examples where strtod does the wrong thing unless we do extra
work before calling it:
"0x1p01" should always produce the result 0, for any format (because
when you pass the hex flag to std::from_chars the input doesn't need a
"0x" prefix, and if one is present it's interpreted as simply "0"). If
we don't truncate the string, strtod produces 2.
"0.8p1" should produce 0.8 for fixed and general formats, produce an
error for scientific format, and produce 1 for hex format (which means
we need to create the string "0x0.8p1" to pass to strtod). strtod
always produces 0.8 for this input.
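
The two examples above can be reproduced with plain strtod. The sketch
below assumes the "C" locale and a conforming strtod; hex_via_strtod is a
hypothetical helper written for this mail, not the libstdc++ code. It
uses a std::string for brevity, so it allocates, and it does not reject
things from_chars must reject (leading whitespace, '+', empty input).

```cpp
#include <cstdlib>   // std::strtod
#include <string>

// Hypothetical helper: emulate the hex format by inserting the "0x"
// prefix that strtod needs but std::from_chars does not accept.
double hex_via_strtod(const std::string& input)
{
    std::string buf;
    std::size_t pos = 0;
    if (!input.empty() && input[0] == '-') {
        buf += '-';          // the sign has to stay in front of "0x"
        pos = 1;
    }
    buf += "0x";
    buf.append(input, pos, std::string::npos);
    return std::strtod(buf.c_str(), nullptr);
}

// Unmodified strtod picks the format itself:
//   std::strtod("0x1p01", nullptr) parses a hex float, 1 * 2^1
//   std::strtod("0.8p1", nullptr)  parses decimal "0.8" and stops at 'p'
// With the prefix inserted:
//   hex_via_strtod("0.8p1")  parses "0x0.8p1", i.e. 0.5 * 2^1
//   hex_via_strtod("0x1p01") parses "0x0" and stops at the second 'x'
```

For the fixed and scientific formats the adjustment goes the other way:
the string handed to strtod has to be cut short, or rejected up front,
so that strtod cannot consume more than the requested format allows.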
I also noticed some strings give an underflow error with glibc's
strtod, but are valid for the Microsoft implementation. For example,
this one:
https://github.com/microsoft/STL/blob/master/tests/std/tests/P0067R5_charconv/double_from_chars_test_cases.hpp#L265
Without the final '1' digit glibc returns DBL_MIN, but with the final
'1' digit (so a number larger than DBL_MIN) it underflows. Is that
expected?