On 03/08/20 13:47 +0200, Florian Weimer wrote:
* Jonathan Wakely:
What seems to be missing is a function that takes an explicit buffer
length. A static reference to the C locale object would be helpful as
well, I assume.
How expensive is it to do newlocale("C", nullptr), uselocale, and
freelocale?
freelocale does nothing in this situation. newlocale has a bunch of
conditional branches to detect this situation. uselocale has fewer
branches, but has more memory accesses.
At least there's no locking involved. (But I believe uselocale is
thread-unsafe regarding setlocale in the glibc implementation or
something like that.)
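
For reference, the newlocale/uselocale/freelocale sequence being costed above might look roughly like this. It is only a sketch assuming a POSIX.1-2008 system such as glibc (some platforms declare these functions in <xlocale.h> instead); strtod_c_locale is a hypothetical name, not an existing glibc or libstdc++ function.

```cpp
#include <locale.h>   // newlocale, uselocale, freelocale (POSIX.1-2008)
#include <stdlib.h>   // strtod

// Hypothetical helper: run strtod while the calling thread's locale is
// temporarily switched to "C", then restore the previous thread locale.
double strtod_c_locale(const char* s, char** end)
{
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return strtod(s, end);          // could not create it: use current locale

    locale_t old = uselocale(c_loc);    // changes the locale of this thread only
    double value = strtod(s, end);
    uselocale(old);                     // put the previous thread locale back
    freelocale(c_loc);
    return value;
}
```

The fallback when newlocale fails is a choice made for this sketch; a real implementation would more likely report the failure.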
Maybe this is sufficiently clean that we can export this for libstdc++'s
use? Without repeating the libio mess?
I think we could beat strtod's performance with a handwritten
implementation, so I don't know if it's worth adding glibc extensions
if we would stop using them eventually anyway.
I was reminded of this query:
Robustly parse string to unsigned integer (strtoul?)/docs
<https://sourceware.org/pipermail/libc-help/2020-June/005337.html>
It suggests to me that we need better interfaces.
Yes, the ISO C interfaces for this are poor.
Their error reporting is tricky. You have to check a combination of
the return value, the endptr argument, and errno.
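
To make that combination concrete, here is a minimal sketch of a checked wrapper around strtod. ParseStatus and parse_double_strict are names invented for this illustration, and treating trailing characters as an error is an assumption about what the caller wants.

```cpp
#include <cerrno>
#include <cstdlib>

enum class ParseStatus { ok, no_digits, trailing_garbage, out_of_range };

// Hypothetical wrapper showing the three things a caller has to inspect.
ParseStatus parse_double_strict(const char* s, double& out)
{
    char* end = nullptr;
    errno = 0;                          // strtod never resets errno to zero itself
    double v = std::strtod(s, &end);

    if (end == s)                       // endptr: nothing was converted
        return ParseStatus::no_digits;
    if (errno == ERANGE)                // errno: result out of range (v is +-HUGE_VAL or tiny)
        return ParseStatus::out_of_range;
    if (*end != '\0')                   // endptr again: unparsed characters remain
        return ParseStatus::trailing_garbage;

    out = v;
    return ParseStatus::ok;
}
```

Note that strtod still skips leading whitespace here, so " 1.5" is accepted by this sketch.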
https://movementarian.org/blog/posts/2009-03-14-its-not-just-atol-nicholas/
They are too general. strtod is a single function that parses three
different number formats, with no way to restrict it to only one of those
formats.
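
A small sketch of the problem, assuming the three formats meant are fixed, scientific and hexadecimal (the same split std::chars_format uses). The helper name is made up for the example; it only reports whether strtod swallowed the whole string.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true if strtod accepts all of `s` as one number.
bool strtod_consumes_all(const char* s)
{
    char* end = nullptr;
    std::strtod(s, &end);
    return end != s && end == s + std::strlen(s);
}
```

With a C99-conforming strtod, all of "1000.0", "1e3" and "0x1p4" are consumed in full, so a caller that only wants fixed notation has to pre-validate the input by hand.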
They require null-terminated strings, making them unsuitable for
certain uses e.g. a byte sequence containing several Pascal-style
length-prefixed strings like "\x05123.4\x351".
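
For comparison, std::from_chars takes an explicit [first, last) range, so no terminator is needed. The sketch below parses a run of length-prefixed records; it uses the integer overloads (C++17) because those are the most widely available, while the floating-point overloads take the same pair of pointers. parse_length_prefixed and the record layout (one length byte, then that many ASCII digits) are assumptions made for the example.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: each record is one length byte followed by that
// many ASCII digits. Parsing stops at the first malformed record.
std::vector<int> parse_length_prefixed(std::string_view bytes)
{
    std::vector<int> values;
    std::size_t pos = 0;
    while (pos < bytes.size()) {
        std::size_t len = static_cast<unsigned char>(bytes[pos++]);
        if (len > bytes.size() - pos)
            break;                              // record runs past the buffer

        const char* first = bytes.data() + pos;
        const char* last = first + len;         // explicit end, no '\0' required
        int v = 0;
        auto [ptr, ec] = std::from_chars(first, last, v);
        if (ec != std::errc() || ptr != last)
            break;                              // not a number, or leftover bytes

        values.push_back(v);
        pos += len;
    }
    return values;
}
```

When writing such data as a C or C++ literal, the length byte is best kept in its own literal ("\x03" "123"), since a hex escape otherwise absorbs any hex digits that follow it.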
They're locale-dependent, which is not useful for machine-readable I/O
where the format of numbers must not be localized e.g. JSON.
And strtoul accepts negative numbers (srsly?!)
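
A quick illustration of that last point, next to what std::from_chars does with the same input. The two function names are invented for the example.

```cpp
#include <cerrno>
#include <charconv>
#include <climits>
#include <cstdlib>
#include <system_error>

// strtoul accepts a leading '-' and negates in unsigned arithmetic,
// so "-1" comes back as ULONG_MAX with no error reported.
bool strtoul_wraps_minus_one()
{
    errno = 0;
    unsigned long v = std::strtoul("-1", nullptr, 10);
    return v == ULONG_MAX && errno == 0;
}

// std::from_chars only recognizes a minus sign for signed types, so the
// same input fails to match when the target is unsigned.
bool from_chars_rejects_minus_one()
{
    const char text[] = "-1";
    unsigned long v = 0;
    auto res = std::from_chars(text, text + 2, v);
    return res.ec == std::errc::invalid_argument;
}
```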
C++ has decided to ignore these C functions and has defined better
ones. The fact that libstdc++ currently uses strtod internally is a
temporary kluge.
Replicating the C++ API in glibc might be useful for C devs, but to be
useful for libstdc++ we'd need to copy the code into libstdc++ itself
so we could use it for targets using musl, newlib, BSD, mingw etc.
I also noticed some strings give an underflow error with glibc's
strtod, but are valid for the Microsoft implementation. For example,
this one:
https://github.com/microsoft/STL/blob/master/tests/std/tests/P0067R5_charconv/double_from_chars_test_cases.hpp#L265
Without the final '1' digit glibc returns DBL_MIN, but with the final
'1' digit (so a number larger than DBL_MIN) it underflows. Is that
expected?
I don't know. I think you should bring this up on libc-alpha.
Joseph answered this part, clarifying that glibc is correct to report
underflow.