On 03/08/20 13:47 +0200, Florian Weimer wrote:
* Jonathan Wakely:

What seems to be missing is a function that takes an explicit buffer
length.  A static reference to the C locale object would be helpful as
well, I assume.

How expensive is it to do newlocale("C", nullptr), uselocale, and
freelocale?

freelocale does nothing in this situation.  newlocale has a bunch of
conditional branches to detect this situation.  uselocale has fewer
branches, but has more memory accesses.

At least there's no locking involved.  (But I believe uselocale is
thread-unsafe regarding setlocale in the glibc implementation or
something like that.)
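
For reference, the newlocale/uselocale/freelocale sequence being asked about could look roughly like the sketch below. It assumes a POSIX.1-2008 system where these functions are declared in <locale.h> (as on glibc); the helper name strtod_in_c_locale is made up for illustration and is not what libstdc++ actually does. Note that the real newlocale takes a category mask as its first argument.

```cpp
#include <errno.h>    // errno
#include <locale.h>   // newlocale, uselocale, freelocale (POSIX.1-2008)
#include <stdlib.h>   // strtod

// Hypothetical helper: call strtod while the calling thread's locale is
// temporarily switched to "C", so the decimal point is always '.'.
double strtod_in_c_locale(const char* s, char** end)
{
    // newlocale takes a category mask, a locale name and a base locale.
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return strtod(s, end);        // fall back to the current locale

    locale_t old = uselocale(c_loc);  // changes only this thread's locale
    double result = strtod(s, end);
    int saved_errno = errno;          // keep strtod's ERANGE, if any

    uselocale(old);                   // restore the previous thread locale
    freelocale(c_loc);
    errno = saved_errno;
    return result;
}
```

A caller would still set errno to 0 beforehand and inspect errno and *end afterwards, exactly as with plain strtod.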

Maybe this is sufficiently clean that we can export this for libstdc++'s
use?  Without repeating the libio mess?

I think we could beat strtod's performance with a handwritten
implementation, so I don't know if it's worth adding glibc extensions
if we would stop using them eventually anyway.

I was reminded of this query:

 Robustly parse string to unsigned integer (strtoul?)/docs
 <https://sourceware.org/pipermail/libc-help/2020-June/005337.html>

It suggests to me that we need better interfaces.

Yes, the ISO C interfaces for this are poor.

Their error reporting is tricky. You have to check a combination of
the return value, the endptr argument, and errno.
https://movementarian.org/blog/posts/2009-03-14-its-not-just-atol-nicholas/
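
To make the three-way check concrete, here is a minimal sketch of what a careful strtoul caller has to write. The helper name parse_ulong and the choice to reject trailing characters are assumptions made for the example.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: returns true only if strtoul consumed all of `s`
// and the value was in range for unsigned long.
bool parse_ulong(const char* s, unsigned long& out)
{
    char* end = nullptr;
    errno = 0;                           // must be cleared by the caller
    unsigned long v = std::strtoul(s, &end, 10);

    if (end == s)        return false;   // endptr: nothing was parsed
    if (*end != '\0')    return false;   // endptr: trailing characters
    if (errno == ERANGE) return false;   // errno: overflow (v is ULONG_MAX)

    out = v;
    return true;
}
```

Even with all three checks this still skips leading whitespace and accepts a sign, so parse_ulong("-1", v) succeeds and stores ULONG_MAX.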

They are too general. strtod is a single function that parses three
different number formats, with no way to restrict it to only one of those
formats.

They require null-terminated strings, making them unsuitable for
certain uses e.g. a byte sequence containing several Pascal-style
length-prefixed strings like "\x05123.4\x351".

They're locale-dependent, which is not useful for machine-readable I/O
where the format of numbers must not be localized e.g. JSON.

And strtoul accepts negative numbers (srsly?!)

C++ has decided to ignore these C functions and has defined better
ones. The fact that libstdc++ currently uses strtod internally is a
temporary kluge.
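
As an illustration of the C++17 interface, the sketch below uses the integer overload of std::from_chars, which takes an explicit [first, last) range instead of a null-terminated string. The helper name parse_unsigned and the requirement that the whole range be consumed are assumptions made for the example.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper: true only if every character of `text` belongs to
// an in-range decimal value. No locale, no errno, no null terminator.
bool parse_unsigned(std::string_view text, unsigned long& out)
{
    const char* first = text.data();
    const char* last  = first + text.size();

    // For unsigned types from_chars does not accept a leading '-',
    // and it never skips leading whitespace.
    std::from_chars_result r = std::from_chars(first, last, out, 10);

    return r.ec == std::errc{} && r.ptr == last;
}
```

For example, parse_unsigned(std::string_view("123456", 3), v) reads only the first three characters and yields 123, while "-1" and " 1" are both rejected.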

Replicating the C++ API in glibc might be useful for C devs, but to be
useful for libstdc++ we'd need to copy the code into libstdc++ itself
so we could use it for targets using musl, newlib, BSD, mingw etc.

I also noticed some strings give an underflow error with glibc's
strtod, but are valid for the Microsoft implementation. For example,
this one:
https://github.com/microsoft/STL/blob/master/tests/std/tests/P0067R5_charconv/double_from_chars_test_cases.hpp#L265

Without the final '1' digit glibc returns DBL_MIN, but with the final
'1' digit (so a number larger than DBL_MIN) it underflows. Is that
expected?

I don't know.  I think you should bring this up on libc-alpha.

Joseph answered this part, clarifying that glibc is correct to report
underflow.

