On Sun, Jan 18, 2009 at 03:42:29 +0100, Bruno Haible wrote:
> The module 'c-strtod' is under GPL, not LGPL. This alone is enough of a
> hint that the function is not really usable in libraries.

I don't agree that this is a good enough hint, since some libraries are
licensed under the GPL (including GNU PDF, which I was referring to).

> You can submit a patch for the function description.

Attached.

> In the long run,
> however, if this function should be usable in libraries, I see two
> possible approaches:
>   - Write a correct strtod() implementation - I mean, without rounding
>     errors -, and add a 'char decimal_point' parameter.

I'm not sure this is possible. According to the paper "How to Read
Floating Point Numbers Accurately" by William D. Clinger, it can't be
done using fixed-precision arithmetic; this implies that memory would
need to be allocated dynamically, but I don't think strtod is allowed to
fail with ENOMEM.

What would be the benefit of allowing decimal_point to be specified,
without supporting multi-byte decimal points or any other locale-
specific behaviour?

>   - Copy the argument string, preprocessing it by replacing the '.'
>     with the locale dependent decimal-point character, and pass that
>     to strtod().

Again, this wouldn't be an exact replacement for strtod since it could
fail with ENOMEM (or by aborting), but that's probably fine if it's
documented. Even the code path that calls strtod_l might fail with
ENOMEM if newlocale calls malloc (although glibc doesn't allocate memory
when accessing the "C" locale).

Apart from replacing '.', the code would have to filter characters that
wouldn't be accepted in the C locale, since other locales may interpret
them in an unexpected way. Replacing the first such character with '\0'
would probably work, since strtod would then stop parsing there.
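
A rough sketch of that preprocessing approach, to make the idea concrete.
This is not the gnulib code: the name sketch_c_strtod and the exact set of
"accepted" characters are assumptions made for illustration. It takes the
decimal point from localeconv(), so it does nothing for thread safety, and
it reports allocation failure through errno instead of aborting.

```c
#include <errno.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not gnulib's c_strtod): parse S as if in the "C"
   locale by handing a preprocessed copy to the locale-dependent strtod.  */
double
sketch_c_strtod (const char *s, char **endp)
{
  /* Assumed set of characters strtod may consume in the "C" locale:
     whitespace, sign, digits, '.', and the ASCII letters, parentheses and
     underscore used by exponents, hex floats, "inf" and "nan(...)".  */
  static const char accepted[] =
    " \t\n\v\f\r+-.()_0123456789"
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
  const char *dp = localeconv ()->decimal_point;
  size_t dplen = strlen (dp);
  /* Length of the leading run of accepted characters; cutting the copy
     here has the same effect as writing '\0' over the first other one.  */
  size_t n = strspn (s, accepted);
  const char *dot = memchr (s, '.', n);
  size_t dotpos = dot ? (size_t) (dot - s) : n;
  char *buf = malloc (n + dplen + 1);
  char *end;
  double result;
  size_t used;
  int saved_errno;

  if (buf == NULL)
    {
      /* Unlike strtod, this sketch can fail for lack of memory.  */
      errno = ENOMEM;
      if (endp)
        *endp = (char *) s;
      return 0.0;
    }

  /* Copy the accepted prefix, substituting only the first '.'.  */
  memcpy (buf, s, dotpos);
  if (dot)
    {
      memcpy (buf + dotpos, dp, dplen);
      memcpy (buf + dotpos + dplen, dot + 1, n - dotpos - 1);
      buf[n - 1 + dplen] = '\0';
    }
  else
    buf[n] = '\0';

  result = strtod (buf, &end);
  saved_errno = errno;          /* keep ERANGE across free() */
  used = (size_t) (end - buf);
  /* Map the end offset in the copy back to the original string; the
     decimal point may be longer than one byte.  */
  if (dot && used > dotpos)
    used = used - dplen + 1;
  if (endp)
    *endp = (char *) s + used;
  free (buf);
  errno = saved_errno;
  return result;
}
```

Only the first '.' is substituted because a valid number contains at most
one; a second '.' stays as-is and ends the parse, as it would in the C
locale.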

> > Finally, the documentation states c_strtod "operates as if the locale
> > encoding was ASCII", but the code doesn't actually convert the argument
> > from ASCII to the execution character set.
> 
> We assume that the execution character set is a superset of the part of
> ASCII that matters for numbers (digits, basic Latin letters, dot, comma,
> plus, minus signs).

I guess the intended meaning is that it won't accept non-ASCII
characters like locale-specific digits, but I think it's misleading to
refer to "ASCII encoding" in this case. I assumed it meant '.'==46,
'0'==48, etc.

I've changed the text to 'as if the locale was "C"' in the patched
version.

-- Michael
diff --git a/doc/c-strtod.texi b/doc/c-strtod.texi
index a2082e7..1ed8877 100644
--- a/doc/c-strtod.texi
+++ b/doc/c-strtod.texi
@@ -11,7 +11,7 @@
 
 The @code{c-strtod} module contains a string to number (@samp{double})
 conversion function operating on single-byte character strings, that operates
-as if the locale encoding was ASCII.
+as if the locale was "C".
 (The "C" locale on many systems has the locale encoding "ASCII".)
 
 The function is:
@@ -19,5 +19,10 @@ The function is:
 extern double c_strtod (const char *string, char **endp);
 @end smallexample
 
-In particular, only a period @samp{.} is accepted as decimal point, even
-when the current locale's notion of decimal point is a comma @samp{,}.
+In particular, only a period @samp{.} is accepted as decimal point (even
+when the current locale's notion of decimal point is a comma @samp{,}),
+and no characters outside the basic character set are accepted.
+
+This function aborts via xalloc_die if it can't allocate memory.
+On platforms without @code{strtod_l}, it isn't safe for use in
+multi-threaded applications since it calls @code{setlocale}.
