Bruno Haible <[EMAIL PROTECTED]> writes:

> Therefore most of our "c-*" modules should better be called
> "ascii-*" or "unibyte-*".

But both ASCII and other unibyte locales might say that some bytes are
encoding errors.  So none of these names are exactly right.  I guess
c-* is as good a name as any.

>> I think this claim isn't true for some weird non-ASCII encoding
>> schemes like DBCS-Host.
>
> Are these used as locale encodings? Many of these so-called DBCS encodings
> are stateful and therefore not usable as locale encodings.

Some are stateful, some not.  As I understand it, the former are more
common, but I have practical experience only with the latter.  They
are used as locale encodings in C environments.  I'd expect Cobol to
be similar but don't know about it.

> Non-nearly-ASCII-compatible encodings don't appear in the world where GNU
> programs are deployed.

This is true for GNU programs that deal with encodings.  My guess is
that most people who use GNU software use --disable-nls and the like
when they run in non-ASCII environments, and don't bother to file bug
reports because they don't expect much help from us.

That being said, GNU make and GCC are used on OS/390, as well as
Python and Perl.  People have ported other GNU tools like M4.
(Admittedly it is an uphill battle...)

> But it's important to know that c_strstr (s, "x") is not safe and
> c_strstr (s, "123") is also not safe. The programmer needs to have the
> precise criteria.

I don't quite follow this.  c_strstr (S, "x") is safe in all cases; it
never has undefined behavior.  It's true that the result might not be
the same as strstr (S, "x"), but that's the point of having c_strstr,
right?

So I would change this:

> /* The functions defined in this file assume a nearly ASCII compatible
>    character set.  */

to

  /* The functions defined in this file act on null-terminated byte
     strings, without regard to locale.  */

and this:

> This function is safe to be called, even in a multibyte locale, if NEEDLE
> ...

to this:

> This function is safe to be called, even in all known multibyte locales
> derived from ASCII, if NEEDLE ...
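
To make the distinction above concrete, here is a rough sketch of how a
purely byte-wise search always has defined behavior but can still report
a match that a character-aware search would not.  Everything in it is
illustrative: byte_search and match_offset are hypothetical names, not
the gnulib code, and the Shift_JIS byte sequence is just one example of
a multibyte character whose second byte coincides with an ASCII byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a locale-independent substring search.
   It compares raw bytes only and never consults the locale.  */
static const char *
byte_search (const char *haystack, const char *needle)
{
  size_t n = strlen (needle);
  if (n == 0)
    return haystack;
  for (; *haystack != '\0'; haystack++)
    if (strncmp (haystack, needle, n) == 0)
      return haystack;
  return NULL;
}

/* Byte offset of the first match, or -1 if there is none.  */
static long
match_offset (const char *haystack, const char *needle)
{
  const char *p = byte_search (haystack, needle);
  return p ? (long) (p - haystack) : -1;
}

static void
check_examples (void)
{
  /* Shift_JIS encodes KATAKANA LETTER SO as the bytes 0x83 0x5C, and
     0x5C is also ASCII backslash.  The byte-wise search "finds" a
     backslash at offset 1, in the middle of the two-byte character.  */
  assert (match_offset ("\x83\x5C", "\\") == 1);

  /* UTF-8 encodes U+00E9 as 0xC3 0xA9.  No byte of a UTF-8 multibyte
     sequence is below 0x80, so an ASCII needle cannot match inside it.  */
  assert (match_offset ("\xC3\xA9", "\\") == -1);
}
```

Calling check_examples () from a main function exercises both cases.
Neither call does anything undefined; the first one merely returns an
answer that a search aware of Shift_JIS character boundaries would not.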