Paul Eggert wrote:
> > More precisely, one of the string arguments must be an ASCII string;
> > the other one can also contain non-ASCII characters (but then the
> > comparison result will be nonzero).
>
> Why is this restriction needed?

It is needed to guarantee that the result is equivalent to the comparison
result in the C locale. On a system where the C locale has UTF-8 encoding,
  c_strcasecmp ("François", "FRANÇOIS") != 0
although
  setlocale (LC_ALL, "C"); strcasecmp ("François", "FRANÇOIS") == 0.

> Doesn't the code simply
> compare bytes after converting 'A'-'Z' to 'a'-'z'? In that case,
> it is not really required that one argument must be an ASCII string;
> both strings can be non-ASCII but the result is still well-defined.

The result is then well-defined but not related to the behaviour of the
C locale on such systems, and the name of the module would be a misnomer :-)

> > return c1 - c2;
>
> A nit: in theory this could result in integer overflow.
> The following would be portable to machines where char == int.
>
>   return UCHAR_MAX <= INT_MAX ? c1 - c2 : c1 < c2 ? -1 : c1 > c2;
>
> Such machines do exist. They are unlikely targets for big GNU
> apps but are potential targets for this module.

OK, fixed. But just for info, what are these machines? The 10-year old CRAY ?

Bruno


2005-10-11  Bruno Haible  <[EMAIL PROTECTED]>

	* strcasecmp.c: Include limits.h.
	(strcasecmp): Avoid integer overflow on exotic platforms.
	* strncasecmp.c: Include limits.h.
	(strncasecmp): Avoid integer overflow on exotic platforms.
	Reported by Paul Eggert.

diff -c -3 -r1.10 strcasecmp.c
*** strcasecmp.c	17 Aug 2005 14:01:07 -0000	1.10
--- strcasecmp.c	11 Oct 2005 12:47:19 -0000
***************
*** 25,30 ****
--- 25,31 ----
  #include "strcase.h"
  
  #include <ctype.h>
+ #include <limits.h>
  
  #if HAVE_MBRTOWC
  # include "mbuiter.h"
***************
*** 93,98 ****
        }
        while (c1 == c2);
  
!       return c1 - c2;
      }
  }
--- 94,105 ----
        }
        while (c1 == c2);
  
!       if (UCHAR_MAX <= INT_MAX)
!         return c1 - c2;
!       else
!         /* On machines where 'char' and 'int' are types of the same size, the
!            difference of two 'unsigned char' values - including the sign bit -
!            doesn't fit in an 'int'.  */
!         return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
      }
  }
diff -c -3 -r1.6 strncasecmp.c
*** strncasecmp.c	19 Sep 2005 17:28:15 -0000	1.6
--- strncasecmp.c	11 Oct 2005 12:47:19 -0000
***************
*** 1,5 ****
  /* strncasecmp.c -- case insensitive string comparator
!    Copyright (C) 1998, 1999 Free Software Foundation, Inc.
  
     This program is free software; you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
--- 1,5 ----
  /* strncasecmp.c -- case insensitive string comparator
!    Copyright (C) 1998, 1999, 2005 Free Software Foundation, Inc.
  
     This program is free software; you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
***************
*** 23,28 ****
--- 23,29 ----
  #include "strcase.h"
  
  #include <ctype.h>
+ #include <limits.h>
  
  #define TOLOWER(Ch) (isupper (Ch) ? tolower (Ch) : (Ch))
  
***************
*** 54,58 ****
      }
    while (c1 == c2);
  
!   return c1 - c2;
  }
--- 55,65 ----
      }
    while (c1 == c2);
  
!   if (UCHAR_MAX <= INT_MAX)
!     return c1 - c2;
!   else
!     /* On machines where 'char' and 'int' are types of the same size, the
!        difference of two 'unsigned char' values - including the sign bit -
!        doesn't fit in an 'int'.  */
!     return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
  }
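
For illustration only, here is a minimal standalone sketch of the two points discussed above: folding just the ASCII letters 'A'-'Z' independently of the locale, and returning the comparison result without risking overflow when UCHAR_MAX > INT_MAX. The names ascii_tolower and ascii_casecmp are hypothetical and chosen for this example; this is not the gnulib c_strcasecmp source, only a sketch that assumes an ASCII-compatible execution character set.

```c
#include <limits.h>

/* Hypothetical helper: fold only 'A'-'Z' to lower case, ignoring the
   current locale.  Assumes an ASCII-compatible character set.  */
static int
ascii_tolower (int c)
{
  return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Hypothetical name: compare two strings byte by byte after ASCII-only
   case folding.  Returns a negative, zero, or positive value.  */
int
ascii_casecmp (const char *s1, const char *s2)
{
  const unsigned char *p1 = (const unsigned char *) s1;
  const unsigned char *p2 = (const unsigned char *) s2;
  unsigned char c1, c2;

  if (p1 == p2)
    return 0;

  do
    {
      c1 = ascii_tolower (*p1);
      c2 = ascii_tolower (*p2);

      if (c1 == '\0')
        break;

      ++p1;
      ++p2;
    }
  while (c1 == c2);

  if (UCHAR_MAX <= INT_MAX)
    /* Both values fit in an 'int', so the subtraction cannot overflow.  */
    return c1 - c2;
  else
    /* 'char' and 'int' have the same size: compare instead of subtracting.  */
    return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
}
```

With this sketch, ascii_casecmp ("Hello", "hELLO") is zero, while two UTF-8 strings that differ only in the case of a non-ASCII letter compare as different, since their bytes are not folded.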