> > +TESTS += test-memmem
> > +TESTS_ENVIRONMENT += EXEEXT='@EXEEXT@'
> > +check_PROGRAMS += test-memmem
>
> You can drop the TESTS_ENVIRONMENT line.
I've dropped this line now. TESTS_ENVIRONMENT setting EXEEXT is only needed
when one of the tests is a shell script that calls an executable.
2008
Eric Blake wrote:
> According to Bruno Haible on 12/21/2007 6:42 AM:
> | does not define NULL. Better write (void*)0 or "" instead of
> NULL.
>
> Which platforms have a broken that fails to define NULL?
Actually all platforms I have access to today define NULL in .
Probably the latest platform
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
According to Bruno Haible on 12/21/2007 6:42 AM:
|
|> + [AC_RUN_IFELSE([AC_LANG_PROGRAM([#include ],
|> + [return !memmem ("a", 1, NULL, 0);])],
|
| does not define NULL. Better write (void*)0 or "" instead of
NULL.
I've since fixed thi
Paul Eggert wrote:
> Yes, thanks, that looks good.
Committed.
Bruno
"Bruno Haible" <[EMAIL PROTECTED]> writes:
> * lib/memmem.c (knuth_morris_pratt, memmem): Change all 'char *'
> variables to 'unsigned char *' type.
> Reported by Paul Eggert.
Yes, thanks, that looks good.
> Thanks, but the email that I got lacked a patch.
Sorry about it. Here it is (please forgive the line breaking caused by the
webmailer):
2008-01-02 Bruno Haible <[EMAIL PROTECTED]>
* lib/memmem.c (knuth_morris_pratt, memmem): Change all 'char *'
variables to 'unsigned char *
"Bruno Haible" <[EMAIL PROTECTED]> wrote:
> Paul Eggert wrote:
>> section 6.2.6.1:
>> ...
>> Here again, the intent is that memcmp(&x, &y, sizeof (T)) compares all
>> the bits in the underlying representation. The only way to do that
>> portably is via "unsigned char *".
>
> OK. Here's a proposed
Paul Eggert wrote:
> section 6.2.6.1:
> ...
> Here again, the intent is that memcmp(&x, &y, sizeof (T)) compares all
> the bits in the underlying representation. The only way to do that
> portably is via "unsigned char *".
OK. Here's a proposed patch for the memmem function.
Bruno
Bruno Haible <[EMAIL PROTECTED]> writes:
> It says that the _values_ of the characters are interpreted as
> unsigned char. I.e. like this:
Sure, but a character, as used in the sense here, is a bit
representation that fits in a byte (see section 3.7.1); it is not a
'char' value. The intent is th
Paul Eggert wrote:
> Bruno Haible <[EMAIL PROTECTED]> writes:
>
> > Most of your comments apply to all copies of the KMP code in gnulib.
>
> Ouch! Should these be coalesced?
Out of the 8 copies of the code currently in gnulib, I can easily coalesce
5 into 1; see attached patch. The remaining 4
Paul Eggert wrote:
> OK, I installed this patch.
>
> 2007-12-29 Paul Eggert <[EMAIL PROTECTED]>
>
> * lib/memmem.c (knuth_morris_pratt): Check for size_t overflow
> when multiplying M by sizeof (size_t).
Thanks. Let me generalize it, as follows.
2007-12-30 Bruno Haible <[EMAIL
Paul Eggert wrote:
> The 'char' type can have padding bits, which means that code like this:
>
> char const *needle = (char const *) needle_start;
> char const *haystack = (char const *) haystack_start;
> if ((unsigned char) *needle == (unsigned char) *haystack) ...
>
> might ignor
Ben Pfaff wrote:
> >> > +unsigned char b = (unsigned char) needle[i - 1];
> >> > ...
> >> > +if (b == (unsigned char) needle[j])
> >>
> >> Would it be cleaner to declare 'b' to be of type 'char' and avoid the
> >> casts?
> >
> > No; ISO C 99 section 7.21.4 says that when byte st
Bruno Haible <[EMAIL PROTECTED]> writes:
> No; ISO C 99 section 7.21.4 says that when byte strings are compared the
> elements are considered as 'unsigned char' values. Why risk bugs when the
> approach is very simple: after fetching any 'char' from any of the strings,
> cast it to 'unsigned char'
Bruno Haible <[EMAIL PROTECTED]> writes:
> Most of your comments apply to all copies of the KMP code in gnulib.
Ouch! Should these be coalesced?
>> Shouldn't this check for overflow in the multiplication?
>
> Yes, it should. I'm always lazy about this.
OK, I installed this patch.
2007-12-29
>> > + unsigned char b = (unsigned char) needle[i - 1];
>> > ...
>> > + if (b == (unsigned char) needle[j])
>>
>> Would it be cleaner to declare 'b' to be of type 'char' and avoid the
>> casts?
>
> No; ISO C 99 section 7.21.4 says that when byte strings are compared the
> elements are conside
Hi Paul,
Most of your comments apply to all copies of the KMP code in gnulib.
> Eric Blake <[EMAIL PROTECTED]> writes:
> > + size_t *table = (size_t *) malloca (m * sizeof (size_t));
> > + if (table == NULL)
> > +return false;
>
> Shouldn't this check for overflow in the multiplication?
Ye
Eric Blake wrote:
> + Fix memmem to avoid O(n^2) worst-case complexity.
> + * lib/memmem.c (knuth_morris_pratt): New function.
> + (memmem): Use it if first few naive iterations fail.
While considering to submit a modified version of this code to glibc
for inclusion, I found it good to
Eric Blake wrote:
> + * tests/test-memmem.c: Rewrite, borrowing ideas from
> + test-mbsstr1.c; the old version wouldn't even compile!
After the rewrite, there's no code left from the original test. I committed
this, to correct the attribution:
2007-12-23 Bruno Haible <[EMAIL PROTECTED]
Eric Blake <[EMAIL PROTECTED]> writes:
> + size_t *table = (size_t *) malloca (m * sizeof (size_t));
> + if (table == NULL)
> +return false;
Shouldn't this check for overflow in the multiplication?
Something like this, perhaps?
size_t *table;
if (xalloc_oversized (m, sizeof *table))
I wrote:
> > + precisely, when
> > + - the outer loop count is >= 10,
> > + - the average number of comparisons per outer loop is >= 5,
> > + - the total number of comparisons is >= m.
>
> These comments are out of sync with the code.
Oops, that was nonsense. Please ignore t
Eric Blake wrote:
> - It uses memcmp without depending on the memcmp module, making it needlessly
> fail on some older platforms.
If memory serves me well, the platforms where memcmp was incorrect (comparing
signed instead of unsigned byte values) were SunOS 4 and possibly IRIX 4.
*Very* ancient.
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
According to Jim Meyering on 12/20/2007 12:33 AM:
> Eric Blake <[EMAIL PROTECTED]> wrote:
> ...
>> - The implementation was naively quadratic in the worst-case complexity. I
>> lifted Bruno's KMP ideas in mbsstr to make it linear+delayed allocation.
On Dec 20, 2007 1:27 PM, Eric Blake <[EMAIL PROTECTED]> wrote:
> Also, I wonder if Boyer-Moore would be any more effective than KMP in the
> average case (both have worst-case linear complexity, but we know that in
> general, we aren't always dealing with worst-case).
>
> http://en.wikipedia.org/wi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
According to James Youngman on 12/20/2007 3:59 AM:
> I've been fiddling around with Bruno's KMP code recently. I like it a
> lot. I wish though there was some way for the caller to indicate
> that they are willing to amortise the cost of the KMP ini
On Dec 19, 2007 11:15 PM, Eric Blake <[EMAIL PROTECTED]> wrote:
> - The implementation was naively quadratic in the worst-case complexity. I
> lifted Bruno's KMP ideas in mbsstr to make it linear+delayed allocation.
> glibc
> still uses the naive implementation - should we file a bug with them
Eric Blake <[EMAIL PROTECTED]> writes:
> Several issues with memmem:
>
> - It uses memcmp without depending on the memcmp module, making it needlessly
> fail on some older platforms. But memcmp is currently licensed LGPL; is it
> okay with everyone that this patch relicenses it as LGPLv2+? (Or
Eric Blake <[EMAIL PROTECTED]> wrote:
...
> - The implementation was naively quadratic in the worst-case complexity. I
> lifted Bruno's KMP ideas in mbsstr to make it linear+delayed allocation.
> glibc
> still uses the naive implementation - should we file a bug with them to fix
> it?
This is t
Eric Blake <[EMAIL PROTECTED]> wrote:
> I'm trying to convert m4 to transparently handle embedded NUL. In the
> process,
> I need to move from strstr to memmem (if m4 were fully i18n already, I would
> instead be moving from mbsstr to the length-based analog of mbsstr that
> tolerates embedded NU
I'm trying to convert m4 to transparently handle embedded NUL. In the process,
I need to move from strstr to memmem (if m4 were fully i18n already, I would
instead be moving from mbsstr to the length-based analog of mbsstr that
tolerates embedded NUL - is there such a thing?)
Several issues wi
30 matches
Mail list logo