Re: gnulib's Knuth-Morris-Pratt implementation (was: memmem issues)

James Youngman Thu, 20 Dec 2007 05:30:43 -0800

On Dec 20, 2007 1:27 PM, Eric Blake <[EMAIL PROTECTED]> wrote:
> Also, I wonder if Boyer-Moore would be any more effective than KMP in the
> average case (both have worst-case linear complexity, but we know that in
> general, we aren't always dealing with worst-case).
>
> http://en.wikipedia.org/wiki/Knuth-morris-pratt
> http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm


Quite possibly.   If we extracted the KMP implementation as a
prepare-then-search interface, and as I mentioned, I've done quite a
bit of work in that area already, we could do this dynamically (or at
least, without necessarily impacting the caller).

I'm sure Bruno thought about this already, though.   The main problem
though is that the B-M algorithm (I guess we'd likely select
Turbo-Boyer-Moore to reduce the comparisons from 3N to 2N) needs to
move backward through the haystack.  That doesn't work well on
multibyte strings.   We could, as I mentioned, build an ancilliary
data structure, but that would mean adding an extra N operations (thus
making the cost ~3N once again, I suppose).

James.

Re: gnulib's Knuth-Morris-Pratt implementation (was: memmem issues)

Reply via email to