On 07/07/10 23:49, Pádraig Brady wrote: > On 07/07/10 17:07, Simon Josefsson wrote: >> Pádraig Brady <p...@draigbrady.com> writes: >> >>> + /* The following is equivalent to: >>> + return memmem (s, strlen(s), c, csize); >>> + but faster for long S with matching UC near the start, >>> + and also memmem is sometimes buggy and inefficient. */ >>> switch (u8_uctomb_aux (c, uc, 6)) >> >> Don't we have an efficient memmem in gnulib that this code could use? > > Well the current replacement isn't great for small needles. > Also since we don't know the length of the string up front > we would also waste a bit of time on the strlen().
Well actually testing it, shows that memmem() isn't that bad at all, though it is slower than the current u8_strchr() for the interesting 2 and 3 character cases function hay_len needle_len iterations time ------------------------------------------------------------- gl_memmem long 1 1 2.17 pb_memmem¹ long 1 1 3.45 u8_strchr long 1 1 3.28 gl_memmem 60 1 10000 2.18 pb_memmem 60 1 10000 1.99 u8_strchr 60 1 10000 2.17 gl_memmem long 2 1 3.88 pb_memmem long 2 1 4.67 u8_strchr long 2 1 3.47 gl_memmem 60 2 10000 1.97 pb_memmem 60 2 10000 1.97 u8_strchr 60 2 10000 1.96 gl_memmem long 3 1 5.86 pb_memmem long 3 1 4.02 u8_strchr long 3 1 4.28 gl_memmem 60 3 10000 1.97 pb_memmem 60 3 10000 1.97 u8_strchr 60 3 10000 1.98 The above tests were compiled with: gcc -fno-builtin-strlen -march=pentium-m -O3 test-memmem.c ¹ My very simple implementation of memmem() just for comparison char* pb_memmem (const char* hay, size_t hl, const char* needle, size_t nl) { const char* nohay = hay + hl - nl; const char* noneedle = needle + nl; if (nl > hl) return NULL; for (;;) { const char* p1 = hay; const char* p2 = needle; do if (p2 == noneedle) return (char*) hay; while (*p1++ == *p2++); hay++; if (hay > nohay) return NULL; } } cheers, Pádraig.