On Aug 28, 2005, at 7:25 PM, greenrd at greenrd dot org wrote:
------- Additional Comments From greenrd at greenrd dot org  2005-08-28 23:25 -------
memcmp (which is compiled for i686 in Fedora because it is part of glibc) is
actually less efficient than the current code on my Athlon! I was so surprised
that I ran the memcmp benchmark again, and the results differed by no more
than +/-2%.

Here are the wallclock times in ms, followed by the advantage of block compare
over the current code. n is the length of the strings tested.
  n | Current | block compare | memcmp | Advantage of block compare
-------------------------------------------------------------------
 10 |   10717 |          9236 |  11957 | 16%
 30 |   16427 |         14618 |  19884 | 12%
 50 |   22181 |         17539 |  27550 | 26%
 70 |   28052 |         20978 |  35243 | 34%
 90 |   32966 |         24695 |  42815 | 33%
110 |   42975 |         28453 |  55036 | 51%
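
The last column is not defined explicitly. It appears to be the extra time the current code needs relative to block compare, i.e. Current / block compare - 1, rounded to the nearest percent. A small sketch of that arithmetic follows; the function name advantage_percent is made up for illustration, and the formula is inferred from the numbers, not stated in the comment.

```c
/* Assumed formula, inferred from the table: how much longer the current
 * code takes than block compare, in percent, rounded to nearest.
 * Integer arithmetic keeps the rounding deterministic. */
int advantage_percent(int current_ms, int block_ms)
{
    /* current/block as a rounded percentage, e.g. 116 for a 1.16 ratio */
    int ratio_percent = (current_ms * 100 + block_ms / 2) / block_ms;
    return ratio_percent - 100;
}
```

With the first row, advantage_percent(10717, 9236) gives 16, and the last row, advantage_percent(42975, 28453), gives 51, matching the table.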
All these tests were done on x86 with the same -O, -g and -f flags as make
bootstrap uses by default, using LD_PRELOAD to "hot-replace" the code, and
without the assertion enabled in the benchmark.
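
For readers without the patch at hand: the general idea behind a block compare is to test one machine word per iteration instead of one byte. The following is only a minimal sketch of that idea, not the code that was benchmarked. The name block_equal is hypothetical, and it checks equality only, since ordering by whole words would depend on byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT the code from the patch: returns 1 if the
 * first n bytes of a and b are equal, 0 otherwise. */
int block_equal(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    /* Compare one machine word at a time. memcpy sidesteps alignment
     * and aliasing problems; compilers normally turn it into a load. */
    while (n >= sizeof(uintptr_t)) {
        uintptr_t wa, wb;
        memcpy(&wa, pa, sizeof wa);
        memcpy(&wb, pb, sizeof wb);
        if (wa != wb)
            return 0;
        pa += sizeof wa;
        pb += sizeof wb;
        n -= sizeof wa;
    }

    /* Handle the remaining tail bytes one by one. */
    while (n > 0) {
        if (*pa++ != *pb++)
            return 0;
        n--;
    }
    return 1;
}
```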
This seems like something glibc's memcmp should be doing too. Could you
report a bug to glibc about this comparison? glibc's memcmp could also be
improved by comparing 128 bits (16 bytes) at a time with SSE2 or AltiVec,
so we would get a nice speedup there too.
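
A rough sketch of the vector idea, for the SSE2 case only. SSE2 registers are 128 bits wide, so each step compares 16 bytes. This is not glibc code and has not been benchmarked; memcmp_sse2_sketch is a made-up name, and the fallback to a byte loop on the first mismatching block is just one simple way to recover memcmp's return value.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical sketch, not glibc's implementation: memcmp-style result
 * (<0, 0, >0) using 16-byte SSE2 compares where available. */
int memcmp_sse2_sketch(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

#if defined(__SSE2__)
    while (n >= 16) {
        /* Unaligned loads, so no alignment requirement on a or b. */
        __m128i va = _mm_loadu_si128((const __m128i *)pa);
        __m128i vb = _mm_loadu_si128((const __m128i *)pb);
        /* Equal bytes become 0xFF; movemask collects one bit per byte. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));
        if (mask != 0xFFFF)
            break;  /* a difference is inside this 16-byte block */
        pa += 16;
        pb += 16;
        n -= 16;
    }
#endif

    /* Byte loop: finds the first difference, and handles the tail
     * (or everything, when SSE2 is not available). */
    for (; n > 0; n--, pa++, pb++) {
        if (*pa != *pb)
            return *pa < *pb ? -1 : 1;
    }
    return 0;
}
```

An AltiVec version would follow the same shape with different intrinsics; it is left out here.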
-- Pinski