On Aug 28, 2005, at 7:25 PM, greenrd at greenrd dot org wrote:


------- Additional Comments From greenrd at greenrd dot org 2005-08-28 23:25 -------
memcmp (which is compiled for i686 in Fedora because it is part of glibc) is actually less efficient than the current code on my Athlon! I was so surprised that I ran the memcmp benchmark again, and the results differed by no more than +/-2%.

Here are the wallclock times in ms, followed by the advantage of block compare
over the current code. n is the length of the strings tested.

n   | Current | block compare | memcmp | Advantage of block compare
-------------------------------------------------------------------
10  | 10717   | 9236          | 11957  | 16%
30  | 16427   | 14618         | 19884  | 12%
50  | 22181   | 17539         | 27550  | 26%
70  | 28052   | 20978         | 35243  | 34%
90  | 32966   | 24695         | 42815  | 33%
110 | 42975   | 28453         | 55036  | 51%
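
For reference, the last column appears to be the ratio of the current code's time to the block-compare time, minus one, rounded to the nearest percent. That formula is inferred from the figures rather than stated in the comment, and the function name below is made up for illustration. A minimal sketch in C:

```c
/* Hypothetical helper: percentage by which the current code's time
 * exceeds the block-compare time, rounded to the nearest integer.
 * The formula is an inference from the table, not from the benchmark
 * source. */
static int advantage_pct(long current_ms, long block_ms)
{
    double ratio = (double)current_ms / (double)block_ms;
    return (int)((ratio - 1.0) * 100.0 + 0.5);
}
```

Called as advantage_pct(10717, 9236) for the n = 10 row, this reproduces the 16% shown; the other rows can be checked the same way.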

All these tests were done on x86 with the same -O, -g and -f flags as make bootstrap uses by default, using LD_PRELOAD to "hot-replace" the code, and
without the assertion enabled in the benchmark.

This seems like something glibc's memcmp should be doing as well. Could
you report a bug to glibc about this comparison?  glibc's memcmp could
also be improved by comparing 128 bits (SSE2 and AltiVec) at a time, so
we would get a nice speedup there too.
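
To make the idea concrete, here is a rough sketch of a comparison that works a block at a time. It is not the patch under discussion and not glibc's code. It uses a portable 64-bit word as a stand-in for a 128-bit SSE2 or AltiVec register, and the name block_compare is made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns <0, 0 or >0, like memcmp. */
static int block_compare(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    /* Compare 8 bytes at a time.  memcpy into a local word avoids
     * unaligned-access and aliasing problems; compilers typically
     * turn it into a single load. */
    while (n >= sizeof(uint64_t)) {
        uint64_t wa, wb;
        memcpy(&wa, pa, sizeof wa);
        memcpy(&wb, pb, sizeof wb);
        if (wa != wb)
            break;              /* first difference is inside this block */
        pa += sizeof wa;
        pb += sizeof wb;
        n -= sizeof wa;
    }

    /* Finish byte by byte to locate the first difference and its sign. */
    for (; n > 0; pa++, pb++, n--) {
        if (*pa != *pb)
            return (int)*pa - (int)*pb;
    }
    return 0;
}
```

A vector version would follow the same shape with a 16-byte block, leaving the byte loop to handle the tail and the block that differs.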

-- Pinski
