On 08/05/2010 01:44 AM, Simon Josefsson wrote:
Paul Eggert<egg...@cs.ucla.edu>  writes:

Come to think of it, looking at gnulib memxfrm gave me an idea
to improve the performance of GNU sort by bypassing the need for an
memxfrm-like function entirely.  I pushed a patch to do that at
<http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=2b49b140cc13cf36ec5ee5acaca5ac7bfeed6366>.

I don't know this code at all, but would your approach lead to problems
if two different strings have the same MD5 hash?  MD5 is broken, and
finding collisions takes just seconds on normal PC.  See:
http://en.wikipedia.org/wiki/MD5#Security

MD5 is used simply as a kind of "random key generator", so it doesn't matter. I wonder two things however:

1) why bother with memxfrm as a tie-breaker? isn't memcmp good enough?

2) maybe there's something cheaper than md5 that can be used? For example you could compare a^x and b^x where x is the output of a fast 32-bit random number generator? It doesn't need to be cryptographic, I'd pick http://en.wikipedia.org/wiki/Xorshift.

Paolo

Reply via email to