[issue34751] Hash collisions for tuples

Raymond Hettinger Thu, 20 Sep 2018 08:07:12 -0700


Raymond Hettinger <raymond.hettin...@gmail.com> added the comment:


> It's not hard to come up with a hash collision for tuples:

Nor is it hard to come up with collisions for most simple hash functions.  For 
typical use cases of tuples, what we have works out fine (it just combined the 
effects of the underlying hashes).

> The underlying reason is that the hashing code mixes ^ and *. 

Addition also has a relationship to multiplication.   I think the XOR is fine.

> A simple Bernstein hash suffices.

That is more suited to character data and small hash ranges (as opposed to 
64-bit).  

>  On top of that, the hashing code for tuples looks 
> needlessly complicated.

The important thing is that it has passed our testing (see the comment above) 
and that it compiles nicely (see the tight disassembly below).


----------------------

L82:
    xorq    %r14, %rax
    imulq   %rbp, %rax
    leaq    82518(%rbp,%r12,2), %rbp
    movq    %r13, %r12
    movq    %rax, %r14
    leaq    -1(%r13), %rax
    cmpq    $-1, %rax
    je  L81
    movq    %rax, %r13
L73:
    addq    $8, %rbx
    movq    -8(%rbx), %rdi
    call    _PyObject_Hash
    cmpq    $-1, %rax
    jne L82

----------
components: +Interpreter Core
nosy: +rhettinger
resolution:  -> rejected
stage:  -> resolved
status: open -> closed
type:  -> performance
versions: +Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue34751] Hash collisions for tuples

Reply via email to