Hi - My recent foray into gsync left me thinking a bit about its hash function:
#define MIX2_LL(x, y) ((((x) << 5) | ((x) >> 27)) ^ (y)) which seems a bit weak. The result of gsync_key_hash is then taken modulo GSYNC_NBUCKETS, which is currently 512. libihash has a similar issue. It seems to use no hash function by default, but provides Murmur3 if you want it, which looks better than just shifting bits, but also slower. I've been looking at the CRC32 instruction, which was introduced ten years ago in SSE 4.2. According to the Intel Software Developer's Manual, CRC32 works like this: Starting with an initial value in the first operand (destination operand), accumulates a CRC32 (polynomial 0x11EDC6F41) value for the second operand (source operand) and stores the result in the destination operand. The source operand can be a register or a memory location. The destination operand must be an r32 or r64 register. If the destination is an r64 register, then the 32-bit result is stored in the least significant double word and 00000000H is stored in the most significant double word of the r64 register. Agner Fog (www.agner.org) reports that this instruction executes in a single micro-op. It seems a bit silly to agonize over performance when we've got so many other issues, but I just wanted to shoot a message to the list to document my research. I'm thinking that CRC32 is our best bet to compute hash functions on the Intel architecture. Of course, it comes with issues like detecting whether the processor supports it. agape brent