Hi Joe and Gaby and thanks for your messages, >Close, but not quite. > >Hash functions are, by nature, many-to-one. A good hash function has >few collisions for values that frequently appear; the program will preform >very poorly if many inputs hash to the same value. The existing function >will make all values over max(size_t) collide (assuming the cast clips). >Using only the mantissa is better, but if doubles are used to >represent fixed-point, then common values like 0.25, 0.5, 1, 2, 4 might >all be in the hash table, and they will collide. > >You could do frexp, scale the mantissa to form an integer (e.g. multiply >by a large integer), then add it (modulo 2**n) to some prime number >multiplied by the exponent. This should give good spreading for both >large values and exactly representable values. > Indeed, I can find around also more sophisticated solutions similar to your proposal, perhaps in the python library?!? I'm tempted to go along this way, but Gaby's idea also seems nice, opinions about it? We have to make a choice... or we can provide both and the user chooses... still we have to select a default ;)
Thanks