On Tue, Jul 05, 2005 at 07:05:16PM +0200, Paolo Carlini wrote: > Michael Veksler wrote: > > > std::tr1::hash<dobule> is implemented in a very bad way. > > it casts double to size_t, which of course does a very poor job on big > > values (is the result of 1.0e100 cast to size_t defined ?). > > > > > A possible solution would be using frexp & co to extract the mantissa > and then work on it, one way or the other. You can find it proposed > around. Then, often the exponent is simply discarded. > > What do you think?
Close, but not quite. Hash functions are, by nature, many-to-one. A good hash function has few collisions for values that frequently appear; the program will preform very poorly if many inputs hash to the same value. The existing function will make all values over max(size_t) collide (assuming the cast clips). Using only the mantissa is better, but if doubles are used to represent fixed-point, then common values like 0.25, 0.5, 1, 2, 4 might all be in the hash table, and they will collide. You could do frexp, scale the mantissa to form an integer (e.g. multiply by a large integer), then add it (modulo 2**n) to some prime number multiplied by the exponent. This should give good spreading for both large values and exactly representable values.