Eric Blake wrote: ... > the table is certainly exceeding the requested threshold of 20% empty buckets. > Ouch.
Ouch, indeed! > In other words, this statement in hash_rehash: > > /* If the growth threshold of the buckets in use has been reached, increase > the table size and rehash. There's no point in checking the number of > entries: if the hashing function is ill-conditioned, rehashing is not > likely to improve it. */ > > is not quite right. A hashing function may be somewhat ill-conditioned for > one > bucket size, but much better conditioned for another size. At any rate, it > looks like the correct fix is to check for rehashing thresholds based on > number > of entries, not number of buckets, and that this check needs to happen on > every > insertion, not just insertions that land in an empty bucket. Or, at the very > least, check for threshold PRIOR to insertion, rather than after, such that in > the above case with m4, the very next call to hash_insert would notice the > table was at 100%, and force another resize above 907 buckets. > > Thoughts, before I implement something along these lines? Sounds good.