Eric Blake wrote:
...
> the table is certainly exceeding the requested threshold of 20% empty buckets.
> Ouch.

Ouch, indeed!

> In other words, this statement in hash_rehash:
>
>   /* If the growth threshold of the buckets in use has been reached, increase
>      the table size and rehash.  There's no point in checking the number of
>      entries:  if the hashing function is ill-conditioned, rehashing is not
>      likely to improve it.  */
>
> is not quite right.  A hashing function may be somewhat ill-conditioned for 
> one
> bucket size, but much better conditioned for another size.  At any rate, it
> looks like the correct fix is to check for rehashing thresholds based on 
> number
> of entries, not number of buckets, and that this check needs to happen on 
> every
> insertion, not just insertions that land in an empty bucket.  Or, at the very
> least, check for threshold PRIOR to insertion, rather than after, such that in
> the above case with m4, the very next call to hash_insert would notice the
> table was at 100%, and force another resize above 907 buckets.
>
> Thoughts, before I implement something along these lines?

Sounds good.


Reply via email to