Jim Meyering <jim <at> meyering.net> writes: > > /* If the growth threshold of the buckets in use has been reached, increase > > the table size and rehash. There's no point in checking the number of > > entries: if the hashing function is ill-conditioned, rehashing is not > > @@ -1057,6 +1031,32 @@ hash_insert (Hash_table *table, const void *entry) > > } > > } > > You can't use a pre-rehash "bucket" here (after rehash), > since it is no longer valid. > > Considering the cost of rehashing (and its relative infrequency), > one more hash_find_entry won't even show up on the radar.
Good catch. So the simple code motion was a bit too simple; here's the updated patch (and I've now run it through the same m4 test that found the original problem): From: Eric Blake <e...@byu.net> Date: Wed, 17 Jun 2009 13:01:41 -0600 Subject: [PATCH] hash: check for resize before insertion * lib/hash.c (hash_insert): Check whether bucket usage exceeds threshold before insertion, so that a pathological hash_rehash that fills every bucket can still trigger another rehash. Signed-off-by: Eric Blake <e...@byu.net> --- lib/hash.c | 58 +++++++++++++++++++++++++++++++--------------------------- 1 files changed, 31 insertions(+), 27 deletions(-) diff --git a/lib/hash.c b/lib/hash.c index 7d76d45..ed171eb 100644 --- a/lib/hash.c +++ b/lib/hash.c @@ -1,7 +1,7 @@ /* hash - hashing table processing. - Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2007 Free - Software Foundation, Inc. + Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2007, + 2009 Free Software Foundation, Inc. Written by Jim Meyering, 1992. @@ -917,30 +917,6 @@ hash_insert (Hash_table *table, const void *entry) if ((data = hash_find_entry (table, entry, &bucket, false)) != NULL) return data; - /* ENTRY is not matched, it should be inserted. */ - - if (bucket->data) - { - struct hash_entry *new_entry = allocate_entry (table); - - if (new_entry == NULL) - return NULL; - - /* Add ENTRY in the overflow of the bucket. */ - - new_entry->data = (void *) entry; - new_entry->next = bucket->next; - bucket->next = new_entry; - table->n_entries++; - return (void *) entry; - } - - /* Add ENTRY right in the bucket head. */ - - bucket->data = (void *) entry; - table->n_entries++; - table->n_buckets_used++; - /* If the growth threshold of the buckets in use has been reached, increase the table size and rehash. There's no point in checking the number of entries: if the hashing function is ill-conditioned, rehashing is not @@ -967,10 +943,38 @@ hash_insert (Hash_table *table, const void *entry) /* If the rehash fails, arrange to return NULL. */ if (!hash_rehash (table, candidate)) - entry = NULL; + return NULL; + + /* Update the bucket we are interested in. */ + if (hash_find_entry (table, entry, &bucket, false) != NULL) + abort (); } } + /* ENTRY is not matched, it should be inserted. */ + + if (bucket->data) + { + struct hash_entry *new_entry = allocate_entry (table); + + if (new_entry == NULL) + return NULL; + + /* Add ENTRY in the overflow of the bucket. */ + + new_entry->data = (void *) entry; + new_entry->next = bucket->next; + bucket->next = new_entry; + table->n_entries++; + return (void *) entry; + } + + /* Add ENTRY right in the bucket head. */ + + bucket->data = (void *) entry; + table->n_entries++; + table->n_buckets_used++; + return (void *) entry; } -- 1.6.3.2