Sorry, I haven't had time to get to the PRs recently. Either way, it's a little complicated; there's not much you can do about that while scaling data structures.
On Sun, 10 Aug 2014, Zhiwei Chan wrote:

> I also thought of the deadlock issue. It is not difficult to deal with, as
> you have pointed out the solution, which is called the "pessimistic
> locking strategy" in database systems. It is safe and a little slow, but
> should still be faster than the global lock. In fact, I don't think the
> speed of expansion is very important. I think the global lock is a little
> dangerous, as I explained in issue 370 and reproduced at .20 in pull #73
> (maybe you have not had a chance to look at it yet), and I also think it
> is difficult to understand (I had to spend a lot of time working out how
> and why the global lock works).
> I think this is another option for solving issue 370, and it makes the
> code clearer.
>
> On Monday, August 11, 2014 12:38:43 PM UTC+8, Dormando wrote:
>
> > > I think we can expand the hash table without a "global lock", and I
> > > am going to explain why (I hope my English is good enough to make it
> > > clear :) ).
> > >
> > > As the current hash expansion algorithm works, expanding the table
> > > from hashpower to hashpower+1 means moving
> > > old_bucket[0]..old_bucket[2**hashpower - 1] into
> > > new_bucket[0]..new_bucket[2**(hashpower+1) - 1] (2**x is pow(2, x)).
> > >
> > > All the items in old_bucket[i] have only two possible destinations in
> > > new_bucket, because the old bucket id is
> > > i = hash(key) & hashmask(hashpower) and the new bucket id is
> > > new_i = hash(key) & hashmask(hashpower+1). The wider mask just adds
> > > one more (highest) bit to i, so new_i = i + (0 or 1) * 2**hashpower;
> > > that is, new_i = i or new_i = i + 2**hashpower.
> > >
> > > For every bucket in the hash table being expanded, we only have to
> > > take two locks: item_locks[i % item_lock_count] and
> > > item_locks[(i + 2**hashpower) % item_lock_count]. It seems no
> > > "global lock" is needed.
> > >
> > > Tell me if I have made a mistake; any suggestion is appreciated.
> > >
> > > References:
> > > http://en.wikipedia.org/wiki/Linear_hashing
> > > http://en.wikipedia.org/wiki/Extendible_hashing
> >
> > Normally this wouldn't work because of deadlocks; you can't guarantee
> > that two threads won't grab each other's locks.
> >
> > If thread A tries to grab locks 1 and 2, and thread B tries to grab
> > locks 2 and 1 (because the whole calculation is a modulus against
> > item_lock_count), they can deadlock.
> >
> > The same deadlock exists even if the item locks and the hash buckets
> > were expanded at the same time and were the same size, since two
> > different keys can occupy different buckets at hashpower 20 and then
> > the same bucket at hashpower 21. They'd still deadlock.
> >
> > Since, as of a few versions ago, item_locks is sized by
> > item_lock_count instead of (stupidly, sorry) hashpower, you could
> > maybe change the logic while under expansion to do trylock() instead
> > of lock(). Sleep for random nanoseconds then try again if you fail to
> > acquire the second lock? That would make the "unlikely" case of a
> > collision resolve itself with a little slowdown, perhaps.
> >
> > However, hash table expansion happens pretty quickly, and people with
> > sensitive loads should still use the presize argument. It'd be nice to
> > be rid of the global lock dance, but I've definitely been putting it
> > off in favor of work that's easier to reason with :)
> >
> > So, yeah, maybe something like that.
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "memcached" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
