AKA Consistent Hashing: http://en.wikipedia.org/wiki/Consistent_hashing
Michael Della Bitta ------------------------------------------------ Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Mon, Oct 8, 2012 at 11:33 AM, Radim Kolar <h...@filez.com> wrote: > Do it as it is done in cassandra database. Adding new node and > redistributing data can be done in live system without problem it looks like > this: > > every cassandra node has key range assigned. instead of assigning keys to > nodes like hash(key) mod nodes, then every node has its portion of hash > keyspace. They do not need to be same, some node can have larger portion of > keyspace then another. > > hash function max possible value is 12. > > shard1 - 1-4 > shard2 - 5-8 > shard3 - 9-12 > > now lets add new shard. In cassandra adding new shard by default cuts > existing one by half, so you will have > shard1 - 1-2 > shard2 3-4 > shard3 5-8 > shard4 9-12 > > see? You needed to move only documents from old shard1. Usually you are > adding more then 1 shard during reorganization, you do not need to rebalance > cluster by moving every node into different position in hash keyspace that > much.