Load balancing - Order preserving partition - New Keys prob.
Hi all, Thanks you for your comments. I already read some papers about load balancing in P2P which some of them allow range query. During this time, I find another problem: "NEW KEYS" Karger - Simple efficient load balancing algorithms for p2p systems. (support OPP) John Byers - Simple load balancing for distributed hash table. Third: Ananth Rao - Load balancing in structured P2P systems. I think the first paper is the best to get the background. Prefix Hash Tree: An Indexing Data Structure over Distributed Hash Tables and Range queries over DHTs These paper use PHT to allow range query. James Aspnes - Skip Graphs (most recent as I known) and Distributed Balanced Tables, Not making a hash of it All. All of these papers deal with load balancing and range query probs by creating schemes or strategies based on only "load - number of key on each nodes", not care about new keys and highly-accessed keys. HOWEVER, in fact, "the most recent keys are likely accessed more frequently". I suppose, we have 400.000 "NEW KEYS" in 2 recent days (are likely accessed more frequently). --> A scheme: these new keys are uniformly partitioned and divided into some successive nodes. For example, there are 10 node (N1...N10), keys in day 1 will be put into node_1_2, keys in day 2 will be put into node_3_4 and so on... But, the prob is that number of keys and nodes increase/decrease day by day (not fixed) --> the number of node used to store keys in each day may increase/decrease (1 or 3 node for example). Does any one have any ideas or know papers to deal with this prob (new keys and highly-accessed keys)? Thanks a lot.
Re: when will cassandra 0.7 be realeased?
I don't recommend using the betas for anything but testing. We should see 0.7 final in October. On Wed, Sep 15, 2010 at 10:27 PM, Chen Xinli wrote: > Hi, > > We are going to use cassandra in our production env, and want to use the > feature defining keyspace on the fly. > When the 0.7 version will be released? or just beta is ok? > > Thanks. > > -- > Best Regards, > Chen Xinli > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Load balancing - Order preserving partition - New Keys prob.
If new keys don't follow roughly the same distribution as the ones you already have, then yes, you're going to be load balancing a lot. On Thu, Sep 16, 2010 at 2:24 AM, Hien. To Trong wrote: > Hi all, > Thanks you for your comments. > I already read some papers about load balancing in P2P which some of them > allow range query. > During this time, I find another problem: "NEW KEYS" > > Karger - Simple efficient load balancing algorithms for p2p systems. > (support OPP) > John Byers - Simple load balancing for distributed hash table. > Third: Ananth Rao - Load balancing in structured P2P systems. > I think the first paper is the best to get the background. > > Prefix Hash Tree: An Indexing Data Structure over Distributed Hash Tables > and Range queries over DHTs > These paper use PHT to allow range query. > > James Aspnes - Skip Graphs (most recent as I known) > and Distributed Balanced Tables, Not making a hash of it All. > > All of these papers deal with load balancing and range query probs by > creating schemes or strategies > based on only "load - number of key on each nodes", not care about new keys > and highly-accessed keys. > HOWEVER, in fact, "the most recent keys are likely accessed more > frequently". > > I suppose, we have 400.000 "NEW KEYS" in 2 recent days (are likely accessed > more frequently). --> A scheme: these new keys are uniformly partitioned and > divided into some successive nodes. For example, there are 10 node > (N1...N10), keys in day 1 will be put into node_1_2, keys in day 2 will be > put into node_3_4 and so on... But, the prob is that number of keys and > nodes increase/decrease day by day (not fixed) --> the number of node used > to store keys in each day may increase/decrease (1 or 3 node for example). > > Does any one have any ideas or know papers to deal with this prob (new keys > and highly-accessed keys)? > > Thanks a lot. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: when will cassandra 0.7 be realeased?
Thanks Jonathan. I can do testing with beta, then upgrade to final version in Oct. 2010/9/17 Jonathan Ellis > I don't recommend using the betas for anything but testing. > > We should see 0.7 final in October. > > On Wed, Sep 15, 2010 at 10:27 PM, Chen Xinli wrote: > > Hi, > > > > We are going to use cassandra in our production env, and want to use the > > feature defining keyspace on the fly. > > When the 0.7 version will be released? or just beta is ok? > > > > Thanks. > > > > -- > > Best Regards, > > Chen Xinli > > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of Riptano, the source for professional Cassandra support > http://riptano.com > -- Best Regards, Chen Xinli