Re: Load balancing issue with virtual nodes

2014-04-29 Thread DuyHai Doan
Thank you, Ben, for the links. On Tue, Apr 29, 2014 at 3:40 AM, Ben Bromhead wrote: > Some imbalance is expected and considered normal: > > See http://wiki.apache.org/cassandra/VirtualNodes/Balance > > As well as > > https://issues.apache.org/jira/browse/CASSANDRA-7032 > > Ben Bromhead > Instac

Re: Load balancing issue with virtual nodes

2014-04-28 Thread Ben Bromhead
Some imbalance is expected and considered normal: See http://wiki.apache.org/cassandra/VirtualNodes/Balance As well as https://issues.apache.org/jira/browse/CASSANDRA-7032 Ben Bromhead Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359 On 29 Apr 2014, at 7:30 am, DuyHai Doan w
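
To illustrate why some imbalance is expected, here is a minimal simulation sketch. It is not Cassandra code and rests on several assumptions: each node picks its vnode tokens uniformly at random, the token range is the signed 64-bit range used by Murmur3Partitioner, and a token owns the range from the previous token (exclusive) up to itself (inclusive). The function name `ownership` and the value of 256 vnodes per node are illustrative choices.

```python
import random

# Assumed token range: signed 64-bit, as used by Murmur3Partitioner.
TOKEN_MIN = -2**63
TOKEN_MAX = 2**63 - 1
RING_SIZE = 2**64


def ownership(num_nodes, vnodes_per_node, seed=0):
    """Return the fraction of the ring owned by each node.

    Hypothetical helper: tokens are drawn uniformly at random, which is
    an assumption about how vnode tokens are assigned.
    """
    rng = random.Random(seed)
    tokens = []
    for node in range(num_nodes):
        for _ in range(vnodes_per_node):
            tokens.append((rng.randint(TOKEN_MIN, TOKEN_MAX), node))
    tokens.sort()

    owned = [0] * num_nodes
    for i, (token, node) in enumerate(tokens):
        # tokens[i - 1] wraps to the last token when i == 0,
        # so the modulo handles the wrap-around range of the ring.
        previous = tokens[i - 1][0]
        owned[node] += (token - previous) % RING_SIZE
    return [share / RING_SIZE for share in owned]


if __name__ == "__main__":
    for seed in range(5):
        shares = ownership(num_nodes=2, vnodes_per_node=256, seed=seed)
        print(seed, ["%.1f%%" % (100 * s) for s in shares])
```

Under these assumptions the two shares typically land near 50% but are rarely exactly equal, which fits the point that a small deviation is normal.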

Re: Load balancing issue with virtual nodes

2014-04-28 Thread DuyHai Doan
Hello all. An update on the issue: after completely wiping all sstable/commitlog/saved_caches folders and restarting the cluster from scratch, we still see odd figures. After the restart, nodetool status does not show an exact 50% balance of data for each node: Status=Up/Down |/

Re: Load balancing issue with virtual nodes

2014-04-24 Thread Batranut Bogdan
I don't know about Hector, but the DataStax Java driver needs just one IP from the cluster and it will discover the rest of the nodes. Then by default it will round-robin when sending requests. So if Hector does the same, the pattern will appear again. Did you look at the size of the dirs? T
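
As a rough illustration of the round-robin behaviour described above, here is a minimal sketch. It is not the driver's or Hector's API; `round_robin` and the host names are hypothetical, and real policies also handle node discovery and failures.

```python
from itertools import cycle


def round_robin(hosts):
    """Return an endless iterator that yields hosts in turn (hypothetical helper)."""
    return cycle(list(hosts))


# Each request goes to the next host in the list, so with two hosts
# the coordinator role alternates between them.
picker = round_robin(["node1", "node2"])
first_four = [next(picker) for _ in range(4)]
```

If the client really alternates like this, both nodes should receive a similar number of coordinator requests, so a persistent CPU skew would have to come from somewhere else.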

Re: Load balancing issue with virtual nodes

2014-04-24 Thread DuyHai Doan
I did some experiments. Let's say we have node1 and node2. First, I configured Hector with node1 & node2 as hosts and I saw that only node1 has high CPU load. To eliminate the "client connection" issue, I re-tested with only node2 provided as host for Hector. Same pattern. CPU load is above 50% on

Re: Load balancing issue with virtual nodes

2014-04-24 Thread Batranut Bogdan
htop is not the only tool for this. Cassandra will hit I/O bottlenecks before CPU (on faster CPUs). A simple solution is to check the size of the data dir on the boxes. If you have approximately the same size, then Cassandra is writing to the whole cluster. Check how the data dir size changes when impor
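
A minimal sketch of the data-directory check suggested above, to be run on each node and compared by hand. The helper name `dir_size_bytes` is hypothetical, and `/var/lib/cassandra/data` is only the common default location; the actual path comes from `data_file_directories` in cassandra.yaml.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (hypothetical helper)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if os.path.isfile(full):
                    total += os.path.getsize(full)
            except OSError:
                # Files can disappear mid-walk (e.g. during compaction); skip them.
                pass
    return total


if __name__ == "__main__":
    # Assumed default data directory; adjust to your installation.
    data_dir = "/var/lib/cassandra/data"
    print("%s: %.1f MiB" % (data_dir, dir_size_bytes(data_dir) / 2**20))
```

Running it before and after an import on both nodes shows whether the two data directories grow at a similar rate.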

Re: Load balancing issue with virtual nodes

2014-04-24 Thread Michael Shuler
On 04/24/2014 10:29 AM, DuyHai Doan wrote: Client used = Hector 1.1-4 Default Load Balancing connection policy Both nodes addresses are provided to Hector so according to its connection policy, the client should switch alternatively between both nodes OK, so is only one connection being e

Re: Load balancing issue with virtual nodes

2014-04-24 Thread DuyHai Doan
Hello Michael. RF = 1. Client used = Hector 1.1-4. Default Load Balancing connection policy. Both nodes' addresses are provided to Hector, so according to its connection policy, the client should alternate between both nodes. Regards, Duy Hai DOAN On Thu, Apr 24, 2014 at 4:37 PM, Mic

Re: Load balancing issue with virtual nodes

2014-04-24 Thread Michael Shuler
On 04/24/2014 09:14 AM, DuyHai Doan wrote: My customer has a cluster with 2 nodes only. I've set virtual nodes so future addition of new nodes will be easy. with RF=? Now, after some benching tests with massive data insert, I can see with "htop" that one node has its CPU occupation up to

Load balancing issue with virtual nodes

2014-04-24 Thread DuyHai Doan
Hello all, I'm facing a rather weird issue with virtual nodes. My customer has a cluster with 2 nodes only. I've set virtual nodes so future addition of new nodes will be easy. Now, after some benchmark tests with massive data inserts, I can see with "htop" that one node has its CPU occupation u