rmdmattingly opened a new pull request, #6651: URL: https://github.com/apache/hbase/pull/6651
Finally, a big PR here. This adds the balancer conditional framework and our first conditional implementation: replica distribution. This is an improvement on existing cost-based replica distribution for reasons that I'll dig into below. See my design doc [here](https://docs.google.com/document/d/1jA8Ghs86v7b-53j5DcsdbPnOXxbHjewkIBFi1E4S1pY/edit?usp=sharing).

You can enable conditional replica distribution by setting `hbase.master.balancer.stochastic.conditionals.distributeReplicas` to `true`.

### Improvements on Replica Balancing

**Primary replica balancing squashes all other considerations**. The default weight for one of the several cost functions that factor into primary replica balancing is 100,000. Meanwhile, the default read request cost is 5. The result is that the load balancer, out of the box, basically doesn't care about balancing actual load. To solve this, you can either set primary replica balancing costs to zero, which is fine if you don't use read replicas, or, if you do use read replicas, maybe you can produce a magic incantation of configurations that works _just_ right, until your needs change. Conditionals provide an alternative that works much more cleanly alongside all of the other considerations that you would like your balancer to have.

**Replica cost functions don't balance secondary replicas effectively**. While they'll calculate the imbalance costs necessary to balance primary replicas away from secondary replicas, the existing cost functions have no sufficient mechanism to distribute secondary replicas appropriately. So using more than 2 replicas on a table has a pretty dubious value proposition. This conditional implementation, on the other hand, will ensure that secondary replicas are distributed to the greatest extent that the cluster allows.
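To make the weight disparity above concrete, here is a minimal sketch of the arithmetic. It assumes a simplified model in which the total cost is a plain weighted sum of per-function imbalance values in [0, 1]; the class `CostWeightSketch` and its method are hypothetical and are not part of HBase or of this PR. Only the two weights (100,000 and 5) come from the description above.

```java
public class CostWeightSketch {
    // Default weights quoted in the PR description.
    public static final double PRIMARY_REPLICA_WEIGHT = 100_000.0;
    public static final double READ_REQUEST_WEIGHT = 5.0;

    /**
     * Hypothetical simplified model: each argument is an imbalance value in [0, 1],
     * and the total cost is the weighted sum of the two.
     */
    public static double totalCost(double replicaImbalance, double readImbalance) {
        return PRIMARY_REPLICA_WEIGHT * replicaImbalance
            + READ_REQUEST_WEIGHT * readImbalance;
    }

    public static void main(String[] args) {
        // Plan A: a 1% primary replica imbalance, perfectly balanced reads.
        double planA = totalCost(0.01, 0.0);
        // Plan B: perfect replica placement, maximally skewed read load.
        double planB = totalCost(0.0, 1.0);

        System.out.println("Plan A cost: " + planA);
        System.out.println("Plan B cost: " + planB);
        System.out.println("Cheaper plan: " + (planB < planA ? "B (skewed reads)" : "A"));
    }
}
```

Under this model, a plan with completely skewed read load still scores far better than one with a 1% primary replica imbalance, which is the sense in which the replica weight crowds out every other consideration.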
Even on a relatively tiny minicluster test, I was unable to demonstrate that cost-based replica balancing could distribute a 3-replica table perfectly:

**…omitting the meaningless snapshots between 4 and 27…**

Meanwhile, conditional-based replica balancing solved this imbalance effectively.

### Testing

I've written a minicluster test to validate that conditional replica balancing works on a small cluster locally, and I've written a larger-scale test that mocks the StochasticLoadBalancer in hbase-balancer. This test validates that conditional balancing performance is acceptable even at a huge scale, even with default balancer costs (which other large-scale cost-based replica balancing tests have had to compromise), and even with strict consideration for secondary replicas.

cc @ndimiduk
