[ 
https://issues.apache.org/jira/browse/HBASE-26023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-26023:
-------------------------------

> tableSkewCostFunction aggregate cost per table incorrectly
> ----------------------------------------------------------
>
>                 Key: HBASE-26023
>                 URL: https://issues.apache.org/jira/browse/HBASE-26023
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Balancer, test
>            Reporter: Clara Xiong
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.6, 2.4.5
>
>
> There is another bug in the original tableSkew cost function, in how it 
> aggregates the cost per table:
> If we have 10 regions, one per table, evenly distributed on 10 nodes, the 
> cost is scaled to 1.0, even though the distribution is perfectly even.
> The more tables we have, the closer the value will be to 1.0. The cost 
> function becomes useless.
> All the balancer tests were set up with large numbers of tables with minimal 
> regions per table. This artificially inflates the total cost and triggers 
> balancer runs. With this fix to TableSkewCostFunction, we need to overhaul the 
> tests too. We also need to add tests that reflect more diversified scenarios 
> for table distribution such as large tables with large numbers of regions.
> {code:java}
> protected double cost() {
>   // Bounds used to normalize the summed value.
>   double max = cluster.numRegions;
>   double min = ((double) cluster.numRegions) / cluster.numServers;
>   // Sum, over all tables, the largest number of regions of that table
>   // hosted on any single server.
>   double value = 0;
>   for (int i = 0; i < cluster.numMaxRegionsPerTable.length; i++) {
>     value += cluster.numMaxRegionsPerTable[i];
>   }
>   LOG.info("min = {}, max = {}, cost= {}", min, max, value);
>   return scale(min, max, value);
> }
> {code}
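
The arithmetic behind the 10-table example in the description can be checked with a minimal standalone sketch. It is not HBase code: the class name TableSkewAggregationDemo and the method summedMaxCost are hypothetical, the cluster fields are replaced by plain parameters, and scale() is assumed to be a simple min-max normalization clamped to [0, 1].

```java
// Hypothetical standalone sketch; it does not use any HBase classes.
class TableSkewAggregationDemo {

    // Assumed stand-in for scale(): min-max normalization clamped to [0, 1].
    static double scale(double min, double max, double value) {
        if (max <= min || value <= min) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (value - min) / (max - min)));
    }

    // Mirrors the aggregation quoted above: sum the per-table maxima, then
    // normalize between numRegions / numServers and numRegions.
    static double summedMaxCost(int numServers, int[] numMaxRegionsPerTable,
                                int numRegions) {
        double max = numRegions;
        double min = ((double) numRegions) / numServers;
        double value = 0;
        for (int perTableMax : numMaxRegionsPerTable) {
            value += perTableMax;
        }
        return scale(min, max, value);
    }

    public static void main(String[] args) {
        // 10 tables, one region each, spread evenly over 10 servers:
        // every table's per-server maximum is 1, so value = 10 = max.
        int[] tenTables = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
        System.out.println(summedMaxCost(10, tenTables, 10)); // prints 1.0

        // 1 table with 10 regions, one per server: value = 1 = min.
        int[] oneTable = {1};
        System.out.println(summedMaxCost(10, oneTable, 10)); // prints 0.0
    }
}
```

Both layouts place exactly one region on each server, yet under these assumptions the first is scored 1.0 and the second 0.0, so the result depends on the number of tables rather than on how evenly the regions are spread.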



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
