[ https://issues.apache.org/jira/browse/HBASE-26023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang reopened HBASE-26023:
-------------------------------

> tableSkewCostFunction aggregate cost per table incorrectly
> ----------------------------------------------------------
>
>                 Key: HBASE-26023
>                 URL: https://issues.apache.org/jira/browse/HBASE-26023
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Balancer, test
>            Reporter: Clara Xiong
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.6, 2.4.5
>
>
> There is another bug in the original tableSkew cost function, in how it
> aggregates the cost per table:
> If we have 10 regions, one per table, evenly distributed on 10 nodes, the
> cost is scaled to 1.0.
> The more tables we have, the closer the value will be to 1.0. The cost
> function becomes useless.
> All the balancer tests were set up with large numbers of tables with minimal
> regions per table. This artificially inflates the total cost and triggers
> balancer runs. With this fix on TableSkewFunction, we need to overhaul the
> tests too. We also need to add tests that reflect more diversified scenarios
> for table distribution, such as large tables with large numbers of regions.
> {code:java}
> protected double cost() {
>   double max = cluster.numRegions;
>   double min = ((double) cluster.numRegions) / cluster.numServers;
>   double value = 0;
>   for (int i = 0; i < cluster.numMaxRegionsPerTable.length; i++) {
>     value += cluster.numMaxRegionsPerTable[i];
>   }
>   LOG.info("min = {}, max = {}, cost= {}", min, max, value);
>   return scale(min, max, value);
> }
> }{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
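
The arithmetic behind the 10-table example in the description can be checked with a small standalone sketch. It does not use HBase classes: `TableSkewCostSketch`, `aggregateCost`, and the simplified `scale` below are hypothetical stand-ins, and `scale` is assumed to map a value linearly from [min, max] onto [0, 1] with clamping. Only the aggregation loop mirrors the quoted `cost()` method.

```java
import java.util.Arrays;

// Standalone sketch, not HBase code: names here are illustrative assumptions.
class TableSkewCostSketch {

    // Assumed behaviour of scale(): linear map of value from [min, max] to [0, 1],
    // clamped, and 0 when the range is empty or value is at or below min.
    static double scale(double min, double max, double value) {
        if (max <= min || value <= min) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (value - min) / (max - min)));
    }

    // Mirrors the aggregation quoted in the issue: sum, over all tables, of the
    // largest number of that table's regions hosted on any single server.
    static double aggregateCost(int numRegions, int numServers, int[] numMaxRegionsPerTable) {
        double max = numRegions;
        double min = ((double) numRegions) / numServers;
        double value = 0;
        for (int maxOnOneServer : numMaxRegionsPerTable) {
            value += maxOnOneServer;
        }
        return scale(min, max, value);
    }

    public static void main(String[] args) {
        // 10 tables with one region each, evenly spread over 10 servers:
        // min = 1, max = 10, value = 10, so the cost is (10 - 1) / (10 - 1).
        int[] oneRegionPerTable = new int[10];
        Arrays.fill(oneRegionPerTable, 1);
        System.out.println(aggregateCost(10, 10, oneRegionPerTable)); // prints 1.0

        // One table with 10 regions, evenly spread over 10 servers:
        // min = 1, value = 1, so the cost is 0.
        System.out.println(aggregateCost(10, 10, new int[] {1})); // prints 0.0

        // 50 tables with two regions each on 10 servers, no server holding both
        // regions of a table: value = 50, min = 10, max = 100, cost is about 0.44.
        int[] twoRegionsPerTable = new int[50];
        Arrays.fill(twoRegionsPerTable, 1);
        System.out.println(aggregateCost(100, 10, twoRegionsPerTable));
    }
}
```

All three layouts are perfectly balanced per table, yet under these assumptions the reported cost climbs from 0 toward 1.0 as the same cluster is split into more tables, which matches the behaviour the description calls out.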