yashmayya opened a new pull request, #15990: URL: https://github.com/apache/pinot/pull/15990
- Fixes https://github.com/apache/pinot/issues/15683.
- Adds a `TableRebalanceManager` as the sole entry point for table rebalances inside the controller. Before starting a rebalance for a table, we check ZK metadata for any in-progress rebalance jobs for the same table. If one is found, the rebalance request is rejected.
- The only rebalance entry point that bypasses the table rebalance manager is the `RebalanceTableCommand` / `PinotTableRebalancer` CLI tool, since it runs outside the controller context. This component has nevertheless been refactored to store progress stats in ZK and to disallow concurrent rebalances (previously, rebalance jobs triggered via this mechanism did not track progress stats in ZK).
- This patch also ensures that rebalances run on the thread pool configured by the controller config [controller.executor.rebalance.numThreads](https://github.com/apache/pinot/blob/c15440466a0032c5f74e55940792fb16cd719760/pinot-controller/src/main/java/org/apache/pinot/controller/ControllerConf.java#L302). Previously, this thread pool was used only for tenant rebalances, and user-triggered table rebalances ran on the thread pool configured by `controller.executor.numThreads`, which is not the pool meant for rebalances.
- The only rebalances that run outside this thread pool are the system-initiated ones from the periodic controller job `SegmentRelocator`. Note that these rebalances do not currently track progress stats in ZK, so users can still initiate a rebalance while one triggered by the `SegmentRelocator` is ongoing. The reverse is prevented, however: a `SegmentRelocator`-triggered rebalance will fail if a user-initiated rebalance is ongoing, and the next run of `SegmentRelocator` after the existing rebalance completes will succeed.
- One important thing to note is that if multiple rebalance requests are issued simultaneously for a table, it is still possible to end up with more than one active rebalance job for that table. This is because the progress stats are written to ZK only after some pre-checks are done, which leaves a window between the in-progress check and the write in which a second request can pass the check. Rebalances are idempotent, however, so we aren't too worried about handling this edge case. We just want to ensure that in most regular scenarios, users are prevented from initiating a rebalance for a table that is already undergoing one.
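
The check-then-start flow described in the points above can be sketched roughly as follows. This is a minimal illustration, not the actual `TableRebalanceManager` code: the class `RebalanceGuardSketch`, its method names, and the in-memory map standing in for the ZK job metadata are all hypothetical, and the fixed-size pool merely plays the role of the executor sized by `controller.executor.rebalance.numThreads`.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; names and structure are assumptions, not Pinot's real API.
class RebalanceGuardSketch {
  // Stand-in for rebalance job metadata in ZK: table name -> in-progress job id.
  private final Map<String, String> inProgressJobs = new ConcurrentHashMap<>();
  // Stand-in for the dedicated rebalance thread pool.
  private final ExecutorService rebalanceExecutor;

  RebalanceGuardSketch(int numRebalanceThreads) {
    rebalanceExecutor = Executors.newFixedThreadPool(numRebalanceThreads);
  }

  Future<String> rebalanceAsync(String tableName, Runnable rebalanceWork) {
    // Check: reject the request if a job is already recorded for this table.
    if (inProgressJobs.containsKey(tableName)) {
      throw new IllegalStateException("Rebalance already in progress for table: " + tableName);
    }
    // Act: record the job. The check and the write are deliberately separate
    // steps here, mirroring the race window described above: two simultaneous
    // requests can both pass the check before either one records its job.
    String jobId = UUID.randomUUID().toString();
    inProgressJobs.put(tableName, jobId);
    // Run the rebalance on the dedicated pool, not on the caller's thread.
    return rebalanceExecutor.submit(() -> {
      try {
        rebalanceWork.run();
        return jobId;
      } finally {
        // Clear the marker only if it still belongs to this job.
        inProgressJobs.remove(tableName, jobId);
      }
    });
  }

  void shutdown() {
    rebalanceExecutor.shutdown();
  }
}
```

In this in-memory sketch, a single atomic `putIfAbsent` would close the window entirely; the sketch keeps the two steps apart only to show why the race exists when the in-progress marker is written after separate pre-checks.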