somandal opened a new pull request, #15681:
URL: https://github.com/apache/pinot/pull/15681

   Today when `SegmentRelocator` runs, it issues a table rebalance request for 
each table without checking whether the last rebalance it had issued completed 
or not. For small rebalances that move a few segments, this is usually okay, 
since we expect that the previous rebalance triggered by `SegmentRelocator` 
completes quickly. Sometimes, however, a large rebalance is issued, or a 
rebalance takes a long time to complete for other reasons. In such cases, the 
`SegmentRelocator` should avoid issuing a new table rebalance request for the 
given table.
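
   To illustrate the intended behavior, here is a minimal sketch of a per-table 
in-flight guard. It is not the code in this PR: `InFlightRebalanceTracker`, 
`submitIfIdle`, and the use of a plain `Executor` are hypothetical names and 
assumptions made for the example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Hypothetical helper (not the PR's implementation): remembers which tables
// still have a relocator-issued rebalance running.
final class InFlightRebalanceTracker {
  private final Set<String> _inFlightTables = ConcurrentHashMap.newKeySet();

  /**
   * Submits the rebalance only if no earlier one is still running for the table.
   * Returns true if the rebalance was submitted, false if it was skipped.
   */
  boolean submitIfIdle(String tableNameWithType, Runnable rebalance, Executor executor) {
    // Set.add is atomic here: it returns false if the table is already tracked.
    if (!_inFlightTables.add(tableNameWithType)) {
      return false;
    }
    try {
      executor.execute(() -> {
        try {
          rebalance.run();
        } finally {
          // Clear the entry on success and on failure, so later runs are not blocked.
          _inFlightTables.remove(tableNameWithType);
        }
      });
    } catch (RuntimeException e) {
      // The executor refused the task, so nothing is actually in flight.
      _inFlightTables.remove(tableNameWithType);
      throw e;
    }
    return true;
  }

  boolean isInFlight(String tableNameWithType) {
    return _inFlightTables.contains(tableNameWithType);
  }
}
```

   A periodic task could call `submitIfIdle` once per table on each run and 
skip the table when it returns `false`. Clearing the entry in the `finally` 
block matters: otherwise a failed rebalance would block every later run for 
that table.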
   
   We saw an issue where a long table rebalance started by `SegmentRelocator` 
took multiple hours to finish. Despite that, a new table rebalance job was 
created every hour, and each new job would end up running in parallel for the 
same table. This adds CPU load to the controllers, as each table rebalance 
loops and runs an EV-IS (ExternalView-IdealState) convergence check.
   
   **Note:** this PR does not address scenarios where a long-running table 
rebalance is triggered outside of `SegmentRelocator` and `SegmentRelocator` 
then creates a new table rebalance request for that table. It only addresses 
rebalances triggered by `SegmentRelocator`. Addressing this across all 
rebalances needs a separate design, since today we allow multiple rebalances to 
run in parallel for a given table and expect idempotent results.
   
   One low-hanging fruit might be to have the `SegmentRelocator` check ZK for 
any user-issued rebalance jobs that are IN_PROGRESS before creating a new one. 
If this is a good idea, I can open a new PR to address this separately.
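
   As a rough sketch of that follow-up idea, assuming the job metadata has 
already been read from ZK into memory: the `JobMetadata` class, its fields, and 
every status other than IN_PROGRESS are hypothetical stand-ins, not Pinot's 
actual classes.

```java
import java.util.List;

// Hypothetical model of rebalance job metadata; only IN_PROGRESS comes from
// the description above, the other names are assumptions for illustration.
final class RebalanceJobCheck {
  enum JobStatus { IN_PROGRESS, DONE, FAILED }

  static final class JobMetadata {
    final String tableNameWithType;
    final JobStatus status;

    JobMetadata(String tableNameWithType, JobStatus status) {
      this.tableNameWithType = tableNameWithType;
      this.status = status;
    }
  }

  /** Returns true if any job for the given table is still IN_PROGRESS. */
  static boolean hasInProgressJob(String tableNameWithType, List<JobMetadata> jobs) {
    return jobs.stream().anyMatch(job ->
        job.tableNameWithType.equals(tableNameWithType) && job.status == JobStatus.IN_PROGRESS);
  }

  private RebalanceJobCheck() {
  }
}
```

   With a check like this, the relocator could skip a table whenever 
`hasInProgressJob` returns `true`, regardless of who started the job.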
   
   **Testing:**
   - Manually tested with a short run frequency of `SegmentRelocator` to ensure 
that if it triggers a rebalance, and that rebalance is still running when the 
next run starts, it does not create a new rebalance job for that table. On the 
other hand, if the rebalance job has completed, the next `SegmentRelocator` run 
does create a new rebalance job.
   - Also manually tested single-table mode to ensure nothing breaks there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

