J-HowHuang opened a new pull request, #15618:
URL: https://github.com/apache/pinot/pull/15618

   ## Description
   Choosing a suitable timeout (`externalViewStabilizationTimeoutInMs`) is 
usually difficult. Consequently, rebalance jobs on some larger tables fail 
because they take longer than normal to finish, and they then need to be 
manually re-triggered.
   
   ## Change in PR
   This PR tracks the number of segments that remain to be processed in the 
current EV-IS convergence and checks this number each time the timeout is 
reached. If the number is lower than at the previous check, a new timeout 
window is granted to continue the EV-IS convergence; otherwise the timeout 
exception is thrown, as it is today.
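
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `ProgressAwareDeadline` and its method `mayContinue` are hypothetical names, and the sketch assumes the first comparison is made against the count recorded when the wait starts.

```java
/**
 * Hypothetical sketch of a progress-aware timeout. It is NOT the actual
 * TableRebalancer logic; all names here are invented for illustration.
 */
final class ProgressAwareDeadline {
  private final long timeoutMs;
  private long deadlineMs;
  private int remainingAtLastCheck;

  /**
   * @param nowMs            current time in milliseconds
   * @param timeoutMs        length of one timeout window
   * @param initialRemaining segments remaining when the wait starts (assumed baseline)
   */
  ProgressAwareDeadline(long nowMs, long timeoutMs, int initialRemaining) {
    this.timeoutMs = timeoutMs;
    this.deadlineMs = nowMs + timeoutMs;
    this.remainingAtLastCheck = initialRemaining;
  }

  /**
   * Returns true if the caller may keep waiting for convergence, and false if
   * the caller should give up (for example by throwing a timeout exception).
   */
  boolean mayContinue(long nowMs, int remainingNow) {
    if (nowMs < deadlineMs) {
      // Still inside the current timeout window; nothing to decide yet.
      return true;
    }
    if (remainingNow < remainingAtLastCheck) {
      // Progress was made since the previous check: grant a fresh window.
      remainingAtLastCheck = remainingNow;
      deadlineMs = nowMs + timeoutMs;
      return true;
    }
    // The timeout was reached and the remaining count did not decrease.
    return false;
  }
}
```

A caller would poll the remaining-segment count in its wait loop and stop waiting once `mayContinue` returns false. Passing the clock value in as an argument keeps the decision deterministic and easy to test.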
   
   For jobs with `lowDiskMode=true`, the number is the sum of the remaining 
segments to be added and to be deleted. For `lowDiskMode=false`, it is the 
number of remaining segments to be added, as the convergence check only looks 
for these segments.
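
As a rough illustration of the two counting modes, the sketch below models the ideal state and the external view as maps from segment name to an instance-to-state map. The class `RemainingSegmentsCounter`, the per-segment granularity, and the exact definitions of "to be added" and "to be deleted" are assumptions made for this example; the PR's actual bookkeeping may differ.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical sketch; names and counting rules are assumptions, not the PR's code. */
final class RemainingSegmentsCounter {
  private RemainingSegmentsCounter() {
  }

  /**
   * Counts segments that still need work before the external view (EV)
   * matches the ideal state (IS).
   *
   * Assumed rules: a segment is "to be added" if any instance assigned in IS
   * is missing from EV or is in a different state there; a segment is
   * "to be deleted" if EV still lists an instance that IS no longer assigns.
   */
  static int countRemaining(Map<String, Map<String, String>> idealState,
      Map<String, Map<String, String>> externalView, boolean lowDiskMode) {
    int toAdd = 0;
    for (Map.Entry<String, Map<String, String>> segment : idealState.entrySet()) {
      Map<String, String> evStates =
          externalView.getOrDefault(segment.getKey(), Collections.emptyMap());
      for (Map.Entry<String, String> target : segment.getValue().entrySet()) {
        if (!target.getValue().equals(evStates.get(target.getKey()))) {
          toAdd++;
          break; // count each segment at most once
        }
      }
    }
    if (!lowDiskMode) {
      // Only additions are considered when lowDiskMode is false.
      return toAdd;
    }
    int toDelete = 0;
    for (Map.Entry<String, Map<String, String>> segment : externalView.entrySet()) {
      Map<String, String> isStates =
          idealState.getOrDefault(segment.getKey(), Collections.emptyMap());
      if (!isStates.keySet().containsAll(segment.getValue().keySet())) {
        toDelete++;
      }
    }
    return toAdd + toDelete;
  }
}
```

In this model, a segment that is still waiting for a new replica and still hosted on an old instance contributes once to each count when `lowDiskMode` is true.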
   
   ## Issue
   This change only applies to rebalance jobs triggered from the controller 
API. Other rebalance jobs, such as the periodic `segmentRelocator`, do not have 
a `ZkBasedTableRebalanceObserver` passed to the `TableRebalancer`, so no 
progress is tracked and the dynamic timeout cannot be enabled.



