somandal commented on PR #16341:
URL: https://github.com/apache/pinot/pull/16341#issuecomment-3084962096

   > @somandal The `downloadUrl` change could only go from empty (peer download implied) to a deep store URL, right? If we check all the segments once before the rebalance starts, instead of at every step when the next assignment is computed, we might get false positives for `hasDataLossRisk` but no false negatives. That should be acceptable?
   > 
   > I don't know how much this would help performance, but it would reduce the ZK access to a single pass. That said, if the overhead of the ZK metadata access per segment is not significant, it shouldn't be a problem either way.
   > 
   > Also, do we need a `forceDowntime` flag if we already warned the user via the pre-check? IMO if they ran a non-dry-run, they have already accepted the risk of doing so, so this second safety gate seems a bit redundant to me. Plus we have the `DataLossRiskAssessor` to prevent risk, so setting `forceDowntime` to `true` doesn't even take the rebalance operation from safe to unsafe.
   
   **Regarding ZK access and performance:**
   
   My bigger concern is not `downloadUrl` but segment status. Given that a rebalance can take hours to run, a segment could start off in CONSUMING state but get committed partway through, or move from COMMITTING to DONE, etc. For those cases, IMO caching the metadata before the rebalance starts is not a good idea, as we might miss some data loss scenarios (say, the segment commit protocol starts but the segment build or upload fails).
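   
   To make the risk concrete, here's a rough sketch of the kind of per-step check I have in mind. All the names here (`SegmentStatus`, `SegmentState`, `hasDataLossRisk`) are illustrative stand-ins, not the actual classes in this PR:
   
   ```java
   // Illustrative sketch only -- simplified types, not Pinot's real classes.
   enum SegmentStatus { IN_PROGRESS, COMMITTING, DONE, UPLOADED }
   
   record SegmentState(SegmentStatus status, String downloadUrl) {}
   
   final class DataLossCheckSketch {
     // Called with a *fresh* ZK read at each rebalance step, so transitions
     // that happen mid-rebalance (CONSUMING -> COMMITTING -> DONE) are
     // visible. A snapshot cached before the rebalance started would miss
     // them.
     static boolean hasDataLossRisk(SegmentState latest) {
       boolean noDeepStoreCopy =
           latest.downloadUrl() == null || latest.downloadUrl().isEmpty();
       // With no deep store copy, the only recoverable replicas live on
       // peers; taking all of them down at once risks losing the segment,
       // e.g. if the commit's build or upload subsequently fails.
       return noDeepStoreCopy && latest.status() != SegmentStatus.IN_PROGRESS;
     }
   }
   ```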
   
   What I can explore (and plan to do right away) is whether I can avoid doing this check for every segment, and instead only run it for segments whose target assignment has changed. Will explore this and update the code if it works out; roughly what I have in mind is sketched below.
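   
   Hand-wavy sketch (the map shapes mirror an IdealState-style segment -> instance -> state assignment, but the names are made up):
   
   ```java
   import java.util.HashSet;
   import java.util.Map;
   import java.util.Set;
   
   final class ChangedSegmentsSketch {
     // Only segments whose target assignment differs from the current one
     // need the per-segment ZK metadata check -- segments that keep all
     // their replicas in this step can't newly hit the data loss scenario.
     static Set<String> segmentsToCheck(
         Map<String, Map<String, String>> currentAssignment,
         Map<String, Map<String, String>> targetAssignment) {
       Set<String> changed = new HashSet<>();
       targetAssignment.forEach((segment, target) -> {
         // A segment missing from the current assignment is also "changed"
         if (!target.equals(currentAssignment.get(segment))) {
           changed.add(segment);
         }
       });
       return changed;
     }
   }
   ```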
   
   Let me know your thoughts on this @J-HowHuang, and whether my explanation of the segment status transitions makes sense as the reason I'm wary of checking the ZK metadata only once.
   
   **Regarding `forceDowntime` flag:**
   
   Yeah, I see your point. My main worry is that people might skip the pre-checks (since they aren't mandatory) and directly run the rebalance. In that scenario, this is an additional safeguard.
   
   Option 1 -
   I was also wondering if the following makes sense (rough sketch after the list):
   - Remove `forceDowntime` and all checks that disallow rebalance of peer-download tables with `downtime=true` or `minAvailableReplicas=0`. Instead, rely on the rebalance failing if we hit a potential data loss scenario
   - Utilize the `bestEfforts` flag (or add a new flag just for this peer-download failure scenario) to log a warning and skip failing the rebalance, for users who are okay with the data loss risk
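   
   Sketch of Option 1 -- `bestEfforts` is the existing rebalance config flag, but the assessor interface and method names here are hypothetical scaffolding:
   
   ```java
   final class Option1Sketch {
     interface RiskAssessor { boolean hasRisk(String segment); }
   
     static void checkOrFail(RiskAssessor assessor, String segment, boolean bestEfforts) {
       if (assessor.hasRisk(segment)) {
         if (bestEfforts) {
           // User explicitly opted in: downgrade the failure to a warning
           System.err.println("WARN: potential data loss for segment " + segment
               + ", continuing because bestEfforts is enabled");
         } else {
           // Default: abort the rebalance rather than risk losing the segment
           throw new IllegalStateException(
               "Potential data loss for segment " + segment + ", aborting rebalance");
         }
       }
     }
   }
   ```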
   
   Option 2 -
   Remove the `forceDowntime` flag and stop disallowing downtime rebalances for peer-download tables altogether, relying entirely on the `DataLossRiskAssessor` (so the flag validations for rebalance go away as well). We would never allow a rebalance to complete once a data loss scenario is detected.
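   
   Option 2 then collapses to the unconditional version of the same check -- no flag can bypass the assessor (same made-up scaffolding as above):
   
   ```java
   final class Option2Sketch {
     static void checkOrFail(Option1Sketch.RiskAssessor assessor, String segment) {
       if (assessor.hasRisk(segment)) {
         // No forceDowntime/bestEfforts escape hatch: even a downtime
         // rebalance started without pre-checks cannot complete into a
         // data loss scenario
         throw new IllegalStateException(
             "Potential data loss for segment " + segment + ", aborting rebalance");
       }
     }
   }
   ```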
   
   For Option 2, are the safeguards of the `DataLossRiskAssessor` enough in case someone doesn't run pre-checks and just does a downtime rebalance?
   
   I think Option 2 probably makes more sense, but let me know your thoughts 
@J-HowHuang 

