somandal opened a new pull request, #16341:
URL: https://github.com/apache/pinot/pull/16341

   We recently identified potential data loss scenarios for peer-download 
enabled tables when a downtime rebalance is performed. This can happen when a 
segment is marked as `DONE` in the `SegmentZKMetadata` but its 
`segmentDownloadUrl` is empty because the upload to deep store failed. 
Similar issues exist for upsert / dedup tables with pauseless enabled, even 
for segments in COMMITTING state, when they run into segment build 
failures.
   
   This PR adds the following changes to TableRebalancer:
   - Adds a rebalance pre-check that identifies whether the table is 
peer-download enabled and WARNs if downtime rebalance or 
`minAvailableReplicas=0` is set in the `RebalanceConfig`. This also removes 
the existing pre-check, which is limited to pauseless.
   - Adds code to disallow `downtime=true` or `minAvailableReplicas=0` for 
peer-download enabled tables
   - Adds a `forceDowntime` flag which will allow forcing rebalance to continue 
if `downtime=true` or `minAvailableReplicas=0` for peer-download enabled 
tables. This is to be used with extreme caution and only after the following 
steps have been taken:
       - Ensure all segments have been uploaded to deep store
       - Pause ingestion for the duration of rebalance to ensure no new 
segments are created
   - Fails the rebalance if a peer-download enabled table has a segment that 
could result in data loss if moved:
       - Completed (i.e. `DONE`) but with an empty download URL
       - Not completed, on an upsert / dedup table with pauseless enabled. 
This prevents the scenario where the segment enters COMMITTING state during 
the rebalance and then has a build failure. Rebalance and segment commit can 
run in parallel, so checking only segments already in COMMITTING state would 
leave room for races.
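
   The rules above can be summarized in a small, self-contained Java sketch. It is illustrative only and is not the code in this PR: `PeerDownloadRebalanceSketch`, `SegmentInfo`, `Options`, `Status`, `isDowntimeAllowed` and `findAtRiskSegments` are hypothetical names, and the classes are simplified stand-ins for `SegmentZKMetadata` and `RebalanceConfig`. It assumes that `forceDowntime` only overrides the flag check, and it flags every risky segment rather than only those the rebalance would actually move.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; none of these types are Pinot's actual classes. */
public class PeerDownloadRebalanceSketch {

  /** Simplified segment states, following the states named in the description. */
  enum Status { IN_PROGRESS, COMMITTING, DONE }

  /** Hypothetical stand-in for the fields of SegmentZKMetadata that matter here. */
  static final class SegmentInfo {
    final String name;
    final Status status;
    final String downloadUrl; // null or empty when the deep store upload failed

    SegmentInfo(String name, Status status, String downloadUrl) {
      this.name = name;
      this.status = status;
      this.downloadUrl = downloadUrl;
    }
  }

  /** Hypothetical stand-in for the relevant RebalanceConfig options. */
  static final class Options {
    final boolean downtime;
    final int minAvailableReplicas;
    final boolean forceDowntime;

    Options(boolean downtime, int minAvailableReplicas, boolean forceDowntime) {
      this.downtime = downtime;
      this.minAvailableReplicas = minAvailableReplicas;
      this.forceDowntime = forceDowntime;
    }
  }

  /**
   * Flag check: downtime=true or minAvailableReplicas=0 is rejected for a
   * peer-download enabled table unless forceDowntime is also set.
   */
  static boolean isDowntimeAllowed(boolean peerDownloadEnabled, Options options) {
    boolean downtimeRequested = options.downtime || options.minAvailableReplicas == 0;
    if (!peerDownloadEnabled || !downtimeRequested) {
      return true;
    }
    return options.forceDowntime;
  }

  /**
   * Segment check: returns the names of segments that could lose data if moved.
   * Assumption: every segment is treated as a move candidate.
   */
  static List<String> findAtRiskSegments(boolean peerDownloadEnabled,
      boolean pauselessUpsertOrDedup, List<SegmentInfo> segments) {
    List<String> atRisk = new ArrayList<>();
    if (!peerDownloadEnabled) {
      return atRisk;
    }
    for (SegmentInfo segment : segments) {
      boolean urlMissing = segment.downloadUrl == null || segment.downloadUrl.isEmpty();
      // Case 1: completed, but the segment never reached deep store.
      boolean doneWithoutUrl = segment.status == Status.DONE && urlMissing;
      // Case 2: not completed on a pauseless upsert / dedup table. COMMITTING
      // alone is not checked because the commit can race with the rebalance.
      boolean mayHitBuildFailure = segment.status != Status.DONE && pauselessUpsertOrDedup;
      if (doneWithoutUrl || mayHitBuildFailure) {
        atRisk.add(segment.name);
      }
    }
    return atRisk;
  }
}
```

   In this sketch a rebalance would proceed only when `isDowntimeAllowed` returns true and `findAtRiskSegments` returns an empty list.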
   
   cc @noob-se7en @Jackie-Jiang @yashmayya @npawar 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

