J-HowHuang opened a new pull request, #15953:
URL: https://github.com/apache/pinot/pull/15953

   ## Description
   A risk of data loss was identified for pauseless tables during rebalance 
when `downtime=true` or `minAvailableReplicas=0`. 
   If a segment is moved before it has been uploaded to deep store, 
premature deletion could cause irrecoverable data loss. 
   
   This PR introduces pre-checks and warnings as a workaround to mitigate such 
scenarios. When "pauseless ingestion" is enabled and the rebalance parameters 
have `downtime=true` or `minAvailableReplicas=0`, the table rebalance pre-check 
now reports additional warnings about the potential data loss.  
   
   
   ### Key Changes
   
   - **Enhance Pre-Check Logic:**  
     - Add warnings in the pre-check item `"rebalanceConfigOptions"` if:
       - Replication is 1 for pauseless tables (the rebalance inevitably needs 
downtime, which risks data loss).
       - `downtime=true` or `minAvailableReplicas=0` for pauseless tables.
   
   - **Testing:**  
     - Add/extend tests in `TableRebalancerClusterStatelessTest` to cover new 
warning scenarios and validate correct pre-check status/messages for pauseless 
tables.
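
The warning conditions listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `PauselessRebalancePreCheck`, both method names, and the warning texts are hypothetical. It also assumes that a negative `minAvailableReplicas` means "maximum number of replicas allowed to be unavailable" and resolves it against the replication factor before checking for zero.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the pauseless pre-check warnings; not Pinot's actual classes. */
class PauselessRebalancePreCheck {

  /**
   * Assumption: a negative minAvailableReplicas is treated as the maximum number of
   * replicas that may be unavailable, so it is resolved against the replication factor.
   */
  static int effectiveMinAvailableReplicas(int replication, int minAvailableReplicas) {
    if (minAvailableReplicas < 0) {
      return Math.max(replication + minAvailableReplicas, 0);
    }
    return minAvailableReplicas;
  }

  /** Returns the warnings to attach to the "rebalanceConfigOptions" pre-check item. */
  static List<String> pauselessWarnings(boolean pauselessEnabled, int replication,
      boolean downtime, int minAvailableReplicas) {
    List<String> warnings = new ArrayList<>();
    if (!pauselessEnabled) {
      // Only pauseless tables are affected by the risk described in this PR.
      return warnings;
    }
    if (replication == 1) {
      warnings.add("Replication is 1 for a pauseless table; rebalance needs downtime, "
          + "which risks data loss");
    }
    if (downtime || effectiveMinAvailableReplicas(replication, minAvailableReplicas) == 0) {
      warnings.add("downtime=true or minAvailableReplicas=0 on a pauseless table "
          + "risks data loss");
    }
    return warnings;
  }
}
```

Under these assumptions, `pauselessWarnings(true, 2, true, 1)` yields only the downtime warning, while a non-pauseless table never yields any.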
   
   
   ## Tests
   
   ### Case 1: Pauseless table with `RF=1 -> RF=1`, rebalanced from 2 servers 
to 1 server, `downtime=false`, `minAvailableReplicas=-1`
   
![image](https://github.com/user-attachments/assets/8365e521-e681-402a-bd9e-ceebfe912a7d)
   
   ### Case 2: Pauseless table with `RF=1 -> RF=2`, rebalanced from 1 server 
to 2 servers, `downtime=true`
   
![image](https://github.com/user-attachments/assets/12d3d671-915b-41ce-be3b-7480cb9747c6)
   
   ### Case 3: Pauseless table with `RF=1 -> RF=2`, rebalanced from 1 server 
to 2 servers, `minAvailableReplicas=-2`
   
![image](https://github.com/user-attachments/assets/7e20d54a-0cff-4c69-ab52-709f372b8a6b)
   
   ### Case 4: Pauseless table with `RF=1 -> RF=2`, rebalanced from 1 server 
to 2 servers, `minAvailableReplicas=-1`
   
![image](https://github.com/user-attachments/assets/a39302a4-615c-4425-bf0b-8bcd53144097)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

