somandal commented on PR #15266:
URL: https://github.com/apache/pinot/pull/15266#issuecomment-2779453530

   Just capturing a scenario here that I have been debugging lately:
   
   Scenario:
   - When we try to update the IdealState in the rebalance code, if the node 
version changed an exception is thrown and we try to redo the whole rebalance 
loop
   - The current code updates the stats prior to the IdealState update (ideal 
state trigger and next assignment trigger)
   - If we are unable to update the IdealState we should rollback the stats 
since they will be recalculated on the next iteration anyways.
   - The older stats are also susceptible to this problem
   
   Impact of not rolling back: the stats can show more progress has been made 
than the reality for that iteration until the IS update happens. The IS update 
will reset it correctly, but until then the stats will be incorrect.
   
   Also, the above problem is more problematic if bestEfforts=true because we 
may not wait for EV-IS full convergence before moving to the next step. Due to 
this, when we loop back after IS update fails, we get into the EV-IS 
convergence loop again and it looks like we are making progress, but we’re 
actually making progress on the last step updates that hadn't completed.
   
   I’m thinking of adding the above rollback logic in a follow up PR (my plan 
is to just cache the previous versions of the stats and use that for rollback) 
to avoid over-complicating this one. Let me know if that is fine with you folks 
@klsince @raghavyadav01 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to