AlexanderKM commented on PR #16900:
URL: https://github.com/apache/pinot/pull/16900#issuecomment-3335827655

   @Jackie-Jiang you are totally right. Upon further investigation, this is 
happening during controller shutdown, and the ZkInterruptedException is thrown 
intentionally on the client side when the Helix manager is asked to 
disconnect: 
https://github.com/apache/pinot/blob/master/pinot-controller/src/main/java/org/apache/pinot/controller/BaseControllerStarter.java#L1035
   
   ZKHelixManager 
[source](https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixManager.java#L844)
   ZkClient 
[source](https://github.com/apache/helix/blob/master/zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/ZkClient.java#L2823-L2882)
   
   I need to go back and think about how to solve this from that angle: we 
won't be able to read from ZK (or delete, for that matter) because the client 
is disconnected and will not reconnect while the controller is shutting down. 
The fix may require a more graceful shutdown that handles this timeline:
   
   1. A request comes in to the controller to upload a new segment
   2. The controller begins to [shut 
down](https://github.com/apache/pinot/blob/master/pinot-controller/src/main/java/org/apache/pinot/controller/BaseControllerStarter.java#L1011-L1029),
 and the Helix manager is 
[disconnected](https://github.com/apache/pinot/blob/master/pinot-controller/src/main/java/org/apache/pinot/controller/BaseControllerStarter.java#L1035)
 while the IdealState (IS) update is still in progress
   3. Inside the IS update retry loop, we hit 
[ZkInterruptedException](https://github.com/apache/pinot/blob/master/pinot-common/src/main/java/org/apache/pinot/common/utils/helix/IdealStateGroupCommit.java#L250-L264)
 (cannot recover here because the client is disconnected)
   
   Perhaps we can let the remaining in-flight updates complete, bounded by a 
fixed timeout, before the Helix manager is disconnected
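
   A rough sketch of what a bounded drain could look like, not a proposed 
patch. `UpdateDrainer` and its methods are hypothetical names and do not exist 
in Pinot or Helix; the assumption is that the IS update path would call 
`tryBegin()`/`end()` around each update and that the shutdown path would call 
`drain(...)` before disconnecting the Helix manager.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not a Pinot or Helix class): tracks in-flight updates. */
final class UpdateDrainer {
  private int inFlight = 0;
  private boolean shuttingDown = false;

  /** Registers an update. Returns false once shutdown has started, so the caller can reject it. */
  synchronized boolean tryBegin() {
    if (shuttingDown) {
      return false;
    }
    inFlight++;
    return true;
  }

  /** Marks an update as finished (call from a finally block) and wakes any waiting drain. */
  synchronized void end() {
    inFlight--;
    notifyAll();
  }

  /**
   * Stops admitting new updates, then waits up to the timeout for in-flight ones.
   * Returns true if everything finished in time, false if the timeout expired first.
   */
  synchronized boolean drain(long timeout, TimeUnit unit) throws InterruptedException {
    shuttingDown = true;
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    while (inFlight > 0) {
      long remaining = deadline - System.nanoTime();
      if (remaining <= 0) {
        return false;
      }
      // Releases the monitor while waiting; end() wakes us via notifyAll().
      TimeUnit.NANOSECONDS.timedWait(this, remaining);
    }
    return true;
  }
}
```

   With something like this, the controller would only disconnect after 
`drain` returns, and a `false` return would mean some updates were still 
running when the timeout expired and may still hit the ZkInterruptedException.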


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

