ankitsultana commented on issue #12390:
URL: https://github.com/apache/pinot/issues/12390#issuecomment-1936819688

   The exact issue is described below (all of these are confirmed via logs):
   
   * The server goes through a full GC, which causes it to disconnect from ZK and Helix.
   * When ZK reconnects, a batch of OFFLINE to CONSUMING messages is sent.
   * We then see the exception above.
   
   **Current Theory**: When the ZK disconnection happens, the PartitionConsumer 
thread stays alive and keeps holding the semaphore. So when Helix sends the 
OFFLINE to CONSUMING transition again for that segment, the Segment Data 
Manager fails to init, presumably because it cannot acquire that semaphore.
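
To make the theory concrete, here is a minimal standalone sketch of the suspected failure mode. It is not Pinot code: the class `SemaphoreReinitSketch`, the helper `tryInit`, and the timeout values are all hypothetical, and it only assumes the partition is guarded by a single-permit `java.util.concurrent.Semaphore`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class SemaphoreReinitSketch {

    // Hypothetical stand-in for the init step of a segment data manager:
    // it must take the per-partition permit before it may start consuming.
    static boolean tryInit(Semaphore partitionSemaphore, long timeoutMs)
            throws InterruptedException {
        return partitionSemaphore.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        Semaphore partitionSemaphore = new Semaphore(1);
        CountDownLatch acquired = new CountDownLatch(1);
        CountDownLatch stop = new CountDownLatch(1);

        // Simulates the old consumer thread that survives the disconnect.
        Thread oldConsumer = new Thread(() -> {
            try {
                partitionSemaphore.acquire();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            try {
                acquired.countDown();
                stop.await(); // still alive, still holding the permit
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                partitionSemaphore.release();
            }
        });
        oldConsumer.start();
        acquired.await();

        // Simulates the repeated OFFLINE to CONSUMING transition for the same segment.
        boolean whileHeld = tryInit(partitionSemaphore, 200);
        System.out.println("init while old consumer holds permit: " + whileHeld);

        // Once the old consumer stops and releases, a new init can succeed.
        stop.countDown();
        oldConsumer.join();
        boolean afterRelease = tryInit(partitionSemaphore, 200);
        System.out.println("init after old consumer released: " + afterRelease);
    }
}
```

If the theory holds, one direction to explore would be making sure the old consumer is stopped and its permit released before the repeated transition tries to init again, but that is only a guess at this point.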
   
   I am low on time right now, so I can't dig deeper. Does anyone have hints 
on potential solutions?
   
   I had also seen this somewhat related issue from a few years ago: 
https://github.com/apache/pinot/issues/7874
   
   cc: @Jackie-Jiang 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

