jasperjiaguo commented on PR #11496:
URL: https://github.com/apache/pinot/pull/11496#issuecomment-1724444153

   > @jasperjiaguo This is resource starvation instead of deadlock then. During 
your investigation do you see closing all connections necessary? Ideally we 
want to only close the connection that throws exception for the following 
reasons:
   > 
   > * Closing all connections will fail all on-the-fly queries
   > * Some connections might not request lots of memory (data not evenly 
distributed, expensive query only hits subset of servers)
   > * After releasing the connection throwing the exception, other connections 
might recover
   
   Yeah, I called it a deadlock because every channel had actually received a 
(fairly even) portion of the memory allocation and was waiting. 
   
   > Ideally we want to only close the connection that throws exception
   
   Yes, that is also what I initially thought would work. But in the worst case 
we tested (150+ servers, each sending a response of a few hundred MB), some 
connections to these 150 servers would start to throw OOM exceptions first; 
once we closed them, the others recovered very slowly or were still blocked 
after 10 minutes. In other words, once OOMs start to happen, resetting only 
the channels that throw OOM does not give us a prompt recovery. 
   
   > Closing all connections will fail all on-the-fly queries
   
   I'm not sure what the use case is on your side? One thing we might be able 
to do to limit the blast radius is: for the OOM channels, fetch the current 
in-flight query ids and reset only the channels that have those ids. However, 
this would introduce quite a bit of complication to the error handling code; 
I'm not sure whether that's what we want to do here, or whether we should 
invest some time in a finer-grained solution, as folks have suggested in the 
discussions above?
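
   To make the "reset only the channels sharing query ids with an OOM channel" 
idea concrete, here is a rough sketch. It is illustrative only: `BlastRadius`, 
`channelsToReset`, and the channel-to-query-id map are hypothetical names and 
are not part of the Pinot code base, and it assumes the broker can look up the 
in-flight query ids per server channel.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not Pinot code: picks the channels to reset after an OOM.
final class BlastRadius {
    private BlastRadius() {}

    /**
     * @param inFlightByChannel assumed bookkeeping: channel name -> ids of queries
     *                          with responses still in flight on that channel
     * @param oomChannels       channels that threw an OOM exception
     * @return the OOM channels plus every channel that carries at least one
     *         query that is also in flight on an OOM channel
     */
    static Set<String> channelsToReset(Map<String, Set<Long>> inFlightByChannel,
                                       Set<String> oomChannels) {
        // Step 1: collect the query ids in flight on the failing channels.
        Set<Long> affectedQueries = new HashSet<>();
        for (String channel : oomChannels) {
            affectedQueries.addAll(
                inFlightByChannel.getOrDefault(channel, Collections.emptySet()));
        }

        // Step 2: a channel is reset if it failed itself or shares an affected query.
        Set<String> toReset = new TreeSet<>(oomChannels);
        for (Map.Entry<String, Set<Long>> entry : inFlightByChannel.entrySet()) {
            if (!Collections.disjoint(entry.getValue(), affectedQueries)) {
                toReset.add(entry.getKey());
            }
        }
        return toReset;
    }
}
```

   This expands only one hop from the OOM channels; channels serving unrelated 
queries are left alone, which is the point of limiting the blast radius. The 
extra bookkeeping it needs is the complication in the error handling mentioned 
above.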


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

