jasperjiaguo commented on PR #11496: URL: https://github.com/apache/pinot/pull/11496#issuecomment-1724444153
> @jasperjiaguo This is resource starvation instead of deadlock then. During your investigation do you see closing all connections necessary? Ideally we want to only close the connection that throws exception for the following reasons:
>
> * Closing all connections will fail all on-the-fly queries
> * Some connections might not request lots of memory (data not evenly distributed, expensive query only hits subset of servers)
> * After releasing the connection throwing the exception, other connections might recover

Yeah, I phrased it as deadlock because every channel had actually gotten a (pretty even) portion of the memory allocation and was waiting.

> Ideally we want to only close the connection that throws exception

Yes, that is also what I initially thought would work. But in the worst case we tested (150+ servers, each sending a response of a few hundred MB), some of the connections to these 150 servers would start to throw OOM exceptions first; once we closed them, the others recovered very slowly or were still blocked after 10 minutes. In other words, once OOM starts to happen, resetting only the channels that throw OOM does not give us a prompt recovery.

> Closing all connections will fail all on-the-fly queries

I'm not sure what the use case is on your side. One thing we might be able to do to limit the blast radius is: for the OOM channels, fetch the current in-flight query ids and reset only the channels that carry those ids. However, this would introduce quite a bit of complication to the error handling code; I'm not sure whether that's what we want to do here, or whether we should invest some time in a finer-grained solution, as folks have suggested in the discussions above?
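To make the "reset only the channels having those ids" idea concrete, here is a rough sketch of just the selection step. It is not code from this PR and does not use Pinot's or Netty's API: the class name `SelectiveChannelReset`, the method `channelsToReset`, and the server-name-to-query-id map are all hypothetical, and it assumes the broker can cheaply look up which query ids are in flight on each server channel.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Pinot: picks which server channels to reset
// after some channels have hit a direct-memory OOM.
public class SelectiveChannelReset {

  /**
   * @param inFlightByServer assumed bookkeeping: server name -> ids of queries in flight on its channel
   * @param oomServers servers whose channels threw an OOM exception
   * @return the OOM channels plus every channel that shares an in-flight query with them
   */
  static Set<String> channelsToReset(Map<String, Set<Long>> inFlightByServer, Set<String> oomServers) {
    // Step 1: collect the ids of all queries in flight on the OOM channels.
    Set<Long> affectedQueryIds = new HashSet<>();
    for (String server : oomServers) {
      affectedQueryIds.addAll(inFlightByServer.getOrDefault(server, Set.of()));
    }

    // Step 2: always reset the OOM channels, then add any channel that is
    // serving at least one of the affected queries.
    Set<String> toReset = new TreeSet<>(oomServers);
    for (Map.Entry<String, Set<Long>> entry : inFlightByServer.entrySet()) {
      if (!Collections.disjoint(entry.getValue(), affectedQueryIds)) {
        toReset.add(entry.getKey());
      }
    }
    return toReset;
  }

  public static void main(String[] args) {
    Map<String, Set<Long>> inFlight = Map.of(
        "server-1", Set.of(101L, 102L),
        "server-2", Set.of(102L),
        "server-3", Set.of(103L));
    // server-1 hit OOM; server-2 shares query 102 with it; server-3 is untouched.
    System.out.println(channelsToReset(inFlight, Set.of("server-1"))); // prints [server-1, server-2]
  }
}
```

Even in this simplified form, the extra bookkeeping (tracking in-flight query ids per channel and keeping that map consistent while channels are being reset) is the complication to the error handling path mentioned above.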