The run method on AcceptorImpl runs in a LoggingThread instance (see 
AcceptorImpl.start()). So any exception thrown by AcceptorImpl.run() will be 
logged by the LoggingThread as a fatal log message containing "Uncaught 
exception in thread". You can see the code that does this in 
LoggingUncaughtExceptionHandler.
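
For anyone unfamiliar with the mechanism, here is a minimal sketch of the 
general pattern in plain Java. It is not the Geode code: LoggingThreadSketch, 
newLoggingThread and the in-memory LOG list are hypothetical names used only 
for illustration. The only real API it relies on is the JDK's 
Thread.setUncaughtExceptionHandler.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LoggingThreadSketch {
    // Hypothetical stand-in for the server log, so the sketch is self-contained.
    static final List<String> LOG = Collections.synchronizedList(new ArrayList<>());

    // Creates a thread whose uncaught exceptions are recorded as "fatal" entries.
    static Thread newLoggingThread(String name, Runnable body) {
        Thread t = new Thread(body, name);
        t.setUncaughtExceptionHandler((thread, ex) ->
            LOG.add("fatal: Uncaught exception in thread " + thread.getName() + ": " + ex));
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        // The body throws an unchecked exception that nothing inside run() catches.
        Thread t = newLoggingThread("acceptor-sketch", () -> {
            throw new IllegalStateException("boom");
        });
        t.start();
        t.join(); // the handler runs on the dying thread, before join() returns
        LOG.forEach(System.out::println);
    }
}
```

The point of the sketch is only that an exception escaping run() is not 
silently lost: the handler installed on the thread sees it and can log it.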

Also, the AcceptorImpl.run() method I am looking at has a finally block in 
which it closes "serverSocket" if it is not null. It is on this close call 
that it catches and ignores IOException.

I think you may be talking about AcceptorImpl.ClientQueueInitializerTask.run(). 
This run method is called from an executor that is created in 
initializeClientQueueInitializerThreadPool. It uses CoreLoggingExecutors, so 
once again any unhandled exception should be logged as a fatal log message in 
the server log.
________________________________
From: Leon Finker <leon...@gmail.com>
Sent: Wednesday, November 10, 2021 10:01 AM
To: dev@geode.apache.org <dev@geode.apache.org>
Subject: Failed durable client connection initialization can sometimes leak 
client socket handle?

Hi,

In AcceptorImpl.run, the accepted client socket seems to only be
closed when there is IOException. I can't prove it, but I think there
can sometimes be non IO exception here as well and then the client
socket will not be closed? Also, can we please add a catch for other
kinds of exceptions and at least log them as errors?
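
A rough sketch of the kind of handling being asked for, in plain Java. This is 
not the actual AcceptorImpl code: AcceptLoopSketch, Handshake, acceptOne and 
closeQuietly are hypothetical names, and logging to System.err stands in for a 
real logger. It only shows the accepted socket being closed and the failure 
logged for runtime exceptions as well as IOException.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class AcceptLoopSketch {
    // Hypothetical hook standing in for per-client initialization.
    interface Handshake {
        void run(Socket client) throws IOException;
    }

    // Accepts one connection; if initialization fails for any reason,
    // logs the failure and closes the accepted socket so the handle is not leaked.
    static Socket acceptOne(ServerSocket server, Handshake handshake) throws IOException {
        Socket client = server.accept();
        try {
            handshake.run(client);
        } catch (IOException | RuntimeException e) {
            System.err.println("error: client initialization failed: " + e);
            closeQuietly(client);
        }
        return client; // returned only so a caller can inspect its state
    }

    static void closeQuietly(Socket s) {
        try {
            s.close();
        } catch (IOException ignored) {
            // nothing useful to do if close itself fails
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             Socket peer = new Socket(loopback, server.getLocalPort())) {
            // Simulate a non-IO failure during client initialization.
            Socket accepted = acceptOne(server, c -> {
                throw new IllegalStateException("boom");
            });
            System.out.println("closed=" + accepted.isClosed());
        }
    }
}
```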

The symptoms we have are like this:
1. Durable client has a connection problem during initialization.
2. Durable client ends up with an orphaned durable HA region (the one
prefixed with _gfe_durable_client_with_id_)
3. Now the client automatically reconnects and the geode server fails
to properly initialize the client. Most likely because the region
already has an error. If inspecting the regions at runtime, we indeed
can see durable region for the client without CacheClientProxy
properly created and added to the proxies collection.
4. We observe a pretty rapid (over a few days) memory leak and socket handle leak
5. This leak stops as soon as we destroy that internal durable region
(partially through reflection) for the client and client can then
properly reconnect and initialize its region and proxy.

Does this ring any bells for anyone?

Thank you
