tristantarrant commented on issue #3399: URL: https://github.com/apache/logging-log4j2/issues/3399#issuecomment-2623778674
Yes: our test suite has a test-killer which detects if it has been unresponsive after a timeout. It takes a thread dump of the JVM and then kills itself.

I believe I've figured out why it's happening a lot to our test suite:

* Some of our code gets a logger on class instantiation instead of in a static initializer. The assumption is that the log factory will give us a cached instance of a previously used logger with the same name.
* Our test suite is quite heavy, with many threads, and triggers GC quite frequently.
* GC will collect those per-instance loggers, causing the `WeakReference`s in `InternalLoggerRegistry` to be cleared, so that `get()` returns null. Static loggers will never be GCed unless the class that owns them is unloaded.
* The current logic in `InternalLoggerRegistry.computeIfAbsent` will always acquire a write lock without actually replacing the cleared `WeakReference`s.

I've identified one place in our code which was unnecessarily obtaining a logger in a method and replaced it with a static instance. This has reduced the lock contention considerably, but doesn't solve the issue.

I've created an Infinispan branch with my small fix which you can use for your investigations. Here are the instructions:

```
git clone -b log4j-lock-investigation --single-branch g...@github.com:tristantarrant/infinispan.git
cd infinispan
./mvnw install -DskipTests -am -pl core
./mvnw verify -pl core
```

Just let it run. After a while it will hang and start creating `threaddump-XXX.log` files in the `core` directory.
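
To illustrate the contention pattern described above, here is a minimal sketch of a weak-reference registry that only takes the write lock when an entry is missing or has been cleared, and that overwrites the cleared reference so later lookups stay on the read path. This is not the code of `InternalLoggerRegistry`: the class name `WeakRegistry`, its fields, and its method signature are hypothetical and chosen only for illustration.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical illustration only; this is not Log4j's InternalLoggerRegistry.
class WeakRegistry<T> {
    private final Map<String, WeakReference<T>> entries = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    T computeIfAbsent(String name, Function<String, T> factory) {
        // Fast path: a live entry is returned under the shared read lock.
        lock.readLock().lock();
        try {
            WeakReference<T> ref = entries.get(name);
            T value = (ref != null) ? ref.get() : null;
            if (value != null) {
                return value;
            }
        } finally {
            lock.readLock().unlock();
        }

        // Slow path: the entry is missing or its referent was collected.
        lock.writeLock().lock();
        try {
            // Re-check, since another thread may have repaired the entry.
            WeakReference<T> ref = entries.get(name);
            T value = (ref != null) ? ref.get() : null;
            if (value == null) {
                value = factory.apply(name);
                // Overwrite the cleared reference so later calls take the fast path.
                entries.put(name, new WeakReference<>(value));
            }
            return value;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With this shape, a logger that has been collected costs one write-lock acquisition to recreate, after which callers that keep it strongly reachable share the read lock again. Callers that drop the instance immediately will still hit the slow path after each collection, which is why holding loggers in static fields remains the cheaper pattern.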