Thank you Shawn, We are indeed using Tomcat, maxThreads was set to 2000 (Normally seen <600 active threads under load).
I attached the complete stack trace of http-bio-8443-exec-37460 below. The thread is marked as "Waiting on Condition", and does not mention any lock it's waiting for. Looking at the code of Category.callAppenders, the thread hangs in line 204. 198 public 199 void callAppenders(LoggingEvent event) { 200 int writes = 0; 201 202 for(Category c = this; c != null; c=c.parent) { 203 // Protected against simultaneous call to addAppender, removeAppender,... 204 synchronized(c) { 205 if(c.aai != null) { 206 writes += c.aai.appendLoopOnAppenders(event); 207 } 208 if(!c.additive) { 209 break; 210 } 211 } 212 } 213 214 if(writes == 0) { 215 repository.emitNoAppenderWarning(this); 216 } 217 } Of course, I understand this is Solr's mailing list and not log4j's. So wanted to know if this problem happens to be familiar in Solr, and whether a fix or some solution exists. Thanks, Arnon 3XMTHREADINFO "http-bio-8443-exec-37460" J9VMThread:0x00007FED88044600, j9thread_t:0x00007FE73E4D04A0, java/lang/Thread:0x00007FF267995468, state:CW, prio=5 3XMJAVALTHREAD (java/lang/Thread getId:0xA1AC9, isDaemon:true) 3XMTHREADINFO1 (native thread ID:0x17F8, native priority:0x5, native policy:UNKNOWN) 3XMTHREADINFO2 (native stack address range from:0x00007FEA9487B000, to:0x00007FEA948BC000, size:0x41000) 3XMCPUTIME CPU usage total: 55.216798962 secs 3XMHEAPALLOC Heap bytes allocated since last GC cycle=3176200 (0x307708) 3XMTHREADINFO3 Java callstack: 4XESTACKTRACE at org/apache/log4j/Category.callAppenders(Category.java:204) 4XESTACKTRACE at org/apache/log4j/Category.forcedLog(Category.java:391(Compiled Code)) 4XESTACKTRACE at org/apache/log4j/Category.log(Category.java:856(Compiled Code)) 4XESTACKTRACE at org/slf4j/impl/Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:498) 4XESTACKTRACE at org/apache/solr/common/SolrException.log(SolrException.java:109) 4XESTACKTRACE at org/apache/solr/handler/RequestHandlerBase.handleRequest(RequestHandlerBase.java:153(Compiled Code)) 4XESTACKTRACE at org/apache/solr/core/SolrCore.execute(SolrCore.java:1916(Compiled Code)) 4XESTACKTRACE at org/apache/solr/servlet/SolrDispatchFilter.execute(SolrDispatchFilter.java:780(Compiled Code)) 4XESTACKTRACE at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427(Compiled Code)) 4XESTACKTRACE at org/apache/solr/servlet/SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/core/ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/core/ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/core/StandardWrapperValve.invoke(StandardWrapperValve.java:220(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/core/StandardContextValve.invoke(StandardContextValve.java:122(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/core/StandardHostValve.invoke(StandardHostValve.java:171(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/valves/ErrorReportValve.invoke(ErrorReportValve.java:102(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/valves/AccessLogValve.invoke(AccessLogValve.java:950(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/core/StandardEngineValve.invoke(StandardEngineValve.java:116(Compiled Code)) 4XESTACKTRACE at org/apache/catalina/connector/CoyoteAdapter.service(CoyoteAdapter.java:408(Compiled Code)) 4XESTACKTRACE at org/apache/coyote/http11/AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040(Compiled Code)) 4XESTACKTRACE at org/apache/coyote/AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607(Compiled Code)) 4XESTACKTRACE at org/apache/tomcat/util/net/JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314(Compiled Code)) 5XESTACKTRACE (entered lock: org/apache/tomcat/util/net/SocketWrapper@0x00007FF253C18FA8, entry count: 1) 4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156(Compiled Code)) 4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626(Compiled Code)) 4XESTACKTRACE at org/apache/tomcat/util/threads/TaskThread$WrappingRunnable.run(TaskThread.java:61(Compiled Code)) 4XESTACKTRACE at java/lang/Thread.run(Thread.java:804(Compiled Code)) 3XMTHREADINFO3 No native callstack available for this thread From: Shawn Heisey <apa...@elyograg.org> To: solr-user@lucene.apache.org Date: 01/09/2015 06:09 PM Subject: Re: Solr cloud hangs, log4j contention issue observed On 9/1/2015 12:53 AM, Arnon Yogev wrote: > We have a Solr cloud (4.7) consisting of 5 servers. > At some point we noticed that one of the servers had a very high CPU and > was not responding. A few minutes later, the other 4 servers were > responding very slowly. A restart was required. > Looking at the Solr logs, we mainly saw symptoms, i.e. errors that happened > a few minutes after the high CPU started (connection timeouts etc). > > When looking at the javacore of the problematic server, we found that one > thread was waiting on a log4j method, and 538 threads (!) were waiting on > the same lock. > The thread's stack trace is: <snip> > Our logging is done to a local file. > After searching the web, we found similar problems: > https://bz.apache.org/bugzilla/show_bug.cgi?id=50213 > https://bz.apache.org/bugzilla/show_bug.cgi?id=51047 > https://dzone.com/articles/log4j-thread-deadlock-case > > However, seems like the fixes were made for log4j 2.X. And Solr uses log4j > 1.2.X (even the new Solr 5.3.0, from what I've seen). > > Is this a known problem? > Is it possible to upgrade Solr log4j version to 2.X? We have an issue to upgrde log4j. I know because I'm the one that opened it. I haven't had any time to work on it, and until I can actually research it, I am fairly clueless about how to proceed. https://issues.apache.org/jira/browse/SOLR-7887 What container are you running in? The stacktrace was not complete enough for me to figure that out myself. What is that container's maxThreads setting? The thread name including "http-bio-8443" makes me thing it's probably Tomcat, not the jetty included in the example found in the download, which makes the maxThreads parameter particularly relevant. I do not see any mention of locks in the information that you included, either held or waiting. If a lot of threads are waiting on a single lock, then you should be able to find which thread is holding that lock ... and I don't think it will be the thread that you mentioned. Thanks, Shawn