Chris, It looks like Mike already offered several solutions.... though I don't know what Solr does without looking at the code.
But I'm curious: * how big is your index? and do you know how large the segments being merged are? * do you batch docs or do you make use of Streaming SolrServer? I'm curious, because I've never encountered this problem before... Thanks, Otis ---- Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ ----- Original Message ---- > From: Chris Harris <rygu...@gmail.com> > To: solr-user@lucene.apache.org > Sent: Thu, April 22, 2010 6:28:29 PM > Subject: Best way to prevent this search lockup (apparently caused during big > segment merges)? > > I'm running Solr 1.4+ under Tomcat 6, with indexing and searching requests > simultaneously hitting the same Solr machine. Sometimes Solr, Tomcat, and my > (C#) indexing process conspire to render search inoperable. So far I've only > noticed this while big segment merges (i.e. merges that take multiple > minutes) are taking place. Let me explain the situation as best as I > understand it. My indexer has a main loop that looks roughly like > this: while true: try: > submit a new add or delete request to Solr via HTTP catch > timeoutException: sleep a few seconds When things > are going wrong (i.e., when a large segment merge is happening), this loop is > problematic: * When the indexer's request hits Solr, then the > corresponding thread in Tomcat blocks. (It looks to me like the thread is > destined to block until the entire merge is complete. I'll paste in what the > Java stack traces look like at the end of the message if they can help > diagnose things.) * Because the Solr thread stays blocked for so long, > eventually the indexer hits a timeoutException. (That is, it gives up on > Solr.) * Hitting the timeout exception doesn't cause the corresponding > Tomcat thread to die or unblock. Therefore, each time through the > loop, another Solr-handling thread inside Tomcat enters a blocked state. * > Eventually so many threads (maxThreads, whose Tomcat default is 200) are > blocked that Tomcat starts rejecting all new Solr HTTP requests -- including > those coming in from the web tier. * Users are unable to search. The problem > might self-correct once the merge is complete, but that could be quite a > while. What are my options for changing Solr settings or changing my > indexing process to avoid this lockup scenario? Do you agree that the > segment merge is helping cause the lockup? Do adds and deletes really need > to block on segment merges? Partial thread dumps follow, showing > example add and delete threads that are blocked. Also the active Lucene Merge > Thread, and the thread that kicked off the merge. [doc deletion > thread, waiting for DirectUpdateHandler2.iwCommit.lock() to > return] "http-1800-200" daemon prio=6 tid=0x000000000a58cc00 > nid=0x1028 waiting on condition > [0x000000000f9ae000..0x000000000f9afa90] java.lang.Thread.State: > WAITING (parking) at sun.misc.Unsafe.park(Native > Method) - parking to wait for > <0x000000016d801ae0> > (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(Unknown > Source) at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(Unknown Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(Unknown Source) > at > org.apache.solr.update.DirectUpdateHandler2.deleteByQuery(DirectUpdateHandler2.java:320) > at > org.apache.solr.update.processor.RunUpdateProcessor.processDelete(RunUpdateProcessorFactory.java:71) > at > org.apache.solr.handler.XMLLoader.processDelete(XMLLoader.java:234) > at > org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:180) > at > org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69) > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) > at > org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) > at > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) > at > org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) > at > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) > at > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) > at > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) > at > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) > at > org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849) > at > org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) > at > org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454) > at java.lang.Thread.run(Unknown Source) [doc adding thread, waiting for > DirectUpdateHandler2.iwAccess.lock() to return] "http-1800-70" daemon prio=6 > tid=0x0000000007946400 nid=0x590 waiting on condition > [0x000000000d76e000..0x000000000d76f910] java.lang.Thread.State: > WAITING (parking) at sun.misc.Unsafe.park(Native > Method) - parking to wait for > <0x000000016d801ae0> > (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(Unknown > Source) at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(Unknown Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(Unknown Source) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(Unknown Source) > at > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:211) > at > org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61) > at > org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:118) > at > org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:123) > at > org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:192) > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) > at > org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) > at > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) > at > org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) > at > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) > at > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) > at > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) > at > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) > at > org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849) > at > org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) > at > org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454) > at java.lang.Thread.run(Unknown Source) All 200 front-line Solr Tomcat > threads, from http-1800-1 through http-1800-200, are blocked in one of these > two ways. [The thread that kicked off the commit/merge. Note > that DirectUpdateHandler2.commit() calls iwCommit.lock(), and seems not > to release the iwCommit lock until DIH2.commit() > returns.] "pool-15-thread-1" prio=6 tid=0x000000000abc3c00 nid=0x1094 > in Object.wait() [0x000000000b6ff000..0x000000000b6ff910] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) - waiting on > <0x00000001b1ba84a0> (a > org.apache.solr.update.SolrIndexWriter) at > org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:5351) > - locked <0x00000001b1ba84a0> (a > org.apache.solr.update.SolrIndexWriter) at > org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:3479) > - locked <0x00000001b1ba84a0> (a > org.apache.solr.update.SolrIndexWriter) at > org.apache.lucene.index.IndexWriter.finishMerges(IndexWriter.java:3463) > - locked <0x00000001b1ba84a0> (a > org.apache.solr.update.SolrIndexWriter) at > org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:2200) > at > org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2153) > at > org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2117) > at > org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:230) > at > org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:181) > at > org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:409) > at > org.apache.solr.update.DirectUpdateHandler2$CommitTracker.run(DirectUpdateHandler2.java:602) > - locked <0x000000016d801760> > (a org.apache.solr.update.DirectUpdateHandler2$CommitTracker) > at java.util.concurrent.Executors$RunnableAdapter.call(Unknown > Source) at > java.util.concurrent.FutureTask$Sync.innerRun(Unknown > Source) at java.util.concurrent.FutureTask.run(Unknown > Source) at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown > Source) at > java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown > Source) at java.lang.Thread.run(Unknown > Source) [The only active Lucene merge thread] "Lucene Merge Thread #0" > daemon prio=6 tid=0x000000000930ac00 nid=0xb38 runnable > [0x000000000a41e000..0x000000000a41f790] java.lang.Thread.State: > RUNNABLE at java.io.RandomAccessFile.readBytes(Native > Method) at java.io.RandomAccessFile.read(Unknown > Source) at > org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(SimpleFSDirectory.java:132) > - locked <0x00000001b443e588> > (a org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor) > at > org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157) > at > org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38) > at > org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80) > at > org.apache.lucene.index.SegmentTermPositions.readDeltaPosition(SegmentTermPositions.java:73) > at > org.apache.lucene.index.SegmentTermPositions.nextPosition(SegmentTermPositions.java:69) > at > org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:707) > at > org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:648) > at > org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:586) > at > org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:154) > at > org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5029) > at > org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4614) > at > org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235) > at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)