This specific thread was blocked for an hour? If so, I'd echo Lance... this is a local disk, right?
-Yonik
http://www.lucidimagination.com

On Mon, Oct 5, 2009 at 2:11 PM, Giovanni Fernandez-Kincade
<gfernandez-kinc...@capitaliq.com> wrote:
> I just grabbed another stack trace for a thread that has been similarly
> blocking for over an hour. Notice that there is no Commit in this one:
>
> http-8080-Processor67 [RUNNABLE] CPU time: 1:02:05
> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
> org.apache.lucene.index.SegmentTermEnum.next()
> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
> org.apache.lucene.index.TermInfosReader.get(Term)
> org.apache.lucene.index.SegmentTermDocs.seek(Term)
> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
> org.apache.lucene.index.IndexWriter.applyDeletes()
> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
> org.apache.lucene.index.IndexWriter.updateDocument(Term, Document, Analyzer)
> org.apache.lucene.index.IndexWriter.updateDocument(Term, Document)
> org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(AddUpdateCommand)
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(SolrContentHandler, AddUpdateCommand)
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(SolrContentHandler)
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(SolrQueryRequest, SolrQueryResponse, ContentStream)
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
> java.lang.Thread.run()
>
> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: Monday, October 05, 2009 1:18 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Timeouts
>
> OK... next step is to verify that SolrCell doesn't have a bug that
> causes it to commit.
> I'll try and verify today unless someone else beats me to it.
>
> -Yonik
> http://www.lucidimagination.com
>
> On Mon, Oct 5, 2009 at 1:04 PM, Giovanni Fernandez-Kincade
> <gfernandez-kinc...@capitaliq.com> wrote:
>> I'm fairly certain that all of the indexing jobs are calling SOLR with
>> commit=false. They all construct the indexing URLs using a CLR function I
>> wrote, which takes in a Commit parameter, which is always set to false.
>>
>> Also, I don't see any calls to commit in the Tomcat logs (whereas normally
>> when I make a commit call I do).
>>
>> This suggests that Solr is doing it automatically, but the extract handler
>> doesn't seem to be the problem:
>> <requestHandler name="/update/extract"
>>     class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
>>     startup="lazy">
>>   <lst name="defaults">
>>     <str name="uprefix">ignored_</str>
>>     <str name="map.content">fileData</str>
>>   </lst>
>> </requestHandler>
>>
>> There is no external config file specified, and I don't see anything about
>> commits here.
>>
>> I've tried setting up more detailed indexer logging but haven't been able to
>> get it to work:
>> <infoStream file="c:\solr\indexer.log">true</infoStream>
>>
>> I tried relative and absolute paths, but no dice so far.
>>
>> Any other ideas?
>>
>> -Gio.
>>
>> -----Original Message-----
>> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
>> Sent: Monday, October 05, 2009 12:52 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Timeouts
>>
>>> This is what one of my SOLR requests look like:
>>>
>>> http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false
>>
>> Have you verified that all of your indexing jobs (you said you had 4
>> or 5) have commit=false?
>>
>> Also make sure that your extract handler doesn't have a default of
>> something that could cause a commit - like commitWithin or something.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> On Mon, Oct 5, 2009 at 12:44 PM, Giovanni Fernandez-Kincade
>> <gfernandez-kinc...@capitaliq.com> wrote:
>>> Is there somewhere other than solrConfig.xml that the autoCommit feature is
>>> enabled? I've looked through that file and found autocommit to be commented
>>> out:
>>>
>>> <!--
>>>   Perform a <commit/> automatically under certain conditions:
>>>   maxDocs - number of updates since last commit is greater than this
>>>   maxTime - oldest uncommited update (in ms) is this long ago
>>>   <autoCommit>
>>>     <maxDocs>10000</maxDocs>
>>>     <maxTime>1000</maxTime>
>>>   </autoCommit>
>>> -->
>>>
>>> -----Original Message-----
>>> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
>>> Sent: Monday, October 05, 2009 12:40 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: RE: Solr Timeouts
>>>
>>> Actually, ignore my other response.
>>>
>>> I believe you are committing, whether you know it or not.
>>>
>>> This is in your provided stack trace
>>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>>>
>>> I think Yonik gave you additional information for how to make it faster.
>>>
>>> -Todd
>>>
>>> -----Original Message-----
>>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
>>> Sent: Monday, October 05, 2009 9:30 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: RE: Solr Timeouts
>>>
>>> I'm not committing at all actually - I'm waiting for all 6 million to be
>>> done.
>>>
>>> -----Original Message-----
>>> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
>>> Sent: Monday, October 05, 2009 12:10 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: RE: Solr Timeouts
>>>
>>> How often are you committing?
>>>
>>> Every time you commit, Solr will close the old index and open the new one.
>>> If you are doing this in parallel from multiple jobs (4-5 you mention) then
>>> eventually the server gets behind and you start to pile up commit requests.
>>> Once this starts to happen, it will cascade out of control if the rate of
>>> commits isn't slowed.
>>>
>>> -Todd
>>>
>>> ________________________________
>>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
>>> Sent: Monday, October 05, 2009 9:04 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Solr Timeouts
>>>
>>> Hi,
>>>
>>> I'm attempting to index approximately 6 million HTML/Text files using SOLR
>>> 1.4/Tomcat6 on Windows Server 2003 x64. I'm running 64-bit Tomcat and JVM.
>>> I've fired up 4-5 different jobs that are making indexing requests using
>>> the ExtractingRequestHandler, and everything works well for about 30-40
>>> minutes, after which all indexing requests start timing out. I profiled the
>>> server and found that all of the threads are getting blocked by this call
>>> to flush the Lucene index to disk (see below).
>>>
>>> This leads me to a few questions:
>>>
>>> 1. Is this normal?
>>>
>>> 2. Can I reduce the frequency with which this happens somehow? I've
>>> greatly increased the indexing options in SolrConfig.xml (attached here) to
>>> no avail.
>>>
>>> 3. During these flushes, resource utilization (CPU, I/O, Memory
>>> Consumption) is significantly down compared to when requests are being
>>> handled. Is there any way to make this index go faster? I have plenty of
>>> bandwidth on the machine.
>>>
>>> I appreciate any insight you can provide. We're currently using MS SQL 2005
>>> as our full-text solution and are pretty much miserable. So far SOLR has
>>> been a great experience.
>>>
>>> Thanks,
>>>
>>> Gio.
>>>
>>> http-8080-Processor21 [RUNNABLE] CPU time: 9:51
>>> java.io.RandomAccessFile.seek(long)
>>> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
>>> org.apache.lucene.store.BufferedIndexInput.refill()
>>> org.apache.lucene.store.BufferedIndexInput.readByte()
>>> org.apache.lucene.store.IndexInput.readVInt()
>>> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
>>> org.apache.lucene.index.SegmentTermEnum.next()
>>> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
>>> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
>>> org.apache.lucene.index.TermInfosReader.get(Term)
>>> org.apache.lucene.index.SegmentTermDocs.seek(Term)
>>> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
>>> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
>>> org.apache.lucene.index.IndexWriter.applyDeletes()
>>> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
>>> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
>>> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
>>> org.apache.lucene.index.IndexWriter.closeInternal(boolean)
>>> org.apache.lucene.index.IndexWriter.close(boolean)
>>> org.apache.lucene.index.IndexWriter.close()
>>> org.apache.solr.update.SolrIndexWriter.close()
>>> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
>>> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
>>> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
>>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
>>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
>>> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>>> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
>>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
>>> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>>> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>>> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>>> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>>> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>>> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>>> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
>>> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
>>> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
>>> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
>>> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
>>> java.lang.Thread.run()
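
For anyone following the thread: here is a minimal sketch, not taken from any of the posters, of one way to double-check the "every job sends commit=false" claim by building the extract URL in one place and inspecting its query string. The endpoint and parameter names are copied from the request quoted in the thread. The helper names build_extract_url and requests_commit are hypothetical, and treating only commit=true as a commit request is an assumption about how the client sets the flag.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_extract_url(base, literals, resource_name, commit=False):
    """Build an extract-handler URL like the one quoted in the thread.

    Hypothetical helper: `literals` maps field names to values and is sent
    as literal.<name>=<value>; `commit` is always written out explicitly.
    """
    params = [(f"literal.{name}", value) for name, value in literals.items()]
    params.append(("resource.name", resource_name))
    params.append(("commit", "true" if commit else "false"))
    return f"{base}?{urlencode(params)}"


def requests_commit(url):
    """Return True if the URL's query string carries commit=true.

    Assumption: the client only ever sends the literal values true/false.
    """
    query = parse_qs(urlsplit(url).query)
    return any(value.lower() == "true" for value in query.get("commit", []))
```

Running requests_commit over the URLs each indexing job actually sends (for example, pulled from the Tomcat access log) would confirm or rule out a client-side commit. It says nothing about commits triggered on the server side.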