I'm fairly certain that all of the indexing jobs are calling SOLR with commit=false. They all build their indexing URLs with a CLR function I wrote; it takes a Commit parameter, and that parameter is always set to false.
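
For reference, the URL construction described above can be sketched roughly as follows. This is a Python illustration, not the actual CLR function: `build_extract_url` and `has_commit_false` are hypothetical names, and the host and literal fields are borrowed from the sample request quoted further down in this thread.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_extract_url(base_url, literals, resource_name, commit=False):
    """Build an /update/extract URL with an explicit commit=true|false flag.

    Hypothetical helper: the parameter names mirror the sample request in
    this thread, not any official client API.
    """
    params = [("literal." + name, str(value)) for name, value in literals.items()]
    params.append(("resource.name", resource_name))
    # Always send the flag explicitly so the request never relies on a default.
    params.append(("commit", "true" if commit else "false"))
    return base_url.rstrip("/") + "/update/extract/?" + urlencode(params)


def has_commit_false(url):
    """Return True only if the URL carries exactly one commit=false parameter."""
    query = parse_qs(urlsplit(url).query)
    return query.get("commit") == ["false"]


url = build_extract_url(
    "http://titans:8080/solr",
    {
        "versionId": 684936,
        "filingDate": "1997-12-04T00:00:00Z",
        "formTypeId": 95,
        "companyId": 3567904,
        "sourceId": 0,
    },
    "684936.txt",
)
print(url)
print(has_commit_false(url))
```

A check like `has_commit_false` could be run over the URLs each job actually sends (for example, from the Tomcat access log) to confirm that none of them slips through with commit=true.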
Also, I don't see any calls to commit in the Tomcat logs (whereas normally when I make a commit call I do).

This suggests that Solr is doing it automatically, but the extract handler doesn't seem to be the problem:

  <requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler" startup="lazy">
    <lst name="defaults">
      <str name="uprefix">ignored_</str>
      <str name="map.content">fileData</str>
    </lst>
  </requestHandler>

There is no external config file specified, and I don't see anything about commits here.

I've tried setting up more detailed indexer logging but haven't been able to get it to work:

  <infoStream file="c:\solr\indexer.log">true</infoStream>

I tried relative and absolute paths, but no dice so far.

Any other ideas?

-Gio.

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Monday, October 05, 2009 12:52 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Timeouts

> This is what one of my SOLR requests look like:
>
> http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false

Have you verified that all of your indexing jobs (you said you had 4 or 5) have commit=false?

Also make sure that your extract handler doesn't have a default of something that could cause a commit - like commitWithin or something.

-Yonik
http://www.lucidimagination.com

On Mon, Oct 5, 2009 at 12:44 PM, Giovanni Fernandez-Kincade <gfernandez-kinc...@capitaliq.com> wrote:
> Is there somewhere other than solrConfig.xml that the autoCommit feature is enabled?
> I've looked through that file and found autocommit to be commented out:
>
> <!--
>     Perform a <commit/> automatically under certain conditions:
>     maxDocs - number of updates since last commit is greater than this
>     maxTime - oldest uncommited update (in ms) is this long ago
>     <autoCommit>
>       <maxDocs>10000</maxDocs>
>       <maxTime>1000</maxTime>
>     </autoCommit>
> -->
>
> -----Original Message-----
> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
> Sent: Monday, October 05, 2009 12:40 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Solr Timeouts
>
> Actually, ignore my other response.
>
> I believe you are committing, whether you know it or not.
>
> This is in your provided stack trace
>
> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>
> I think Yonik gave you additional information for how to make it faster.
>
> -Todd
>
> -----Original Message-----
> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
> Sent: Monday, October 05, 2009 9:30 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Solr Timeouts
>
> I'm not committing at all actually - I'm waiting for all 6 million to be done.
>
> -----Original Message-----
> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
> Sent: Monday, October 05, 2009 12:10 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Solr Timeouts
>
> How often are you committing?
>
> Every time you commit, Solr will close the old index and open the new one.
> If you are doing this in parallel from multiple jobs (4-5 you mention) then
> eventually the server gets behind and you start to pile up commit requests.
> Once this starts to happen, it will cascade out of control if the rate of
> commits isn't slowed.
>
> -Todd
>
> ________________________________
> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
> Sent: Monday, October 05, 2009 9:04 AM
> To: solr-user@lucene.apache.org
> Subject: Solr Timeouts
>
> Hi,
>
> I'm attempting to index approximately 6 million HTML/Text files using SOLR
> 1.4/Tomcat6 on Windows Server 2003 x64. I'm running 64 bit Tomcat and JVM.
> I've fired up 4-5 different jobs that are making indexing requests using the
> ExtractionRequestHandler, and everything works well for about 30-40 minutes,
> after which all indexing requests start timing out. I profiled the server and
> found that all of the threads are getting blocked by this call to flush the
> Lucene index to disk (see below).
>
> This leads me to a few questions:
>
> 1. Is this normal?
>
> 2. Can I reduce the frequency with which this happens somehow? I've
>    greatly increased the indexing options in SolrConfig.xml (attached here)
>    to no avail.
>
> 3. During these flushes, resource utilization (CPU, I/O, Memory
>    Consumption) is significantly down compared to when requests are being
>    handled. Is there any way to make this index go faster? I have plenty of
>    bandwidth on the machine.
>
> I appreciate any insight you can provide. We're currently using MS SQL 2005
> as our full-text solution and are pretty much miserable. So far SOLR has been
> a great experience.
>
> Thanks,
> Gio.
>
> http-8080-Processor21 [RUNNABLE] CPU time: 9:51
> java.io.RandomAccessFile.seek(long)
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
> org.apache.lucene.store.BufferedIndexInput.refill()
> org.apache.lucene.store.BufferedIndexInput.readByte()
> org.apache.lucene.store.IndexInput.readVInt()
> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
> org.apache.lucene.index.SegmentTermEnum.next()
> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
> org.apache.lucene.index.TermInfosReader.get(Term)
> org.apache.lucene.index.SegmentTermDocs.seek(Term)
> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
> org.apache.lucene.index.IndexWriter.applyDeletes()
> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
> org.apache.lucene.index.IndexWriter.closeInternal(boolean)
> org.apache.lucene.index.IndexWriter.close(boolean)
> org.apache.lucene.index.IndexWriter.close()
> org.apache.solr.update.SolrIndexWriter.close()
> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
> java.lang.Thread.run()
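
The config checks discussed in this thread (an active autoCommit block, or a commit-related default on a request handler) could be automated with something like the sketch below. It is only an illustration in Python: `find_commit_triggers` is a hypothetical helper, the `<config>`/`<updateHandler>` wrapper in the sample is an assumed layout, and treating `commit` and `commitWithin` as the parameters worth flagging is an assumption based on Yonik's suggestion rather than a complete list.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the request parameters worth flagging in handler defaults.
COMMIT_PARAMS = {"commit", "commitWithin"}


def find_commit_triggers(solrconfig_xml):
    """Return notes on settings in a solrconfig.xml string that could cause commits."""
    # The default ElementTree parser drops XML comments, so a commented-out
    # <autoCommit> block is not reported.
    root = ET.fromstring(solrconfig_xml)
    findings = []
    for auto in root.iter("autoCommit"):
        settings = {child.tag: (child.text or "").strip() for child in auto}
        findings.append("autoCommit enabled: %s" % settings)
    for handler in root.iter("requestHandler"):
        # Handler parameter lists such as <lst name="defaults">.
        for lst in handler.findall("lst"):
            for param in lst:
                if param.get("name") in COMMIT_PARAMS:
                    findings.append("%s: %s %s=%s" % (
                        handler.get("name"), lst.get("name"),
                        param.get("name"), (param.text or "").strip()))
    return findings


# Sample built from the fragments quoted in this thread; the surrounding
# <config> and <updateHandler> elements are an assumed layout.
SAMPLE = """<config>
  <updateHandler class="solr.DirectUpdateHandler2">
    <!--
    <autoCommit>
      <maxDocs>10000</maxDocs>
      <maxTime>1000</maxTime>
    </autoCommit>
    -->
  </updateHandler>
  <requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler" startup="lazy">
    <lst name="defaults">
      <str name="uprefix">ignored_</str>
      <str name="map.content">fileData</str>
    </lst>
  </requestHandler>
</config>"""

for note in find_commit_triggers(SAMPLE):
    print(note)
```

With the fragments as quoted, nothing is flagged, which matches what was found by reading the file by hand. It says nothing about commits requested by clients, so the request URLs still need to be checked separately.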