On Mon, Oct 5, 2009 at 12:30 PM, Giovanni Fernandez-Kincade <gfernandez-kinc...@capitaliq.com> wrote:
> I'm not committing at all actually - I'm waiting for all 6 million to be done.

You either have solr auto commit set up, or a client is issuing a commit.

-Yonik
http://www.lucidimagination.com

> -----Original Message-----
> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
> Sent: Monday, October 05, 2009 12:10 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Solr Timeouts
>
> How often are you committing?
>
> Every time you commit, Solr will close the old index and open the new one. If
> you are doing this in parallel from multiple jobs (4-5 you mention) then
> eventually the server gets behind and you start to pile up commit requests.
> Once this starts to happen, it will cascade out of control if the rate of
> commits isn't slowed.
>
> -Todd
>
> ________________________________
> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
> Sent: Monday, October 05, 2009 9:04 AM
> To: solr-user@lucene.apache.org
> Subject: Solr Timeouts
>
> Hi,
> I'm attempting to index approximately 6 million HTML/Text files using SOLR
> 1.4/Tomcat6 on Windows Server 2003 x64. I'm running 64 bit Tomcat and JVM.
> I've fired up 4-5 different jobs that are making indexing requests using the
> ExtractionRequestHandler, and everything works well for about 30-40 minutes,
> after which all indexing requests start timing out. I profiled the server and
> found that all of the threads are getting blocked by this call to flush the
> Lucene index to disk (see below).
>
> This leads me to a few questions:
>
> 1. Is this normal?
>
> 2. Can I reduce the frequency with which this happens somehow? I've
> greatly increased the indexing options in SolrConfig.xml (attached here) to
> no avail.
>
> 3. During these flushes, resource utilization (CPU, I/O, Memory
> Consumption) is significantly down compared to when requests are being
> handled. Is there any way to make this index go faster? I have plenty of
> bandwidth on the machine.
>
> I appreciate any insight you can provide.
> We're currently using MS SQL 2005 as our full-text solution and are pretty
> much miserable. So far SOLR has been a great experience.
>
> Thanks,
> Gio.
>
> http-8080-Processor21 [RUNNABLE] CPU time: 9:51
> java.io.RandomAccessFile.seek(long)
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
> org.apache.lucene.store.BufferedIndexInput.refill()
> org.apache.lucene.store.BufferedIndexInput.readByte()
> org.apache.lucene.store.IndexInput.readVInt()
> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
> org.apache.lucene.index.SegmentTermEnum.next()
> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
> org.apache.lucene.index.TermInfosReader.get(Term)
> org.apache.lucene.index.SegmentTermDocs.seek(Term)
> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
> org.apache.lucene.index.IndexWriter.applyDeletes()
> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
> org.apache.lucene.index.IndexWriter.closeInternal(boolean)
> org.apache.lucene.index.IndexWriter.close(boolean)
> org.apache.lucene.index.IndexWriter.close()
> org.apache.solr.update.SolrIndexWriter.close()
> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
> java.lang.Thread.run()
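
The quoted trace reaches DirectUpdateHandler2.commit by way of RequestHandlerUtils.handleCommit, called from the request handler itself. That looks more consistent with the second possibility raised in the reply (a client asking for a commit as part of an indexing request) than with auto commit, though the trace alone does not prove it. If that is what is happening, one option is to have each job request a commit only on its final document. The following is a minimal Python sketch of that idea, not the poster's actual client: the host, port, handler path, document ids, and the helper name extract_urls are all assumptions made for illustration. literal.id and commit=true are the usual Solr request parameters for the extracting handler.

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port, and handler path are placeholders,
# not values taken from this thread.
EXTRACT_URL = "http://localhost:8080/solr/update/extract"


def extract_urls(doc_ids, commit_at_end=True):
    """Build one extract-request URL per document id.

    Only the last URL carries commit=true, so a whole batch triggers
    a single commit instead of one commit per document.
    (Hypothetical helper, for illustration only.)
    """
    doc_ids = list(doc_ids)
    urls = []
    for i, doc_id in enumerate(doc_ids):
        params = {"literal.id": doc_id}
        is_last = i == len(doc_ids) - 1
        if commit_at_end and is_last:
            params["commit"] = "true"
        urls.append(EXTRACT_URL + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    # Print the URLs a job would POST its files to, in order.
    for url in extract_urls(["doc1", "doc2", "doc3"]):
        print(url)
```

Each job would POST its files to these URLs in order. With 4-5 jobs that comes to 4-5 commits for the whole run rather than one per request.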