Ok. Guess that isn't a problem. :)

A second consideration... I could see lock contention being an issue with 
multiple clients indexing at once. Is there any disadvantage to serializing the 
clients to remove lock contention?
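
To make the question concrete, here is a minimal sketch in Python of what serializing the clients could look like: the jobs keep producing documents in parallel, but a single sender thread is the only one that talks to the server. The names `send_to_solr` and `run_serialized` are hypothetical, and `send_to_solr` is only a stand-in for the real HTTP request, not a Solr API. Whether this actually helps throughput is exactly the open question.

```python
import queue
import threading

def send_to_solr(doc, sent):
    # Hypothetical stand-in for the real HTTP indexing request;
    # it only records the document so the sketch runs on its own.
    sent.append(doc)

def run_serialized(jobs):
    """Feed documents from several producer jobs through one sender thread."""
    pending = queue.Queue(maxsize=100)  # bounded, so producers block instead of piling up
    sent = []
    done = object()                     # sentinel that tells the sender to stop

    def sender():
        while True:
            doc = pending.get()
            if doc is done:
                break
            send_to_solr(doc, sent)     # only this thread ever talks to the server

    def producer(docs):
        for doc in docs:
            pending.put(doc)

    worker = threading.Thread(target=sender)
    worker.start()
    producers = [threading.Thread(target=producer, args=(docs,)) for docs in jobs]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    pending.put(done)                   # all producers are finished; stop the sender
    worker.join()
    return sent

if __name__ == "__main__":
    result = run_serialized([["a1", "a2"], ["b1", "b2"], ["c1"]])
    print(len(result), "documents sent by a single thread")
```

The bounded queue is a deliberate choice in this sketch: if the server falls behind, the producers slow down rather than queueing an unbounded backlog.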

-Todd

-----Original Message-----
From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com] 
Sent: Monday, October 05, 2009 9:30 AM
To: solr-user@lucene.apache.org
Subject: RE: Solr Timeouts

Actually, I'm not committing at all. I'm waiting until all 6 million are done.

-----Original Message-----
From: Feak, Todd [mailto:todd.f...@smss.sony.com] 
Sent: Monday, October 05, 2009 12:10 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr Timeouts

How often are you committing?

Every time you commit, Solr closes the old index and opens the new one. If 
you are committing in parallel from multiple jobs (the 4-5 you mention), the 
server eventually falls behind and commit requests start to pile up. Once 
that happens, it will cascade out of control unless the rate of commits is 
slowed.
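
As an illustration of slowing the commit rate, here is a minimal sketch in Python, assuming all jobs share one counter and a commit is issued only every N added documents, plus once at the end. The `CommitThrottle` class, the `commit` callable, and the threshold are all hypothetical; nothing here is a Solr API.

```python
import threading

class CommitThrottle:
    """Shared by all indexing jobs; triggers a commit only every `every` documents."""

    def __init__(self, commit, every=100_000):
        self._commit = commit        # hypothetical callable that issues the real commit
        self._every = every
        self._since_last = 0
        self._lock = threading.Lock()

    def document_added(self):
        # The lock is held during the commit, so commits never overlap.
        with self._lock:
            self._since_last += 1
            if self._since_last >= self._every:
                self._commit()
                self._since_last = 0

    def finish(self):
        # Final commit for whatever is left after the last full batch.
        with self._lock:
            if self._since_last:
                self._commit()
                self._since_last = 0

if __name__ == "__main__":
    commits = []
    throttle = CommitThrottle(commit=lambda: commits.append("commit"), every=1000)
    for _ in range(2500):
        throttle.document_added()
    throttle.finish()
    print(len(commits))  # prints 3: two full batches plus the final partial one
```

Each job would call `document_added()` after a successful add, and whichever process coordinates the jobs would call `finish()` once they are all done.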

-Todd

________________________________
From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
Sent: Monday, October 05, 2009 9:04 AM
To: solr-user@lucene.apache.org
Subject: Solr Timeouts

Hi,
I'm attempting to index approximately 6 million HTML/text files using Solr 
1.4 on Tomcat 6, running on Windows Server 2003 x64 with a 64-bit Tomcat and 
JVM. I've fired up 4-5 different jobs that make indexing requests through the 
ExtractingRequestHandler. Everything works well for about 30-40 minutes, 
after which all indexing requests start timing out. I profiled the server and 
found that all of the threads are blocked by this call to flush the 
Lucene index to disk (see below).

This leads me to a few questions:

1. Is this normal?

2. Can I reduce the frequency with which this happens somehow? I've 
greatly increased the indexing options in solrconfig.xml (attached here) to no 
avail.

3. During these flushes, resource utilization (CPU, I/O, memory 
consumption) is significantly lower than when requests are being handled. 
Is there any way to make this indexing go faster? I have plenty of capacity to 
spare on the machine.

I appreciate any insight you can provide. We're currently using MS SQL 2005 as 
our full-text solution and are pretty much miserable. So far Solr has been a 
great experience.

Thanks,
Gio.

http-8080-Processor21 [RUNNABLE] CPU time: 9:51
java.io.RandomAccessFile.seek(long)
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
org.apache.lucene.store.BufferedIndexInput.refill()
org.apache.lucene.store.BufferedIndexInput.readByte()
org.apache.lucene.store.IndexInput.readVInt()
org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
org.apache.lucene.index.SegmentTermEnum.next()
org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
org.apache.lucene.index.TermInfosReader.get(Term, boolean)
org.apache.lucene.index.TermInfosReader.get(Term)
org.apache.lucene.index.SegmentTermDocs.seek(Term)
org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
org.apache.lucene.index.IndexWriter.applyDeletes()
org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
org.apache.lucene.index.IndexWriter.closeInternal(boolean)
org.apache.lucene.index.IndexWriter.close(boolean)
org.apache.lucene.index.IndexWriter.close()
org.apache.solr.update.SolrIndexWriter.close()
org.apache.solr.update.DirectUpdateHandler2.closeWriter()
org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
java.lang.Thread.run()


