'mergeFactor' should be 5 or 10, not 40,000. With a value that high, Solr can end up opening thousands of small files, and this will not work well.
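
As a rough sketch (not tested against your setup), the change would look something like this in solrconfig.xml. I picked 10 here; 5 would do as well. Your config sets mergeFactor in both <indexDefaults> and <mainIndex>, so it would need changing in both places. Only the changed element is shown; everything else in each section stays as it is.

```xml
<indexDefaults>
  <!-- was 40000; other elements in this section unchanged -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>

<mainIndex>
  <!-- was 40000; other elements in this section unchanged -->
  <mergeFactor>10</mergeFactor>
</mainIndex>
```
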
ramBufferSizeMB is set to 1G, but the entire Solr JVM only has 1G allocated (-Xmx1024M), so there may be a lot of garbage collection. Try 50 to 100 MB for ramBufferSizeMB. A 1G heap is also a little small for indexing large numbers of fulltext documents.

On Wed, Jun 9, 2010 at 2:59 AM, Danyal Mark <mark.dan...@gmail.com> wrote:
>
> We have following solr configuration:
>
> java -Xms512M -Xmx1024M -Dsolr.solr.home=<solr home directory> -jar start.jar
>
> in SolrConfig.xml
>
> <indexDefaults>
>   <useCompoundFile>false</useCompoundFile>
>   <mergeFactor>40000</mergeFactor>
>   <maxBufferedDocs>200000</maxBufferedDocs>
>   <ramBufferSizeMB>1024</ramBufferSizeMB>
>   <maxFieldLength>10000</maxFieldLength>
>   <writeLockTimeout>1000</writeLockTimeout>
>   <commitLockTimeout>10000</commitLockTimeout>
>   <lockType>native</lockType>
> </indexDefaults>
>
> <mainIndex>
>   <useCompoundFile>false</useCompoundFile>
>   <ramBufferSizeMB>1024</ramBufferSizeMB>
>   <mergeFactor>40000</mergeFactor>
>   <!-- Deprecated -->
>   <!--<maxBufferedDocs>10</maxBufferedDocs>-->
>   <!--<maxMergeDocs>2147483647</maxMergeDocs>-->
>   <unlockOnStartup>false</unlockOnStartup>
>   <reopenReaders>true</reopenReaders>
>   <deletionPolicy class="solr.SolrDeletionPolicy">
>     <str name="maxCommitsToKeep">1</str>
>     <str name="maxOptimizedCommitsToKeep">0</str>
>   </deletionPolicy>
>   <infoStream file="INFOSTREAM.txt">false</infoStream>
> </mainIndex>
>
> Also, we have used autoCommit=false. We have our PC spec:
>
> Core2-Duo
> 2GB RAM
> Solr Server running in localhost
> Index Directory is also in local FileSystem
> Input Fulltext files using remoteStreaming from another PC
>
> Here, when we indexed 100000 Fulltext documents, the total time taken is
> 40mins. We want to optimize the time lesser to this. We have been studying
> on UpdateRequestProcessorChain section
>
> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
>   <lst name="defaults">
>     <str name="update.processor">dedupe</str>
>   </lst>
> </requestHandler>
>
> How to use this UpdateRequestProcessorChain in /update/extract/ to run
> indexing in multiple chains (i.e multiple threads). Can you suggest me if I
> can optimize the process changing any of these configurations?
>
> with regards,
> Danyal Mark
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Index-search-optimization-for-fulltext-remote-streaming-tp828274p881809.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
Lance Norskog
goks...@gmail.com