Do you have to make a new call to optimize to make it start the merge again?
-----Original Message-----
From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com]
Sent: Monday, October 12, 2009 7:28 PM
To: solr-user@lucene.apache.org
Subject: Re: Lucene Merge Threads

Try this in solrconfig.xml:

<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <int name="maxThreadCount">1</int>
</mergeScheduler>

Yes you can stop the process mid-merge. The partially merged files will be deleted on restart.

We need to update the wiki?

On Mon, Oct 12, 2009 at 4:05 PM, Giovanni Fernandez-Kincade <gfernandez-kinc...@capitaliq.com> wrote:
> Hi,
> I'm attempting to optimize a pretty large index, and even though the optimize
> request timed out, I watched it using a profiler and saw that the optimize
> thread continued executing. Eventually it completed, but in the background I
> still see a thread performing a merge:
>
> Lucene Merge Thread #0 [RUNNABLE, IN_NATIVE] CPU time: 17:51
> java.io.RandomAccessFile.readBytes(byte[], int, int)
> java.io.RandomAccessFile.read(byte[], int, int)
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
> org.apache.lucene.store.BufferedIndexInput.refill()
> org.apache.lucene.store.BufferedIndexInput.readByte()
> org.apache.lucene.store.IndexInput.readVInt()
> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
> org.apache.lucene.index.SegmentTermEnum.next()
> org.apache.lucene.index.SegmentMergeInfo.next()
> org.apache.lucene.index.SegmentMerger.mergeTermInfos(FormatPostingsFieldsConsumer)
> org.apache.lucene.index.SegmentMerger.mergeTerms()
> org.apache.lucene.index.SegmentMerger.merge(boolean)
> org.apache.lucene.index.IndexWriter.mergeMiddle(MergePolicy$OneMerge)
> org.apache.lucene.index.IndexWriter.merge(MergePolicy$OneMerge)
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(MergePolicy$OneMerge)
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run()
>
> This has taken quite a while, and hasn't really been fully utilizing the
> machine's resources. After looking at the Lucene source, I noticed that you
> can set a MaxThreadCount parameter in this class. Is this parameter exposed
> by Solr somehow? I see the class mentioned, commented out, in my
> solrconfig.xml, but I'm not sure of the correct way to specify the parameter:
>
> <!--
>   Expert:
>   The Merge Scheduler in Lucene controls how merges are performed. The
>   ConcurrentMergeScheduler (Lucene 2.3 default)
>   can perform merges in the background using separate threads. The
>   SerialMergeScheduler (Lucene 2.2 default) does not.
> -->
>
> <!--<mergeScheduler>org.apache.lucene.index.ConcurrentMergeScheduler</mergeScheduler>-->
>
> Also, if I can specify this parameter, is it safe to just start/stop my
> servlet server (Tomcat) mid-merge?
>
> Thanks in advance,
> Gio.
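
Note for readers of the archive: before restarting Tomcat with an edited solrconfig.xml, it can help to check that the mergeScheduler fragment suggested in the reply is well-formed and carries the value you intended. The following is only a minimal sketch using Python's standard library; the helper name `read_merge_scheduler` is hypothetical, and Solr does its own parsing, so this checks the XML and nothing about how Solr applies it.

```python
import xml.etree.ElementTree as ET

# The fragment suggested in the thread, copied verbatim.
SNIPPET = """
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <int name="maxThreadCount">1</int>
</mergeScheduler>
"""


def read_merge_scheduler(xml_text):
    """Return (class_name, max_thread_count) from a <mergeScheduler> fragment.

    Hypothetical helper for illustration only.
    max_thread_count is None when no <int name="maxThreadCount"> child exists.
    """
    # ET.fromstring raises xml.etree.ElementTree.ParseError on malformed XML.
    root = ET.fromstring(xml_text.strip())
    if root.tag != "mergeScheduler":
        raise ValueError("expected <mergeScheduler>, got <%s>" % root.tag)

    max_threads = None
    for child in root.findall("int"):
        if child.get("name") == "maxThreadCount":
            max_threads = int(child.text.strip())
    return root.get("class"), max_threads


if __name__ == "__main__":
    print(read_merge_scheduler(SNIPPET))
```

A malformed fragment (for example, an unclosed `<int>` element) fails at the `ET.fromstring` call, which is the point of the check.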