Yep. Here's Mike's classic video: http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
The third visualization down "TieredMergePolicy" is the default.

Best,
Erick

On Wed, Jan 13, 2016 at 6:52 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> Hi Erick,
>
> Thanks for your reply.
>
> So those small segments that I found is probably due to a commit happening
> during that time?
>
> I also found that those small segments are created during the last
> indexing. If I start another batch of indexing, those small segments will
> probably be get merge together to form a 10GB segment, as I have defined
> the maxMergeSegmentMB to be 10240MB. Then there will be other new small
> segments that are formed from the latest batch of indexing. Is that the way
> it works?
>
> Regards,
> Edwin
>
>
> On 14 January 2016 at 10:38, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> ramBufferSizeMB is a _limit_ that flushes the buffer when
>> it is reached (actually, I think, it indexes a doc _then_
>> checks the size and if it's > the setting, flushes the
>> buffer. So technically you can exceed the buffer size by
>> your biggest doc's addition to the index).
>>
>> But I digress. This is a _limit_. If a commit happens (either
>> an autocommit or client-initiated commit or a commitWithin)
>> then the segment is flushed without regard to ramBufferSizeMB.
>>
>> Best,
>> Erick
>>
>> On Wed, Jan 13, 2016 at 5:44 PM, Zheng Lin Edwin Yeo
>> <edwinye...@gmail.com> wrote:
>> > Hi,
>> >
>> > I would like to check, if I have make the following settings for
>> > ramBufferSizeMB, and I am using TieredMergePolicy, am I supposed to get
>> > each segment size of at least 320MB?
>> >
>> >
>> >     <!-- ramBufferSizeMB sets the amount of RAM that may be used by Lucene
>> >          indexing for buffering added documents and deletions before they are
>> >          flushed to the Directory.
>> >          maxBufferedDocs sets a limit on the number of documents buffered
>> >          before flushing.
>> >          If both ramBufferSizeMB and maxBufferedDocs is set, then
>> >          Lucene will flush based on whichever limit is hit first.
>> >          The default is 100 MB.  -->
>> >     <ramBufferSizeMB>320</ramBufferSizeMB>
>> >     <!--<maxBufferedDocs>1000</maxBufferedDocs>-->
>> >
>> >
>> >     <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>> >       <int name="maxMergeAtOnce">10</int>
>> >       <int name="segmentsPerTier">10</int>
>> >       <double name="maxMergedSegmentMB">10240</double>
>> >     </mergePolicy>
>> >
>> >
>> > I have this setting in my solrconfig.xml, but when I checked my segments
>> > size under the Segments info screen on the Admin UI, I see quite a number
>> > of segments at the bottom which have size that are much smaller than 320MB.
>> > Is that the correct behaviour, or is my ramBufferSizeMB not working
>> > correctly?
>> >
>> > I am using Solr 5.4.0,
>> >
>> >
>> > Regards,
>> > Edwin
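
For anyone who finds the flush rule in the thread above easier to read as code, here is a toy sketch in Python. It is not Lucene or Solr code: the class ToyIndexWriter and its methods add_document and commit are made-up names for illustration. It only models the two rules described above (the limit is checked after a document is buffered, and a commit flushes whatever is buffered), and it ignores the fact that a segment's on-disk size can differ from its size in the RAM buffer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyIndexWriter:
    """Toy model of the flush rules from the thread (hypothetical, not Lucene)."""
    ram_buffer_mb: float                  # stands in for ramBufferSizeMB
    buffered_mb: float = 0.0              # current size of the in-memory buffer
    segments_mb: List[float] = field(default_factory=list)  # flushed segment sizes

    def add_document(self, size_mb: float) -> None:
        # The document is buffered first and the limit is checked afterwards,
        # so a flushed segment can exceed the limit by up to one document.
        self.buffered_mb += size_mb
        if self.buffered_mb > self.ram_buffer_mb:
            self._flush()

    def commit(self) -> None:
        # A commit flushes whatever is buffered, however small it is.
        if self.buffered_mb > 0:
            self._flush()

    def _flush(self) -> None:
        self.segments_mb.append(self.buffered_mb)
        self.buffered_mb = 0.0


writer = ToyIndexWriter(ram_buffer_mb=320)
for _ in range(33):      # the 33rd 10 MB document pushes the buffer past 320 MB
    writer.add_document(10)
for _ in range(5):       # only 50 MB is buffered when the commit arrives
    writer.add_document(10)
writer.commit()
print(writer.segments_mb)
```

In this toy run the first segment is flushed by the size limit and the second, much smaller one is flushed by the commit, which is the pattern of small trailing segments reported in the original question.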