Yep. Here's Mike's classic video:
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

The third visualization down, "TieredMergePolicy", is the default.
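
To make the point about flushes concrete, here is a rough Python sketch of the behaviour described in the quoted message below. It is a toy model under stated assumptions, not Lucene code: the class, the method names, the 1 MB document size and the commit interval are all made up for illustration, and the "index the doc, then check the limit" ordering is only my understanding of how the check works.

```python
# Toy model only. Nothing here is a Lucene or Solr API; every name is
# hypothetical and exists just to illustrate the flush rules.

class ToyIndexWriter:
    def __init__(self, ram_buffer_mb):
        self.ram_buffer_mb = ram_buffer_mb
        self.buffered_mb = 0.0
        self.segments_mb = []  # sizes of the segments flushed so far

    def add_document(self, size_mb):
        # Assumption: the doc is indexed first and the limit checked after,
        # so the buffer can overshoot by up to one document.
        self.buffered_mb += size_mb
        if self.buffered_mb > self.ram_buffer_mb:
            self._flush()

    def commit(self):
        # A commit flushes whatever is buffered, however small it is.
        self._flush()

    def _flush(self):
        if self.buffered_mb > 0:
            self.segments_mb.append(self.buffered_mb)
            self.buffered_mb = 0.0


def simulate(commit_every=None, num_docs=1000, doc_mb=1, ram_buffer_mb=320):
    """Index num_docs documents, optionally committing every N docs."""
    writer = ToyIndexWriter(ram_buffer_mb)
    for i in range(1, num_docs + 1):
        writer.add_document(doc_mb)
        if commit_every and i % commit_every == 0:
            writer.commit()
    writer.commit()  # final commit at the end of the indexing run
    return writer.segments_mb


if __name__ == "__main__":
    print(simulate(commit_every=100))  # commits fire before 320 MB is reached
    print(simulate(commit_every=None))  # only the RAM limit and final commit
```

In this toy model, committing every 100 docs yields ten 100 MB segments, so the 320 MB limit is never hit. With no intermediate commits you get three 321 MB segments plus a 37 MB leftover from the final commit, which is the kind of small trailing segment you would see after the last batch. Merging is left out of the sketch entirely.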

Best,
Erick

On Wed, Jan 13, 2016 at 6:52 PM, Zheng Lin Edwin Yeo
<edwinye...@gmail.com> wrote:
> Hi Erick,
>
> Thanks for your reply.
>
> So those small segments that I found are probably due to a commit happening
> during that time?
>
> I also found that those small segments were created during the last
> indexing. If I start another batch of indexing, those small segments will
> probably get merged together to form a 10GB segment, as I have defined
> maxMergedSegmentMB to be 10240MB. Then there will be other new small
> segments formed from the latest batch of indexing. Is that the way
> it works?
>
> Regards,
> Edwin
>
>
> On 14 January 2016 at 10:38, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> ramBufferSizeMB is a _limit_ that flushes the buffer when
>> it is reached (actually, I think, it indexes a doc _then_
>> checks the size and, if it's > the setting, flushes the
>> buffer, so technically you can exceed the buffer size by
>> your biggest doc's addition to the index).
>>
>> But I digress. This is a _limit_. If a commit happens (either
>> an autocommit or client-initiated commit or a commitWithin)
>> then the segment is flushed without regard to ramBufferSizeMB.
>>
>> Best,
>> Erick
>>
>> On Wed, Jan 13, 2016 at 5:44 PM, Zheng Lin Edwin Yeo
>> <edwinye...@gmail.com> wrote:
>> > Hi,
>> >
>> > I would like to check: if I have made the following settings for
>> > ramBufferSizeMB, and I am using TieredMergePolicy, am I supposed to get
>> > each segment size of at least 320MB?
>> >
>> >
>> >     <!-- ramBufferSizeMB sets the amount of RAM that may be used by Lucene
>> >          indexing for buffering added documents and deletions before they are
>> >          flushed to the Directory.
>> >          maxBufferedDocs sets a limit on the number of documents buffered
>> >          before flushing.
>> >          If both ramBufferSizeMB and maxBufferedDocs is set, then
>> >          Lucene will flush based on whichever limit is hit first.
>> >          The default is 100 MB.  -->
>> >         <ramBufferSizeMB>320</ramBufferSizeMB>
>> >         <!--<maxBufferedDocs>1000</maxBufferedDocs>-->
>> >
>> >
>> >         <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>> >           <int name="maxMergeAtOnce">10</int>
>> >           <int name="segmentsPerTier">10</int>
>> >           <double name="maxMergedSegmentMB">10240</double>
>> >         </mergePolicy>
>> >
>> >
>> > I have this setting in my solrconfig.xml, but when I checked my segment
>> > sizes under the Segments info screen on the Admin UI, I saw quite a number
>> > of segments at the bottom whose sizes are much smaller than 320MB.
>> > Is that the correct behaviour, or is my ramBufferSizeMB not working
>> > correctly?
>> >
>> > I am using Solr 5.4.0.
>> >
>> >
>> > Regards,
>> > Edwin
>>
