Ok, thanks. I'm studying the RAM buffer/MergePolicy nexus as we speak.

I hereby name the function "minimum number of coins and bills needed
to represent a number" as its "change log".
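
A rough sketch of that function, assuming "coins" worth mergeFactor^k
documents each (the names change_breakdown and change_log are just my
own labels here). This is only the toy coin model from my earlier
message; per Mike's correction below, it is not how Lucene actually
decides when to flush segments.

```python
def change_breakdown(n, merge_factor=5):
    """Split n documents into 'coins' of size merge_factor**k.

    Returns a dict {coin_size: count}. Because every coin size is a
    power of merge_factor, this is just the base-merge_factor digits
    of n, which is also the minimum number of coins.
    """
    if n < 0 or merge_factor < 2:
        raise ValueError("need n >= 0 and merge_factor >= 2")
    breakdown = {}
    size = 1
    while n > 0:
        # Peel off the lowest base-merge_factor digit.
        n, digit = divmod(n, merge_factor)
        if digit:
            breakdown[size] = digit
        size *= merge_factor
    return breakdown


def change_log(n, merge_factor=5):
    """Minimum number of coins needed to represent n."""
    return sum(change_breakdown(n, merge_factor).values())


print(change_breakdown(42))  # {1: 2, 5: 3, 25: 1}
print(change_log(42))        # 6
```

For 42 documents this gives one coin of 25, three of 5 and two of 1,
so six coins in total; my earlier message lumped the last 2 documents
into a single file instead of counting them as two 1-document coins.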

On Tue, Apr 6, 2010 at 2:08 AM, Michael McCandless
<luc...@mikemccandless.com> wrote:
> Actually this isn't quite right.
>
> Lucene flushes a new segment whenever RAM is full (not every 5 docs if
> mergeFactor is 5).
>
> Whereas mergeFactor decides how many segments of roughly the same size
> are merged at once.
>
> So eg if you index 42 docs, unless the docs are immense (or, are not
> indexed in a single session), that will create 1 segment.
>
> Mike
>
> On Mon, Apr 5, 2010 at 6:21 PM, Lance Norskog <goks...@gmail.com> wrote:
>> mergeFactor=5 means that if there are 42 documents, there will be 3
>> index files:
>>
>> 1 with 25 documents,
>> 3 with 5 documents, and
>> 1 with 2 documents
>>
>> Imagine making change with coins of 1 document, 5 documents, 5^2
>> documents, 5^3 documents, etc.
>>
>> On Mon, Apr 5, 2010 at 10:59 AM, Chris Hostetter
>> <hossman_luc...@fucit.org> wrote:
>>>
>>> This sounds completely normal from what I remember about mergeFactor.
>>>
>>> Segments are merged "by level", meaning that with a mergeFactor of 5, once
>>> 5 "level 1" segments are formed they are merged into a single "level 2"
>>> segment. Then 5 more "level 1" segments are allowed to form before the
>>> next merge (resulting in 2 "level 2" segments). Once you have 5 "level 2"
>>> segments, they are all merged into a single "level 3" segment, etc...
>>>
>>> : I had my mergeFactor set to 5,
>>> : but when I loaded some data (about 1,00,000) I got some 12 .cfs files
>>> : in my data/index folder.
>>> :
>>> : How is this possible?
>>> : In what context can we have more .cfs files?
>>>
>>>
>>> -Hoss
>>>
>>>
>>
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
>



-- 
Lance Norskog
goks...@gmail.com
