Ok, thanks. I'm studying the RAM buffer/MergePolicy nexus as we speak. I hereby name the function "minimum number of coins and bills needed to represent a number" that number's "change log".
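For what it's worth, here is a rough sketch of that function, assuming the "coins" are powers of mergeFactor (1, 5, 25, ...). This is only the idealized change-making model from my earlier mail, not what Lucene actually does when it flushes or merges, and the name `change_log` is my own invention, not anything in Lucene or Solr.

```python
def change_log(num_docs, merge_factor):
    """Idealized 'making change' model: represent num_docs using the
    fewest segments whose sizes are powers of merge_factor.

    Hypothetical helper for illustration only; it does not model
    Lucene's RAM-buffer flushes or its real merge policy.
    Returns a dict mapping segment size -> number of segments of that size.
    """
    if num_docs < 0 or merge_factor < 2:
        raise ValueError("need num_docs >= 0 and merge_factor >= 2")
    segments = {}
    size = 1
    # Peel off base-merge_factor digits, least significant first.
    # Each digit is the number of segments at that size.
    while num_docs > 0:
        num_docs, count = divmod(num_docs, merge_factor)
        if count:
            segments[size] = count
        size *= merge_factor
    return segments


if __name__ == "__main__":
    result = change_log(42, 5)
    print(result)                # segment size -> count
    print(sum(result.values()))  # total number of "coins"
```

For 42 documents and mergeFactor=5 this gives one 25-doc piece, three 5-doc pieces, and two single documents, i.e. the base-5 digits of 42. Since each denomination divides the next, taking the digits this way is also the minimum number of "coins".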
On Tue, Apr 6, 2010 at 2:08 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
> Actually this isn't quite right.
>
> Lucene flushes a new segment whenever RAM is full (not every 5 docs if
> mergeFactor is 5).
>
> Whereas mergeFactor decides how many segments of roughly the same size
> are merged at once.
>
> So eg if you index 42 docs, unless the docs are immense (or, are not
> indexed in a single session), that will create 1 segment.
>
> Mike
>
> On Mon, Apr 5, 2010 at 6:21 PM, Lance Norskog <goks...@gmail.com> wrote:
>> mergeFactor=5 means that if there are 42 documents, there will be 3 index
>> files:
>>
>> 1 with 25 documents,
>> 3 with 5 documents, and
>> 1 with 2 documents
>>
>> Imagine making change with coins of 1 document, 5 documents, 5^2
>> documents, 5^3 documents, etc.
>>
>> On Mon, Apr 5, 2010 at 10:59 AM, Chris Hostetter
>> <hossman_luc...@fucit.org> wrote:
>>>
>>> This sounds completely normal from what I remember about mergeFactor.
>>>
>>> Segments are merged "by level", meaning that with a mergeFactor of 5, once
>>> 5 "level 1" segments are formed they are merged into a single "level 2"
>>> segment. Then 5 more "level 1" segments are allowed to form before the
>>> next merge (resulting in 2 "level 2" segments). Once you have 5 "level 2"
>>> segments, they are all merged into a single "level 3" segment, etc...
>>>
>>> : I had my mergeFactor as 5,
>>> : but when I loaded data with some 1,00,000 I got some 12 .cfs files in my
>>> : data/index folder.
>>> :
>>> : How is this possible?
>>> : In what context can we have more .cfs files?
>>>
>>> -Hoss

--
Lance Norskog
goks...@gmail.com