On Thu, Aug 14, 2008 at 2:01 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:
> Chris Harris <[EMAIL PROTECTED]> wrote:
>> It's my understanding that if my mergeFactor is 10, then there
>> shouldn't be more than 11 segments in my index directory (10 segments,
>> plus an additional segment if a merge is in progress).
>
> Actually, mergeFactor 10 means each *level* will have <= 10 segments,
> where a level is roughly 10X the size of the previous level.
>
> EG after 10 segments (level 0) are flushed, they get merged into a
> single level 1 segment. Another 10 produces another level 1 segment.
> Etc. Until you have 10 level 1 segments, which then get merged into a
> single level 2 segment.
>
> The number of levels you have is logarithmic in the size of your index.

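To make the level scheme in the quoted explanation concrete, here is a toy model of it. This is only a sketch of the counting argument, not Lucene's actual merge policy code: it assumes every flush produces one level 0 segment, that a merge fires the moment a level reaches mergeFactor segments, and it ignores segment sizes, deletes, and merges that are still in progress. The function name `simulate_flushes` is made up for illustration.

```python
def simulate_flushes(num_flushes, merge_factor=10):
    """Return a list where entry i is the number of segments at level i
    after num_flushes level 0 segments have been flushed (toy model)."""
    levels = []  # levels[i] = current number of segments at level i
    for _ in range(num_flushes):
        if not levels:
            levels.append(0)
        levels[0] += 1  # a flush always creates a new level 0 segment

        # Cascade: merge_factor segments at one level become a single
        # segment at the next level up.
        level = 0
        while levels[level] >= merge_factor:
            levels[level] -= merge_factor
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return levels


if __name__ == "__main__":
    for n in (10, 100, 999, 1000):
        per_level = simulate_flushes(n)
        print(n, per_level, "total segments:", sum(per_level))
```

Under these assumptions the segment count is bounded per level rather than overall: 999 flushes leave 9 segments at each of three levels (27 in total), while 1000 flushes collapse into a single level 3 segment.
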
Thanks, that undoes a lot of my confusion.

As for segment creation, is it accurate to say the following? Solr will
write a new level 0 segment to disk each time an additional
ramBufferSizeMB (default=32MB) worth of data has been added to the
index. Furthermore, once that 32MB worth of data has been written to
disk, that segment's files will never be modified. (The only time a
segment will be modified is if you delete documents from it, and that
will only alter the segment's .del file, leaving .tis and friends alone.)

>> I just discovered that one of my other indexes has over 11,000 tis
>> files. That's disturbing. I'm not sure if it would have the same
>> underlying cause.
>
> That does NOT sound right. Can you provide more details how this
> index is created/maintained?

I don't know exactly what happened, but I restarted Solr once or twice,
and when I started adding documents again, Solr started deleting
segment files and brought the index down from roughly 500GB to roughly
18GB.

I feel like I read somewhere that Solr sometimes has trouble deleting
segment files when running on Windows. (I'm on Windows right now.) I
wonder if that's related.

The main thing that bugs me about this index now is that the latest
version of Luke (0.8.1) won't open it. ("Unknown format version: -6")
The Solr Luke handler works fine with it, though.
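
For anyone wanting to check an index directory for the kind of file pile-up described above (thousands of .tis files), a small script along these lines can tally files by extension. It is a sketch under assumptions: the helper name `count_index_files` and the example path are hypothetical, and it assumes the non-compound file format, where each segment has its own .tis file, so the .tis count approximates the number of segments on disk.

```python
import os
from collections import Counter


def count_index_files(index_dir):
    """Count regular files in index_dir, grouped by lowercase extension.
    Files without an extension (e.g. segments_N) are keyed by full name."""
    counts = Counter()
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            ext = os.path.splitext(name)[1].lower()
            counts[ext or name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical location; point this at your own index directory.
    index_dir = "solr/data/index"
    if os.path.isdir(index_dir):
        for ext, n in count_index_files(index_dir).most_common():
            print(ext, n)
```

A .tis count far above what mergeFactor and the number of levels would allow suggests that old segment files are not being cleaned up, rather than that merging is misconfigured.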