Hi, I can confirm similar behaviour, but for Solr 4.3.1. We use the default values for the merge-related settings. Even though mergeFactor=10 by default, there are 13 segments in one core and 30 segments in another. I am not sure this proves there is a bug in the merging, because the segment count depends on the TieredMergePolicy.

Relevant discussion from the past:
http://lucene.472066.n3.nabble.com/TieredMergePolicy-reclaimDeletesWeight-td4071487.html

Apart from the other policy parameters, you could play with reclaimDeletesWeight if you'd like to influence how segments with deletions get merged. See
http://stackoverflow.com/questions/18361300/informations-about-tieredmergepolicy
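For illustration, here is a hedged sketch of what that could look like in a Solr 4.x solrconfig.xml. The parameter name mirrors TieredMergePolicy's setReclaimDeletesWeight setter (default 2.0 in Lucene 4.x); the value 3.0 below is just an example, not a recommendation:

```xml
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <!-- weights merge selection toward segments carrying deletes;
       higher than the 2.0 default reclaims deletes more aggressively -->
  <double name="reclaimDeletesWeight">3.0</double>
</mergePolicy>
```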
Regarding your attachment: I believe it got cut by the mailing list system; could you share it via a file sharing service?

On Sat, Mar 14, 2015 at 7:36 AM, Summer Shire <shiresum...@gmail.com> wrote:
> Hi All,
>
> Did anyone get a chance to look at my config and the InfoStream file?
>
> I am very curious to see what you think
>
> thanks,
> Summer
>
> > On Mar 6, 2015, at 5:20 PM, Summer Shire <shiresum...@gmail.com> wrote:
> >
> > Hi All,
> >
> > Here's an update on where I am at with this.
> > I enabled infoStream logging and quickly figured out that I need to get rid
> > of maxBufferedDocs. So Erick, you were absolutely right on that.
> > I increased my ramBufferSizeMB to 100
> > and reduced maxMergeAtOnce to 3 and segmentsPerTier to 3 as well.
> > My config looks like this:
> >
> > <indexConfig>
> >   <useCompoundFile>false</useCompoundFile>
> >   <ramBufferSizeMB>100</ramBufferSizeMB>
> >   <!--<maxMergeSizeForForcedMerge>9223372036854775807</maxMergeSizeForForcedMerge>-->
> >   <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
> >     <int name="maxMergeAtOnce">3</int>
> >     <int name="segmentsPerTier">3</int>
> >   </mergePolicy>
> >   <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
> >   <infoStream file="/tmp/INFOSTREAM.txt">true</infoStream>
> > </indexConfig>
> >
> > I am attaching a sample infostream log file.
> > In the infoStream logs, though, you can see how the segments keep on adding
> > up, and it shows (just an example):
> >
> >   allowedSegmentCount=10 vs count=9 (eligible count=9) tooBigCount=0
> >
> > I looked at TieredMergePolicy.java to see how allowedSegmentCount is
> > getting calculated:
> >
> >   // Compute max allowed segs in the index
> >   long levelSize = minSegmentBytes;
> >   long bytesLeft = totIndexBytes;
> >   double allowedSegCount = 0;
> >   while(true) {
> >     final double segCountLevel = bytesLeft / (double) levelSize;
> >     if (segCountLevel < segsPerTier) {
> >       allowedSegCount += Math.ceil(segCountLevel);
> >       break;
> >     }
> >     allowedSegCount += segsPerTier;
> >     bytesLeft -= segsPerTier * levelSize;
> >     levelSize *= maxMergeAtOnce;
> >   }
> >   int allowedSegCountInt = (int) allowedSegCount;
> >
> > and minSegmentBytes is calculated as follows:
> >
> >   // Compute total index bytes & print details about the index
> >   long totIndexBytes = 0;
> >   long minSegmentBytes = Long.MAX_VALUE;
> >   for(SegmentInfoPerCommit info : infosSorted) {
> >     final long segBytes = size(info);
> >     if (verbose()) {
> >       String extra = merging.contains(info) ? " [merging]" : "";
> >       if (segBytes >= maxMergedSegmentBytes/2.0) {
> >         extra += " [skip: too large]";
> >       } else if (segBytes < floorSegmentBytes) {
> >         extra += " [floored]";
> >       }
> >       message("  seg=" + writer.get().segString(info) + " size=" +
> >           String.format(Locale.ROOT, "%.3f", segBytes/1024/1024.) + " MB" + extra);
> >     }
> >
> >     minSegmentBytes = Math.min(segBytes, minSegmentBytes);
> >     // Accum total byte size
> >     totIndexBytes += segBytes;
> >   }
> >
> > Any input is welcome.
> >
> > <myinfoLog.rtf>
> >
> > thanks,
> > Summer
> >
> >> On Mar 5, 2015, at 8:11 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >>
> >> I would, BTW, either just get rid of the <maxBufferedDocs> altogether or
> >> make it much higher, i.e. 100000. I don't think this is really your
> >> problem, but you're creating a lot of segments here.
> >>
> >> But I'm kind of at a loss as to what would be different about your setup.
> >> Is there _any_ chance that you have some secondary process looking at
> >> your index that's maintaining open searchers? Any custom code that's
> >> perhaps failing to close searchers? Is this a Unix or Windows system?
> >>
> >> And just to be really clear, you're _only_ seeing more segments being
> >> added, right? If you're only counting files in the index directory, it's
> >> _possible_ that merging is happening and you're just seeing new files take
> >> the place of old ones.
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Mar 4, 2015 at 7:12 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> >>> On 3/4/2015 4:12 PM, Erick Erickson wrote:
> >>>> I _think_, but don't know for sure, that the merging stuff doesn't get
> >>>> triggered until you commit; it doesn't "just happen".
> >>>>
> >>>> Shot in the dark...
> >>>
> >>> I believe that new segments are created when the indexing buffer
> >>> (ramBufferSizeMB) fills up, even without commits. I'm pretty sure that
> >>> anytime a new segment is created, the merge policy is checked to see
> >>> whether a merge is needed.
> >>>
> >>> Thanks,
> >>> Shawn

--
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info
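The allowedSegCount arithmetic Summer quoted can be replayed standalone. Below is a minimal, self-contained sketch (not Lucene itself; real Lucene also floors minSegmentBytes at floorSegmentBytes before this loop) with hypothetical segment sizes chosen so that, with maxMergeAtOnce=3 and segsPerTier=3, it reproduces an allowedSegmentCount of 10 like the infoStream line above:

```java
// Standalone re-implementation of the allowedSegCount loop quoted from
// TieredMergePolicy.java. A sketch for reasoning about the numbers only.
public class AllowedSegCountSketch {

    static int allowedSegCount(long minSegmentBytes, long totIndexBytes,
                               int segsPerTier, int maxMergeAtOnce) {
        long levelSize = minSegmentBytes;   // segment size on the current tier
        long bytesLeft = totIndexBytes;     // index bytes not yet assigned to a tier
        double allowed = 0;
        while (true) {
            double segCountLevel = bytesLeft / (double) levelSize;
            if (segCountLevel < segsPerTier) {
                // last (largest) tier is only partially full
                allowed += Math.ceil(segCountLevel);
                break;
            }
            allowed += segsPerTier;
            bytesLeft -= (long) segsPerTier * levelSize;
            levelSize *= maxMergeAtOnce;    // next tier holds bigger segments
        }
        return (int) allowed;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // hypothetical index: smallest segment 2 MB, 100 MB total,
        // with Summer's settings maxMergeAtOnce=3, segsPerTier=3
        System.out.println(allowedSegCount(2 * mb, 100 * mb, 3, 3)); // prints 10
    }
}
```

This is why a segment count above segsPerTier (or mergeFactor) is expected: each tier of exponentially larger segments is allowed segsPerTier members, so 13 or 30 segments in a core need not indicate a merge bug.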