Which Solr version do you have? In 3.x and trunk, TieredMergePolicy and BalancedSegmentMergePolicy are there for exactly this reason.
In Solr 1.4, your only trick is to do a partial optimize with maxSegments. This lets you say "optimize until there are 15 segments, then stop". Do this with smaller and smaller numbers.

On Wed, Aug 24, 2011 at 8:35 PM, Michael Ryan <mr...@moreover.com> wrote:
> I'm using Solr 3.2 with a mergeFactor of 10 and no merge policy configured,
> thus using the default LogByteSizeMergePolicy. Before I do an optimize,
> typically the largest segment will be about 90% of the total index size.
>
> When I do an optimize, the total disk space required is usually about 2x the
> index size. But about 10% of the time, the disk space required is about 3x
> the index size - when this happens, I see a very large segment created,
> roughly the size of the original index size, followed by another slightly
> larger segment.
>
> After some investigating, I found that this would happen when there were
> exactly 20 segments in the index when the optimize started. My hypothesis is
> that this is a side-effect of the 20 segments being evenly divisible by the
> mergeFactor of 10. I'm thinking that when there are 20 segments, the largest
> segment is being merged twice - first when merging the 20 segments down to 2,
> then again when merging from 2 to 1.
>
> I would like to avoid this if at all possible, as it requires 50% more disk
> space and takes almost twice as long to optimize. Would using
> TieredMergePolicy help me here, or some other config I can change?
>
> -Michael

--
Lance Norskog
goks...@gmail.com
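For reference, the staged partial optimize mentioned in the reply could be scripted roughly like this. It is only a sketch: it assumes the update handler accepts `optimize` and `maxSegments` as request parameters, the base URL is a placeholder, the target sequence (15, 10, 5, 1) is an arbitrary illustration, and the helper names are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(base_url, max_segments):
    """Build an update URL asking Solr to optimize down to max_segments."""
    query = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url.rstrip('/')}/update?{query}"


def _http_get(url):
    """Send the request and wait for Solr to respond."""
    with urlopen(url) as response:
        response.read()


def staged_optimize(base_url, targets, send=_http_get):
    """Run one partial optimize per target, largest segment count first."""
    for max_segments in sorted(targets, reverse=True):
        send(optimize_url(base_url, max_segments))


if __name__ == "__main__":
    # Hypothetical core URL and targets; adjust both for your installation.
    staged_optimize("http://localhost:8983/solr", [15, 10, 5, 1])
```

Passing a different `send` callable lets you print or log the URLs first, without touching the index, before running it for real.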