Which Solr version do you have? In 3.x and trunk, the TieredMergePolicy
and BalancedSegmentMergePolicy merge policies are there for exactly this
reason.

In Solr 1.4, your only trick is to do a partial optimize with
maxSegments. This lets you say "optimize until there are 15 segments,
then stop". Repeat this with smaller and smaller numbers until you are
down to a single segment.
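
As a rough sketch of that staged approach (not a tested recipe), the
script below builds a descending list of maxSegments targets and sends
one optimize request per target through the update handler's
optimize=true&maxSegments=N parameters. The base URL, the starting
value of 15, and the step of 5 are assumptions for illustration, and
the helper names are made up here; adjust them to your install.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def max_segments_schedule(start, step):
    """Return descending maxSegments targets, always ending at 1."""
    targets = list(range(start, 1, -step))
    targets.append(1)
    return targets


def optimize_url(base_url, max_segments):
    """Build an update-handler URL asking for a partial optimize."""
    query = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url}/update?{query}"


def staged_optimize(base_url, start=15, step=5, send=urlopen):
    """Issue one optimize request per target, largest target first.

    `send` is injectable so the loop can be dry-run without a server.
    """
    for n in max_segments_schedule(start, step):
        with send(optimize_url(base_url, n)) as response:
            response.read()  # drain the response before the next pass


if __name__ == "__main__":
    # Assumed base URL; print the requests instead of sending them.
    for n in max_segments_schedule(15, 5):
        print(optimize_url("http://localhost:8983/solr", n))
```

Calling staged_optimize("http://localhost:8983/solr") would send the
requests for real; the script as written only prints them so you can
check the sequence first.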

On Wed, Aug 24, 2011 at 8:35 PM, Michael Ryan <mr...@moreover.com> wrote:
> I'm using Solr 3.2 with a mergeFactor of 10 and no merge policy configured, 
> thus using the default LogByteSizeMergePolicy.  Before I do an optimize, 
> typically the largest segment will be about 90% of the total index size.
>
> When I do an optimize, the total disk space required is usually about 2x the 
> index size.  But about 10% of the time, the disk space required is about 3x 
> the index size - when this happens, I see a very large segment created, 
> roughly the size of the original index, followed by another slightly 
> larger segment.
>
> After some investigating, I found that this would happen when there were 
> exactly 20 segments in the index when the optimize started.  My hypothesis is 
> that this is a side-effect of the 20 segments being evenly divisible by the 
> mergeFactor of 10.  I'm thinking that when there are 20 segments, the largest 
> segment is being merged twice - first when merging the 20 segments down to 2, 
> then again when merging from 2 to 1.
>
> I would like to avoid this if at all possible, as it requires 50% more disk 
> space and takes almost twice as long to optimize.  Would using 
> TieredMergePolicy help me here, or some other config I can change?
>
> -Michael
>



-- 
Lance Norskog
goks...@gmail.com
