Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Shawn Heisey
On 6/21/2011 9:09 AM, Robert Muir wrote: the problem is that before https://issues.apache.org/jira/browse/SOLR-2567, Solr invoked the TieredMergePolicy "setters" *before* it tried to apply these 'global' mergeFactor etc params. So, even if you set them explicitly inside the <mergePolicy>, they would then get

Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Robert Muir
the problem is that before https://issues.apache.org/jira/browse/SOLR-2567, Solr invoked the TieredMergePolicy "setters" *before* it tried to apply these 'global' mergeFactor etc params. So, even if you set them explicitly inside the <mergePolicy>, they would then get clobbered by these 'global' params / defau

Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Michael McCandless
On Tue, Jun 21, 2011 at 9:42 AM, Shawn Heisey wrote:
> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>
>> For back-compat, mergeFactor maps to both of these, but it's better to
>> set them directly eg:
>>
>>
>>       10
>>       20
>>
>>
>> (and then remove your mergeFactor setting u

Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Shawn Heisey
On 6/20/2011 12:31 PM, Michael McCandless wrote: For back-compat, mergeFactor maps to both of these, but it's better to set them directly eg: 10 20 (and then remove your mergeFactor setting under indexDefaults) When I did this and ran a reindex, it merged once it rea
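The archive stripped the XML from the quoted example, leaving only the values 10 and 20. As a rough sketch of what setting the two TieredMergePolicy parameters directly might look like in solrconfig.xml: the element layout below follows the usual Solr convention for passing setter values to a merge policy, and which value belongs to which parameter is an assumption here, not something the surviving text confirms.

```xml
<!-- Sketch only: the element structure and the pairing of values with
     parameter names are assumptions; the original markup was lost. -->
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <!-- how many segments may be merged at once during normal merging -->
  <int name="maxMergeAtOnce">10</int>
  <!-- how many segments are tolerated per tier in the index -->
  <int name="segmentsPerTier">20</int>
</mergePolicy>
```

With these set explicitly, the separate mergeFactor entry under indexDefaults would be removed, as the quoted message suggests.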

Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Michael McCandless
OK that sounds like a good solution! You can also have CMS limit how many merges are allowed to run at once, if your IO system has trouble w/ that much concurrency.
Mike McCandless
http://blog.mikemccandless.com
On Mon, Jun 20, 2011 at 6:29 PM, Shawn Heisey wrote:
> On 6/20/2011 3:18 PM, Micha

Re: Optimize taking two steps and extra disk space

2011-06-20 Thread Shawn Heisey
On 6/20/2011 3:18 PM, Michael McCandless wrote: With segmentsPerTier at 35 you will easily cross 70 segs in the index... If you want optimize to run in a single merge, I would lower segmentsPerTier and mergeAtOnce (maybe back to the 10 default), and set your maxMergeAtOnceExplicit to 70 or higher.
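A minimal sketch of the suggestion above, assuming the usual solrconfig.xml convention of passing setter values as child elements of the merge policy. The values 10 and 70 come from the quoted message; the element layout is an assumption, and "mergeAtOnce" in the message is taken to mean the maxMergeAtOnce parameter.

```xml
<!-- Sketch only: layout assumed, values taken from the quoted advice. -->
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <!-- back to the default of 10 for normal background merges -->
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <!-- separate limit used for explicit merges such as optimize; setting it
       at or above the expected segment count lets optimize finish in one merge -->
  <int name="maxMergeAtOnceExplicit">70</int>
</mergePolicy>
```

Keeping the explicit limit separate means optimize can take in all segments at once while everyday merging keeps a small fan-in.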

Re: Optimize taking two steps and extra disk space

2011-06-20 Thread Michael McCandless
On Mon, Jun 20, 2011 at 4:00 PM, Shawn Heisey wrote:
> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>
>> Actually, TieredMP has two different params (different from the
>> previous default LogMP):
>>
>>   * segmentsPerTier controls how many segments you can tolerate in the
>> index (bigger nu

Re: Optimize taking two steps and extra disk space

2011-06-20 Thread Shawn Heisey
On 6/20/2011 12:31 PM, Michael McCandless wrote: Actually, TieredMP has two different params (different from the previous default LogMP):
* segmentsPerTier controls how many segments you can tolerate in the index (bigger number means more segments)
* maxMergeAtOnce says how many segments

Re: Optimize taking two steps and extra disk space

2011-06-20 Thread Michael McCandless
On Sun, Jun 19, 2011 at 12:35 PM, Shawn Heisey wrote:
> On 6/19/2011 7:32 AM, Michael McCandless wrote:
>>
>> With LogXMergePolicy (the default before 3.2), optimize respects
>> mergeFactor, so it's doing 2 steps because you have 37 segments but 35
>> mergeFactor.
>>
>> With TieredMergePolicy (def

Re: Optimize taking two steps and extra disk space

2011-06-19 Thread Shawn Heisey
On 6/19/2011 7:32 AM, Michael McCandless wrote: With LogXMergePolicy (the default before 3.2), optimize respects mergeFactor, so it's doing 2 steps because you have 37 segments but 35 mergeFactor. With TieredMergePolicy (default on 3.2 and after), there is now a separate merge factor used for op

Re: Optimize taking two steps and extra disk space

2011-06-19 Thread Michael McCandless
With LogXMergePolicy (the default before 3.2), optimize respects mergeFactor, so it's doing 2 steps because you have 37 segments but 35 mergeFactor. With TieredMergePolicy (default on 3.2 and after), there is now a separate merge factor used for optimize (maxMergeAtOnceExplicit)... so you could eg

Optimize taking two steps and extra disk space

2011-06-18 Thread Shawn Heisey
I've noticed something odd in Solr 3.2 when it does an optimize. One of my shards (freshly built via DIH full-import) had 37 segments, totalling 17.38GB of disk space. 13 of those segments were results of merges during initial import, the other 24 were untouched after creation. Starting at _