Dear Shawn,
Thanks for your reply. For now, I merged in steps using the maxSegments
parameter (HOST:PORT/CORE/update?optimize=true&maxSegments=10). First I
merged the 45 segments down to 10, and then from 10 to 5. (Merging from 5
to 2 again caused an out-of-memory exception.) I now have a 5-segment index
with all segments roughly equal in size. I will try using that and see
whether it is good enough for us.
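
For anyone who wants to script the same staged merge, here is a minimal
sketch in Python. It only issues the same update?optimize=true&maxSegments=N
request once per step. The function names, the base URL layout and the
step list are my own assumptions for illustration, not anything Solr ships.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(base_url, max_segments):
    """Build the update URL for one optimize pass down to max_segments."""
    query = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url.rstrip('/')}/update?{query}"


def staged_optimize(base_url, steps, timeout=None):
    """Send one optimize request per step, in order.

    urlopen raises on an HTTP error status, so a failed merge (for
    example, one that runs out of memory) stops the remaining steps.
    """
    for max_segments in steps:
        with urlopen(optimize_url(base_url, max_segments), timeout=timeout) as resp:
            resp.read()  # wait for the full response before the next step
        print(f"optimize to at most {max_segments} segments finished")


# Hypothetical usage; replace HOST, PORT and CORE with real values:
# staged_optimize("http://HOST:PORT/CORE", [10, 5])
```

Going down in stages keeps each individual merge smaller than a single
optimize to one segment, which is why 45 -> 10 -> 5 worked here while the
last step to 2 still did not fit in the heap.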


On Sun, Feb 9, 2014 at 11:22 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 2/9/2014 11:41 PM, Arun Rangarajan wrote:
> > I have a 28 GB Solr 4.6 index with 45 segments. Optimize failed with an
> > 'out of memory' error. Is optimize really necessary, since I read that
> > lucene is able to handle multiple segments well now?
>
> I have had indexes with more than 45 segments, because of the merge
> settings that I use.  My large index shards are about 16GB at the
> moment.  Out of memory errors are very rare because I use a fairly large
> heap, at 6GB for a machine that hosts three of these large shards.  When
> I was still experimenting with my memory settings, I did see occasional
> out of memory errors during normal segment merging.
>
> Increasing your heap size is pretty much required at this point.  I've
> condensed some very basic information about heap sizing here:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
>
> As for whether optimizing on 4.x is necessary: I do not have any hard
> numbers for you, but I can tell you that an optimized index does seem
> noticeably faster than one that is freshly built and has a large
> number of relatively large segments.
>
> I optimize my index shards on a schedule, but it is relatively
> infrequent -- one large shard per night.  Most of the time what I have
> is one really large segment and a bunch of super-small segments, and
> that does not seem to suffer from performance issues compared to a fully
> optimized index.  The situation is different right after a fresh
> rebuild, which produces a handful of very large segments and a bunch of
> smaller segments of varying sizes.
>
> Interesting but probably irrelevant details:
>
> Although I don't use mergeFactor any more, the TieredMergePolicy
> settings that I use are equivalent to a mergeFactor of 35.  I chose this
> number back in the 1.4.1 days because it resulted in synchronicity
> between merges and lucene segment names when LogByteSizeMergePolicy was
> still in use.  Segments _0 through _z would be merged into segment _10,
> and so on.
>
> Thanks,
> Shawn
>
>
