I think the key optimization when there are no deletions is that you
don't need to renumber documents and can bulk-copy blocks of contiguous
documents, and that is independent of merge policy. I think :)
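To illustrate the point, here is a toy sketch of the difference, not Lucene's actual merge code. The class `DocIdMergeSketch`, its methods, and the use of a `BitSet` for live documents (with `null` meaning "no deletions") are all hypothetical names chosen for this example. Segments are modeled as plain arrays of stored documents.

```java
import java.util.Arrays;
import java.util.BitSet;

public class DocIdMergeSketch {

    /**
     * Maps each old doc ID of one segment to its ID in the merged segment.
     * Deleted docs map to -1. A null liveDocs means the segment has no deletions.
     */
    static int[] buildDocMap(int maxDoc, BitSet liveDocs, int docBase) {
        int[] map = new int[maxDoc];
        if (liveDocs == null) {
            // No deletions: the new ID is just the old ID plus a constant offset,
            // so no per-document renumbering decision is needed.
            for (int i = 0; i < maxDoc; i++) {
                map[i] = docBase + i;
            }
            return map;
        }
        // With deletions: surviving docs are compacted, so each ID must be remapped.
        int next = docBase;
        for (int i = 0; i < maxDoc; i++) {
            map[i] = liveDocs.get(i) ? next++ : -1;
        }
        return map;
    }

    /** Merges toy segments; liveDocs[s] == null means segment s has no deletions. */
    static String[] merge(String[][] segments, BitSet[] liveDocs) {
        int total = 0;
        for (int s = 0; s < segments.length; s++) {
            total += (liveDocs[s] == null) ? segments[s].length : liveDocs[s].cardinality();
        }
        String[] merged = new String[total];
        int docBase = 0;
        for (int s = 0; s < segments.length; s++) {
            String[] seg = segments[s];
            if (liveDocs[s] == null) {
                // Bulk-copy the whole contiguous block in one call.
                System.arraycopy(seg, 0, merged, docBase, seg.length);
                docBase += seg.length;
            } else {
                // Copy document by document, skipping deleted ones.
                for (int i = 0; i < seg.length; i++) {
                    if (liveDocs[s].get(i)) {
                        merged[docBase++] = seg[i];
                    }
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        String[][] segments = { {"a", "b", "c"}, {"d", "e", "f"} };
        BitSet live = new BitSet(3);
        live.set(0);
        live.set(2); // doc 1 of the second segment ("e") is deleted
        BitSet[] liveDocs = { null, live };
        System.out.println(Arrays.toString(merge(segments, liveDocs)));
        System.out.println(Arrays.toString(buildDocMap(3, live, 3)));
    }
}
```

In this sketch the choice between the two paths depends only on whether a segment has deletions, not on which segments were selected for merging, which is the sense in which the optimization would be independent of merge policy.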
-Mike
On 01/06/2014 01:54 PM, Shawn Heisey wrote:
On 1/6/2014 11:24 AM, Otis Gospodnetic wrote:
(cross-posting to both Solr and Lucene user lists because while this is a
Lucene-level question, I suspect a lot of people who know about this or are
interested in this subject are actually on the Solr list)
I have a large append-only index and I looked at merge policies hoping to
identify one that is naturally more suitable for indices without any
updates and deletions, just adds.
I've read
http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/index/TieredMergePolicy.html
and the javadocs for its cousins, but it doesn't look like any of them is
more suited to an append-only index than the others, and Tiered MP, having
more knobs, is probably the best one to use.
I was wondering if I was missing something, if one of the MPs is in fact
better for append-only indices OR if one can suggest how one could write a
custom MP that's specialized for append-only indices.
The Tiered policy was made default for Solr back in the 3.x days.
Defaults in both Solr and Lucene don't normally change without some
serious thought about the repercussions.
As for what's best for different kinds of indexes (add-only vs
update/delete) ... unless there are *enormous* numbers of deletions
(whether from updates or pure delete requests), I don't think that
affects the decision very much. The Tiered policy seems like it's
probably the best choice either way. I assume you've seen the
following blog post?
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
Thanks,
Shawn