> The older MergePolicies followed a strategy which is quite disruptive in an 
> NRT environment.

Can you elaborate on why (maybe this belongs on a wiki)?
If large merges run in their own thread, they should not disrupt
queries, e.g., there won't be CPU contention.  The IO contention
can be disruptive, depending on the size and type of hardware; however,
in the ideal case where the index fits into RAM/IO cache, a
large merge should not affect queries (or indexing).

One thing under development that I think will help keep merges from
disrupting NRT is DirectIOLinuxDirectory:
https://issues.apache.org/jira/browse/LUCENE-2500  It's also useful
for the non-NRT use case, because any time IO cache pages are evicted,
queries will slow down (unless the index is too large to fit in RAM
anyway).
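
For reference on where the "large merges" discussed above come from, here
is a toy simulation of a level-based merge policy. It is only a sketch
under stated assumptions, not Lucene code: the function name `simulate`,
the unit segment size, and the rule "merge as soon as `merge_factor`
equal-sized segments accumulate" are all illustrative simplifications.

```python
def simulate(flushes, merge_factor=10):
    """Toy level-based merge policy (illustrative, not Lucene's implementation).

    Each flush adds one segment of size 1. Whenever the newest
    `merge_factor` segments all have the same size, they are merged into
    one segment, and that merge may cascade into the next level.
    Returns the final segment sizes and the merge work done per flush.
    """
    segments = []        # segment sizes, oldest first
    work_per_flush = []  # total size of segments rewritten after each flush
    for _ in range(flushes):
        segments.append(1)
        work = 0
        while True:
            tail = segments[-merge_factor:]
            if len(tail) == merge_factor and len(set(tail)) == 1:
                merged = sum(tail)
                del segments[-merge_factor:]
                segments.append(merged)
                work += merged  # a merge rewrites every document it touches
            else:
                break
        work_per_flush.append(work)
    return segments, work_per_flush


if __name__ == "__main__":
    segments, work = simulate(100, merge_factor=10)
    print(segments)    # [100]
    print(work[9])     # 10
    print(work[99])    # 110
```

In this model most flushes trigger no merge at all, every tenth flush
triggers a small one, and the hundredth triggers a cascade that rewrites
the whole index. Whether that spike hurts queries then comes down to the
CPU and IO contention questions raised above.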

On Sat, Jan 8, 2011 at 7:55 PM, Lance Norskog <goks...@gmail.com> wrote:
> There are always slowdowns when merging new segments during indexing.
> A MergePolicy decides when to merge segments.  The older MergePolicies
> followed a strategy which is quite disruptive in an NRT environment.
>
> There is a new feature in 3.x & the trunk called
> 'BalancedSegmentMergePolicy'. This new MergePolicy is designed for the
> near-real-time use case. It was contributed by LinkedIn. You may find
> it works well enough for your case.
>
> Lance
>
> On Thu, Jan 6, 2011 at 10:21 AM, Stephen Boesch <java...@gmail.com> wrote:
>> Thanks Yonik,
>>  Using a stable release of Solr what would you suggest to do - given
>> MultiSearch's demise and the other work is still ongoing?
>>
>> 2011/1/6 Yonik Seeley <yo...@lucidimagination.com>
>>
>>> On Thu, Jan 6, 2011 at 12:37 PM, Stephen Boesch <java...@gmail.com> wrote:
>>> > Solr/lucene newbie here ..
>>> >
>>> > We would like searches against a solr/lucene index to immediately be
>>> > able to view data that was added.  I stress "small" amount of new data
>>> > given that any significant amount would require excessive latency.
>>>
>>> There has been significant ongoing work in lucene-core for NRT (near real
>>> time).
>>> We need to overhaul Solr's DirectUpdateHandler2 to take advantage of
>>> all this work.
>>> Mark Miller took a first crack at it (sharing a single IndexWriter,
>>> letting lucene handle the concurrency issues, etc)
>>> but if there's a JIRA issue, I'm having trouble finding it.
>>>
>>> > Looking around, i'm wondering if the direction would be a MultiSearcher
>>> > living on top of our standard directory-based IndexReader as well as a
>>> > custom Searchable that handles the newest documents - and then combines
>>> > the two results?
>>>
>>> If you look at trunk, MultiSearcher has already gone away.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>
