How long are your GC pauses? GC pauses affect every in-flight query, so they push queries that should be fast out into the 99th percentile.
The G1 collector has helped our 99th percentile.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Aug 3, 2017, at 8:48 AM, David Hastings <hastings.recurs...@gmail.com> wrote:
>
> Thanks, that's what I kind of expected. I'm still debating whether the space
> increase is worth it. Right now 0.7% of searches take longer than 10 seconds
> and 6% take longer than 1, so when I see things like this in the morning it
> bugs me a bit:
>
> 2017-08-02 11:50:48 : 58979/1000 secs : ("Rules of Practice for the Courts of Equity of the United States")
> 2017-08-02 02:16:36 : 54749/1000 secs : ("The American Cause")
> 2017-08-02 19:27:58 : 54561/1000 secs : ("register of the department of justice")
>
> All of these could be eliminated with CommonGrams, at the expense, according
> to HT, of a 40% increase in index size.
>
> On Thu, Aug 3, 2017 at 11:21 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> bq: will that search still return results from the earlier documents
>> as well as the new ones
>>
>> In a word, "no". By definition, the analysis chain applied at index
>> time puts tokens in the index, and that's all you have to search
>> against for the doc unless and until you re-index the document.
>>
>> You really have two choices here:
>> 1> live with the differing results until you get done re-indexing
>> 2> index to an offline collection and then use, say, collection
>> aliasing to make the switch atomically.
>>
>> Best,
>> Erick
>>
>> On Thu, Aug 3, 2017 at 8:07 AM, David Hastings
>> <hastings.recurs...@gmail.com> wrote:
>>> Hey all, I have yet to run an experiment to test this, but I was
>>> wondering if anyone knows the answer ahead of time.
>>> If I have an index built with documents from before implementing the
>>> CommonGrams filter, then enable it and start adding documents that have
>>> the filter/tokenizer applied, will searches that fit the criteria, for
>>> example:
>>>
>>> "to be or not to be"
>>>
>>> still return results from the earlier documents as well as the new
>>> ones? The idea is that a full re-index is going to be difficult, so I
>>> would rather do it over time by replacing large numbers of documents
>>> incrementally.
>>>
>>> Thanks,
>>> Dave
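For reference, enabling CommonGrams means adding the filter to both the index-time and query-time analyzers of the field type. A minimal sketch of such a fieldType follows; the type name and the words file are illustrative assumptions, not taken from the thread (Solr ships solr.CommonGramsFilterFactory for indexing and solr.CommonGramsQueryFilterFactory for the query side):

```xml
<!-- Illustrative sketch: "text_commongrams" and "stopwords.txt" are assumed names. -->
<fieldType name="text_commongrams" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits word grams combining common words with their neighbors -->
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Query-side variant keeps only the grams, so queries match the index -->
    <filter class="solr.CommonGramsQueryFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

As Erick's reply implies, documents indexed before this change lack the gram tokens, so phrase queries rewritten into grams will not match them until they are re-indexed.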
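Erick's second option, swapping in a freshly re-indexed collection via an alias, uses the Collections API CREATEALIAS action. A minimal sketch of building that request, assuming hypothetical names ("books" for the alias clients query, "books_v2" for the re-indexed collection) and the default local Solr URL:

```python
from urllib.parse import urlencode

def createalias_url(solr_base, alias, collection):
    """Build the Collections API URL that (re)points `alias` at `collection`.

    CREATEALIAS is atomic from the client's point of view: queries against
    the alias switch to the new collection in one step.
    """
    params = urlencode({
        "action": "CREATEALIAS",
        "name": alias,            # the alias applications query
        "collections": collection # the collection the alias should point at
    })
    return f"{solr_base}/admin/collections?{params}"

# Hypothetical example: repoint "books" at the re-indexed "books_v2".
url = createalias_url("http://localhost:8983/solr", "books", "books_v2")
print(url)
# → http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=books&collections=books_v2
```

An HTTP GET to that URL (with curl or any HTTP client) performs the swap; clients querying the alias never see the half-re-indexed state.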