RE: Delayed/waiting requests
Hi Erick, Thank you for your detailed answer, I understand autowarming better now.

We have an autowarming time of ~10s for filterCache (queryResultCache is not used at all, ratio = 0.02). We increased the size of the filterCache from 6k to 12k (and the autowarming size was set to the same value) to have a better ratio, which is _only_ around 0.85/0.90.

The thing I don't understand is that I should see "Opening new searcher" in the logs every time a new searcher is opened and thus an autowarming happens, right? But I don't see "Opening new searcher" very often, and I don't see it being correlated with the response time peaks.

Also, I didn't mention it earlier, but we have other SolrCloud clusters with similar settings and load (~10s filterCache autowarming, 10k entries) and we don't observe the same behavior.

Regards,

De : Erick Erickson Envoyé : lundi 14 janvier 2019 17:44:38 À : solr-user Objet : Re: Delayed/waiting requests

Gael: bq. Nevertheless, our filterCache is set to autowarm 12k entries which is also the maxSize

That is far, far, far too many. Let's assume you actually have 12K entries in the filterCache. Every time you open a new searcher, 12K queries are executed _before_ the searcher accepts any new requests. While being able to re-use a filterCache entry is useful, one of the primary purposes is to pre-load index data from disk into memory which can be the event that takes the most time.

The queryResultCache has a similar function. I often find that this cache doesn't have a very high hit ratio, but again executing a _few_ of these queries warms the index from disk.

I think of both caches as a map, where the key is the "thing", (fq clause in the case of filterCache, the whole query in the case of the queryResultCache). Autowarming replays the most recently executed N of these entries, essentially just as though they were submitted by a user.

Hypothesis: You're massively over-warming, and when that kicks in you're seeing increased CPU and GC pressure leading to the anomalies you're seeing. Further, you have such excessive autowarming going on that it's hard to see the associated messages in the log.

Here's what I'd recommend: Set your autowarm counts to something on the order of 16. If the culprit is just excessive autowarming, I'd expect your spikes to be much less severe. It _might_ be that your users see some increased (very temporary) variance in response time. You can tell that the autowarming configurations are "more art than science", I can't give you any other recommendations than "start small and increase until you're happy" unfortunately.

I usually do this with some kind of load tester in a dev lab of course ;).

Finally, if you use the metrics data (see: https://lucene.apache.org/solr/guide/7_1/metrics-reporting.html) you can see the autowarm times. Don't get too lost in the page to start, just hit the "http://localhost:8983/solr/admin/metrics" endpoint and look for "warmupTime", then refine on how to get _only_ the warmup stats ;).

Best, Erick

On Mon, Jan 14, 2019 at 5:08 AM Gael Jourdan-Weil wrote: > > I had a look to GC logs this morning but I'm not sure how to interpret them. > > > Over a period of 54mn, there is: > > - Number of pauses: 2739 > > - Accumulated pauses: 93s => that is 2.86% of the time > > - Average pause duration: 0.03s > > - Average pause interval: 1.18s > > - Accumulated full GC: 0 > > I'm not sure if this is a lot or not. What do you think ? 
> > > Looking more closely to GC logs with GC Viewer, I can notice that the high > response time peaks happens at the same time where GC pauses takes 2x more > time (around 0.06s) than average. > > > Also we are indeed indexing at the same time but we have autowarming set. > > I don't see any Searcher opened at the time we experience slowness. > > Nevertheless, our filterCache is set to autowarm 12k entries which is also > the maxSize. > > Could this have any downside? > > > Thanks, > > Gaël > > > > De : Erick Erickson > Envoyé : vendredi 11 janvier 2019 17:21 > À : solr-user > Objet : Re: Delayed/waiting requests > > Jimi's comment is one of the very common culprits. > > Autowarming is another. Are you indexing at the same > time? If so it could well be you aren't autowarming and > the spikes are caused by using a new IndexSearcher > that has to read much of the index off disk when commits > happen. The "smoking gun" here would be if the spikes > correlate to your commits (soft or hard-with-opensearcher-true). > > Best, > Erick > > On Fri, Jan 11, 2019 at 1:23 AM Gael Jourdan-Weil > wrote: > > > > Interesting indeed, we did not see anything with VisualVM but having a look > > at the GC logs could gives us more info, especially on the pauses. > > > > I will collect data over the week-end and look at it. > > > > > > Thanks > > > > > > De : Hullegård, Jimi > > Envoyé : vendredi 11 janvier 2019 03:46:02 > > À :
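For context, the cache being discussed lives in solrconfig.xml. A minimal sketch of such an entry, with illustrative values only (solr.FastLRUCache is the usual filterCache class in the 7.x example configs; size="12000" and autowarmCount="16" simply mirror the cache size and the autowarm count suggested in this thread, not a recommendation):

  <filterCache class="solr.FastLRUCache"
               size="12000"
               initialSize="512"
               autowarmCount="16"/>

The warmup times Erick mentions can be pulled from the metrics API, along the lines of the following (the exact metric name may differ between versions):

  curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=SEARCHER.searcher.warmupTime"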
join query and new searcher on joined collection
Solr 6.3 I have a query like this: q=*:*{!join score=none from=id fromIndex=hss_4 to=rpk_hdquotes v=$qq}*:* -- Vadim
RE: join query and new searcher on joined collection
Sorry, I sent an unfinished message. So, the query on collection1 is:

q=*:*{!join score=none from=id fromIndex=collection2 to=field1}*:*

The question is: what happens with autowarming and new searchers on collection1 when a new searcher starts on collection2? IMHO when a request with a join comes in, it's impossible to use the caches on collection1 and ... Does a new searcher start on collection1 as well?

> -Original Message- > From: Vadim Ivanov [mailto:vadim.iva...@spb.ntk-intourist.ru] > Sent: Tuesday, January 15, 2019 1:00 PM > To: solr-user@lucene.apache.org > Subject: join query and new searcher on joined collection > > Solr 6.3 > > > > I have a query like this: > > q=*:*{!join score=none from=id fromIndex=hss_4 to=rpk_hdquotes v=$qq}*:* > > > > -- > > Vadim > >
Re: "no servers hosting shard" when querying during shard creation
On 13/01/2019 19:43, Erick Erickson wrote: > Yeah, that seems wrong, I'd say open a JIRA. I've created a bug in Jira: SOLR-13136. Should I assign this to anyone? Unsure what the procedure is there. Incidentally, while doing so I noticed that 7.6 is still "unreleased" according to Jira. Thanks, - Bram
Re: DateRangeField requires month?
I did some testing by tweaking DateRangeFieldTest and witness that 2000-11T13 is parsed as 2000-11-13 see https://github.com/apache/lucene-solr/blob/f083473b891e596def2877b5429fcfa6db175464/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/tree/DateRangePrefixTree.java#L462 Don't know what to do with it... At least I'm going to update the doc. On Mon, Jan 14, 2019 at 4:42 PM Jeremy Smith wrote: > Hi Mikhail, thanks for the response. I'm probably missing something, but > what makes 2000-11T13 contiguous and 2000T13 not contiguous? They seem > pretty similar to me, but only the former is supported. > > > Thanks, > > Jeremy > > > From: Mikhail Khludnev > Sent: Sunday, January 13, 2019 12:59:31 AM > To: solr-user > Subject: Re: DateRangeField requires month? > > Hello, Jeremy. > > See below. > > On Mon, Jan 7, 2019 at 5:09 PM Jeremy Smith wrote: > > > Hello, > > > > I am trying to use the DateRangeField and ran into an interesting > > issue. According to the documentation ( > > https://lucene.apache.org/solr/guide/7_6/working-with-dates.html), these > > are both valid for the DateRangeField: 2000-11 and 2000-11T13. I can > > confirm this is working in 7.6. I would also expect to be able to use > > 2000T13, which would mean any time in the year 2000 between 1300 and > 1400. > > > Nope. This is not a range, but multiple ranges. DateRangeField supports > contiguous ranges only. > > > > However, I get an error when trying to insert this value: > > > > > > "error":{"metadata": > > > > > > > ["error-class","org.apache.solr.common.SolrException","root-error-class","java.lang.NumberFormatException"], > > > > "msg":"ERROR: Error adding field 'dtRange'='2000T13' msg=Couldn't > > parse date because: Improperly formatted date: 2000T13","code":400 > > > > } > > > > > > I am using 7.6 with a super simple schema containing only _version_ and a > > DateRangeField and there's nothing special in my solrconfig.xml. Is this > > behavior expected? Should I open a jira issue? > > > > > > Thanks, > > > > Jeremy > > > > > -- > Sincerely yours > Mikhail Khludnev > -- Sincerely yours Mikhail Khludnev
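For anyone who wants to reproduce this outside of DateRangeFieldTest, a minimal sketch against a scratch core (the core name, field name and ids below are made up for illustration):

  In the schema:
    <fieldType name="dateRange" class="solr.DateRangeField"/>
    <field name="dtRange" type="dateRange" indexed="true" stored="true"/>

  Indexing the two documented forms plus the surprising one:
    curl -H 'Content-Type: application/json' \
      'http://localhost:8983/solr/testcore/update?commit=true' \
      -d '[{"id":"1","dtRange":"2000-11"},{"id":"2","dtRange":"2000-11T13"}]'

As noted above, the second value is accepted but ends up parsed as 2000-11-13, while "2000T13" is rejected with "Improperly formatted date".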
Re: DateRangeField requires month?
Follow up https://issues.apache.org/jira/browse/SOLR-13139 On Tue, Jan 15, 2019 at 2:46 PM Mikhail Khludnev wrote: > I did some testing by tweaking DateRangeFieldTest and witness that > 2000-11T13 is parsed as 2000-11-13 see > > https://github.com/apache/lucene-solr/blob/f083473b891e596def2877b5429fcfa6db175464/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/tree/DateRangePrefixTree.java#L462 > Don't know what to do with it... At least I'm going to update the doc. > > On Mon, Jan 14, 2019 at 4:42 PM Jeremy Smith wrote: > >> Hi Mikhail, thanks for the response. I'm probably missing something, but >> what makes 2000-11T13 contiguous and 2000T13 not contiguous? They seem >> pretty similar to me, but only the former is supported. >> >> >> Thanks, >> >> Jeremy >> >> >> From: Mikhail Khludnev >> Sent: Sunday, January 13, 2019 12:59:31 AM >> To: solr-user >> Subject: Re: DateRangeField requires month? >> >> Hello, Jeremy. >> >> See below. >> >> On Mon, Jan 7, 2019 at 5:09 PM Jeremy Smith wrote: >> >> > Hello, >> > >> > I am trying to use the DateRangeField and ran into an interesting >> > issue. According to the documentation ( >> > https://lucene.apache.org/solr/guide/7_6/working-with-dates.html), >> these >> > are both valid for the DateRangeField: 2000-11 and 2000-11T13. I can >> > confirm this is working in 7.6. I would also expect to be able to use >> > 2000T13, which would mean any time in the year 2000 between 1300 and >> 1400. >> >> >> Nope. This is not a range, but multiple ranges. DateRangeField supports >> contiguous ranges only. >> >> >> > However, I get an error when trying to insert this value: >> > >> > >> > "error":{"metadata": >> > >> > >> > >> ["error-class","org.apache.solr.common.SolrException","root-error-class","java.lang.NumberFormatException"], >> > >> > "msg":"ERROR: Error adding field 'dtRange'='2000T13' msg=Couldn't >> > parse date because: Improperly formatted date: 2000T13","code":400 >> > >> > } >> > >> > >> > I am using 7.6 with a super simple schema containing only _version_ and >> a >> > DateRangeField and there's nothing special in my solrconfig.xml. Is >> this >> > behavior expected? Should I open a jira issue? >> > >> > >> > Thanks, >> > >> > Jeremy >> > >> >> >> -- >> Sincerely yours >> Mikhail Khludnev >> > > > -- > Sincerely yours > Mikhail Khludnev > -- Sincerely yours Mikhail Khludnev
Re: join query and new searcher on joined collection
collection1 has no idea about new searcher in collection2. On Tue, Jan 15, 2019 at 1:18 PM Vadim Ivanov < vadim.iva...@spb.ntk-intourist.ru> wrote: > Sory, I've sent unfinished message > So, query on collection1 > q=*:*{!join score=none from=id fromIndex=collection2 to=field1}*:* > > The question is what happened with autowarming and new searchers on > collection1 when new searcher starts on collection2? > IMHO when request with join comes it's impossible to use caches on > collection1 and ... > Does new searcher starts on collection1 as well? > > > > -Original Message- > > From: Vadim Ivanov [mailto:vadim.iva...@spb.ntk-intourist.ru] > > Sent: Tuesday, January 15, 2019 1:00 PM > > To: solr-user@lucene.apache.org > > Subject: join query and new searcher on joined collection > > > > Solr 6.3 > > > > > > > > I have a query like this: > > > > q=*:*{!join score=none from=id fromIndex=hss_4 to=rpk_hdquotes v=$qq}*:* > > > > > > > > -- > > > > Vadim > > > > > > > -- Sincerely yours Mikhail Khludnev
Re: DateRangeField requires month?
Thanks Mikhail, I think the change you proposed to the documentation will be helpful to avoid this confusion. From: Mikhail Khludnev Sent: Tuesday, January 15, 2019 8:47:17 AM To: solr-user Subject: Re: DateRangeField requires month? Follow up https://issues.apache.org/jira/browse/SOLR-13139 On Tue, Jan 15, 2019 at 2:46 PM Mikhail Khludnev wrote: > I did some testing by tweaking DateRangeFieldTest and witness that > 2000-11T13 is parsed as 2000-11-13 see > > https://github.com/apache/lucene-solr/blob/f083473b891e596def2877b5429fcfa6db175464/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/tree/DateRangePrefixTree.java#L462 > Don't know what to do with it... At least I'm going to update the doc. > > On Mon, Jan 14, 2019 at 4:42 PM Jeremy Smith wrote: > >> Hi Mikhail, thanks for the response. I'm probably missing something, but >> what makes 2000-11T13 contiguous and 2000T13 not contiguous? They seem >> pretty similar to me, but only the former is supported. >> >> >> Thanks, >> >> Jeremy >> >> >> From: Mikhail Khludnev >> Sent: Sunday, January 13, 2019 12:59:31 AM >> To: solr-user >> Subject: Re: DateRangeField requires month? >> >> Hello, Jeremy. >> >> See below. >> >> On Mon, Jan 7, 2019 at 5:09 PM Jeremy Smith wrote: >> >> > Hello, >> > >> > I am trying to use the DateRangeField and ran into an interesting >> > issue. According to the documentation ( >> > https://lucene.apache.org/solr/guide/7_6/working-with-dates.html), >> these >> > are both valid for the DateRangeField: 2000-11 and 2000-11T13. I can >> > confirm this is working in 7.6. I would also expect to be able to use >> > 2000T13, which would mean any time in the year 2000 between 1300 and >> 1400. >> >> >> Nope. This is not a range, but multiple ranges. DateRangeField supports >> contiguous ranges only. >> >> >> > However, I get an error when trying to insert this value: >> > >> > >> > "error":{"metadata": >> > >> > >> > >> ["error-class","org.apache.solr.common.SolrException","root-error-class","java.lang.NumberFormatException"], >> > >> > "msg":"ERROR: Error adding field 'dtRange'='2000T13' msg=Couldn't >> > parse date because: Improperly formatted date: 2000T13","code":400 >> > >> > } >> > >> > >> > I am using 7.6 with a super simple schema containing only _version_ and >> a >> > DateRangeField and there's nothing special in my solrconfig.xml. Is >> this >> > behavior expected? Should I open a jira issue? >> > >> > >> > Thanks, >> > >> > Jeremy >> > >> >> >> -- >> Sincerely yours >> Mikhail Khludnev >> > > > -- > Sincerely yours > Mikhail Khludnev > -- Sincerely yours Mikhail Khludnev
RE: join query and new searcher on joined collection
Thanks, Mikhail, for the reply.

> collection1 has no idea about new searcher in collection2.

I suspected it. :)

So, when a "join" query arrives, the searcher on collection1 has no chance to use the filter cache entries stored before. I suppose it invalidates the filter cache, am I right?

&fq={!join score=none from=id fromIndex=collection2 to=field1}*:*

> On Tue, Jan 15, 2019 at 1:18 PM Vadim Ivanov < > vadim.iva...@spb.ntk-intourist.ru> wrote: > > > Sory, I've sent unfinished message > > So, query on collection1 > > q=*:*{!join score=none from=id fromIndex=collection2 to=field1}*:* > > > > The question is what happened with autowarming and new searchers on > > collection1 when new searcher starts on collection2? > > IMHO when request with join comes it's impossible to use caches on > > collection1 and ... > > Does new searcher starts on collection1 as well? > > > > > > > -Original Message- > > > From: Vadim Ivanov [mailto:vadim.iva...@spb.ntk-intourist.ru] > > > Sent: Tuesday, January 15, 2019 1:00 PM > > > To: solr-user@lucene.apache.org > > > Subject: join query and new searcher on joined collection > > > > > > Solr 6.3 > > > > > > > > > > > > I have a query like this: > > > > > > q=*:*{!join score=none from=id fromIndex=hss_4 to=rpk_hdquotes > v=$qq}*:* > > > > > > > > > > > > -- > > > > > > Vadim > > > > > > > > > > > > > > -- > Sincerely yours > Mikhail Khludnev
Re: join query and new searcher on joined collection
It doesn't invalidate anything. The new join query just doesn't match the entry cached for the older collection2 searcher; see https://github.com/apache/lucene-solr/blob/b7f99fe55a6fb6e7b38828676750b3512d6899a1/solr/core/src/java/org/apache/solr/search/JoinQParserPlugin.java#L570

So, after a commit on collection2, the following join on collection1 just won't hit the filter cache; it will be cached as a new entry, and the old entry will eventually be evicted.

On Tue, Jan 15, 2019 at 5:30 PM Vadim Ivanov < vadim.iva...@spb.ntk-intourist.ru> wrote: > Thanx, Mikhail for reply > > collection1 has no idea about new searcher in collection2. > I suspected it. :) > > So, when "join" query arrives searcher on collection1 has no chance to use > filter cache, stored before. > I suppose it invalidates filter cache, am I right? > > &fq={!join score=none from=id fromIndex=collection2 to=field1}*:* > > > On Tue, Jan 15, 2019 at 1:18 PM Vadim Ivanov < > > vadim.iva...@spb.ntk-intourist.ru> wrote: > > > > > Sory, I've sent unfinished message > > > So, query on collection1 > > > q=*:*{!join score=none from=id fromIndex=collection2 to=field1}*:* > > > > > > The question is what happened with autowarming and new searchers on > > > collection1 when new searcher starts on collection2? > > > IMHO when request with join comes it's impossible to use caches on > > > collection1 and ... > > > Does new searcher starts on collection1 as well? > > > > > > > > > > -Original Message- > > > > From: Vadim Ivanov [mailto:vadim.iva...@spb.ntk-intourist.ru] > > > > Sent: Tuesday, January 15, 2019 1:00 PM > > > > To: solr-user@lucene.apache.org > > > > Subject: join query and new searcher on joined collection > > > > > > > > Solr 6.3 > > > > > > > > > > > > > > > > I have a query like this: > > > > > > > > q=*:*{!join score=none from=id fromIndex=hss_4 to=rpk_hdquotes > > v=$qq}*:* > > > > > > > > > > > > > > > > -- > > > > > > > > Vadim > > > > > > > > > > > > > > > > > > > > > -- > > Sincerely yours > > Mikhail Khludnev > > -- Sincerely yours Mikhail Khludnev
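A quick way to observe this from the outside, if anyone wants to check (the cache statistic names below come from the admin UI / MBeans stats and may differ slightly by version): send the same join filter twice against collection1, e.g.

  &fq={!join score=none from=id fromIndex=collection2 to=field1}*:*

and watch cumulative_lookups / cumulative_hits / cumulative_inserts for the filterCache on collection1. The second request should be a hit; after a commit opens a new searcher on collection2, the same fq shows up as a lookup plus a fresh insert rather than a hit, exactly as described above.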
Re: Delayed/waiting requests
Well, it was a nice theory anyway. "Other collections with the same settings" doesn't really mean much unless those other collections are very similar, especially in terms of numbers of docs. You should only see a new searcher opening when you do a hard-commit-with-opensearcher-true or soft commit. So what happens when you just try lowering the autowarm count? I'm assuming you're free to test in some non-prod system. Focusing on the hit ratio is something of a red herring. Remember that each entry in your filterCache is roughly maxDoc/8 + a little overhead, the increase in GC pressure has to be balanced against getting the hits from the cache. Now, all that said if there's no correlation, then you need to put a profiler on the system when you see this kind of thing and find out where the hotspots are, otherwise it's guesswork and I'm out of ideas. Best, Erick On Tue, Jan 15, 2019 at 12:06 AM Gael Jourdan-Weil wrote: > > Hi Erick, > > > Thank you for your detailed answer, I better understand autowarming. > > > We have an autowarming time of ~10s for filterCache (queryResultCache is not > used at all, ratio = 0.02). > > We increased the size of the filterCache from 6k to 12k (and autowarming size > set to same values) to have a better ratio which is _only_ around 0.85/0.90. > > > The thing I don't understand is I should see "Opening new searcher" in the > logs everytime a new searcher is opened and thus an autowarming happens, > right? > > But I don't see "Opening new searcher" very often, and I don't see it being > correlated with the response time peaks. > > > Also, I didn't mention it earlier but, we have other SolrCloud clusters with > similar settings and load (~10s filterCache autowarming, 10k entries) and we > don't observe the same behavior. > > > Regards, > > > De : Erick Erickson > Envoyé : lundi 14 janvier 2019 17:44:38 > À : solr-user > Objet : Re: Delayed/waiting requests > > Gael: > > bq. Nevertheless, our filterCache is set to autowarm 12k entries which > is also the maxSize > > That is far, far, far too many. Let's assume you actually have 12K > entries in the filterCache. > Every time you open a new searcher, 12K queries are executed _before_ > the searcher > accepts any new requests. While being able to re-use a filterCache > entry is useful, one of > the primary purposes is to pre-load index data from disk into memory > which can be > the event that takes the most time. > > The queryResultCache has a similar function. I often find that this > cache doesn't have a > very high hit ratio, but again executing a _few_ of these queries > warms the index from > disk. > > I think of both caches as a map, where the key is the "thing", (fq > clause in the case > of filterCache, the whole query in the case of the queryResultCache). > Autowarming > replays the most recently executed N of these entries, essentially > just as though > they were submitted by a user. > > Hypothesis: You're massively over-warming, and when that kicks in you're > seeing > increased CPU and GC pressure leading to the anomalies you're seeing. Further, > you have such excessive autowarming going on that it's hard to see the > associated messages in the log. > > Here's what I'd recommend: Set your autowarm counts to something on the order > of 16. If the culprit is just excessive autowarming, I'd expect your spikes to > be much less severe. It _might_ be that your users see some increased (very > temporary) variance in response time. 
You can tell that the autowarming > configurations are "more art than science", I can't give you any other > recommendations than "start small and increase until you're happy" > unfortunately. > > I usually do this with some kind of load tester in a dev lab of course ;). > > Finally, if you use the metrics data (see: > https://lucene.apache.org/solr/guide/7_1/metrics-reporting.html) > you can see the autowarm times. Don't get too lost in the page to > start, just hit the "http://localhost:8983/solr/admin/metrics"; endpoint > and look for "warmupTime", then refine on how to get _only_ > the warmup stats ;). > > Best, > Erick > > On Mon, Jan 14, 2019 at 5:08 AM Gael Jourdan-Weil > wrote: > > > > I had a look to GC logs this morning but I'm not sure how to interpret them. > > > > > > Over a period of 54mn, there is: > > > > - Number of pauses: 2739 > > > > - Accumulated pauses: 93s => that is 2.86% of the time > > > > - Average pause duration: 0.03s > > > > - Average pause interval: 1.18s > > > > - Accumulated full GC: 0 > > > > I'm not sure if this is a lot or not. What do you think ? > > > > > > Looking more closely to GC logs with GC Viewer, I can notice that the high > > response time peaks happens at the same time where GC pauses takes 2x more > > time (around 0.06s) than average. > > > > > > Also we are indeed indexing at the same time but we have autowarming set. > > > > I don't see any Searcher opened at the time we experience slowness. > >
Re: Re: Delayed/waiting requests
Hi Gael – Could you share this information? Size of the index Server memory available Server CPU count JVM memory settings You mentioned a cloud configuration of 3 replicas. Does that mean you have 1 shard with a replication factor of 3? Do the pauses occur on all 3 servers? Is the traffic evenly balanced across those servers? Jeremy Branham jb...@allstate.com On 1/15/19, 9:50 AM, "Erick Erickson" wrote: Well, it was a nice theory anyway. "Other collections with the same settings" doesn't really mean much unless those other collections are very similar, especially in terms of numbers of docs. You should only see a new searcher opening when you do a hard-commit-with-opensearcher-true or soft commit. So what happens when you just try lowering the autowarm count? I'm assuming you're free to test in some non-prod system. Focusing on the hit ratio is something of a red herring. Remember that each entry in your filterCache is roughly maxDoc/8 + a little overhead, the increase in GC pressure has to be balanced against getting the hits from the cache. Now, all that said if there's no correlation, then you need to put a profiler on the system when you see this kind of thing and find out where the hotspots are, otherwise it's guesswork and I'm out of ideas. Best, Erick On Tue, Jan 15, 2019 at 12:06 AM Gael Jourdan-Weil wrote: > > Hi Erick, > > > Thank you for your detailed answer, I better understand autowarming. > > > We have an autowarming time of ~10s for filterCache (queryResultCache is not used at all, ratio = 0.02). > > We increased the size of the filterCache from 6k to 12k (and autowarming size set to same values) to have a better ratio which is _only_ around 0.85/0.90. > > > The thing I don't understand is I should see "Opening new searcher" in the logs everytime a new searcher is opened and thus an autowarming happens, right? > > But I don't see "Opening new searcher" very often, and I don't see it being correlated with the response time peaks. > > > Also, I didn't mention it earlier but, we have other SolrCloud clusters with similar settings and load (~10s filterCache autowarming, 10k entries) and we don't observe the same behavior. > > > Regards, > > > De : Erick Erickson > Envoyé : lundi 14 janvier 2019 17:44:38 > À : solr-user > Objet : Re: Delayed/waiting requests > > Gael: > > bq. Nevertheless, our filterCache is set to autowarm 12k entries which > is also the maxSize > > That is far, far, far too many. Let's assume you actually have 12K > entries in the filterCache. > Every time you open a new searcher, 12K queries are executed _before_ > the searcher > accepts any new requests. While being able to re-use a filterCache > entry is useful, one of > the primary purposes is to pre-load index data from disk into memory > which can be > the event that takes the most time. > > The queryResultCache has a similar function. I often find that this > cache doesn't have a > very high hit ratio, but again executing a _few_ of these queries > warms the index from > disk. > > I think of both caches as a map, where the key is the "thing", (fq > clause in the case > of filterCache, the whole query in the case of the queryResultCache). > Autowarming > replays the most recently executed N of these entries, essentially > just as though > they were submitted by a user. > > Hypothesis: You're massively over-warming, and when that kicks in you're seeing > increased CPU and GC pressure leading to the anomalies you're seeing. 
Further, > you have such excessive autowarming going on that it's hard to see the > associated messages in the log. > > Here's what I'd recommend: Set your autowarm counts to something on the order > of 16. If the culprit is just excessive autowarming, I'd expect your spikes to > be much less severe. It _might_ be that your users see some increased (very > temporary) variance in response time. You can tell that the autowarming > configurations are "more art than science", I can't give you any other > recommendations than "start small and increase until you're happy" > unfortunately. > > I usually do this with some kind of load tester in a dev lab of course ;). > > Finally, if you use the metrics data (see: > https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_solr_guide_7-5F1_metrics-2Dreporting.html&d=DwIFaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=h6jTb9n4NnmdKzYWrvtmR4Hx9AKJvlxPH538vyXpE30&s=9BWTVr32mplsfAWQ3hnWuVx5V1cL_RgLNDDpg8S2mtk&e=) > you can see the autowarm times. Don't get
RE: Re: Delayed/waiting requests
@Erick: We will try to lower the autowarm and run some tests to compare. If I get your point, having a big cache might cause more troubles than help if the cache hit ratio is not high enough because the cache is constantly evicting/inserting entries? @Jeremy: Index size: ~20G and ~14M documents Server memory available: 256G from which ~30G used and ~100G system cache Server CPU count: 32, ~10% usage JVM memory settings: -Xms12G -Xmx12G We have 3 servers and 3 clusters of 3 Solr instances. That is each server hosts 1 Solr instance for each cluster. And, indeed, each cluster only has 1 shard with replication factor 3. Among all these Solr instances, the pauses are observed on only one single cluster but on every server at different times (sometimes on all servers at the same time but I would say it's very rare). We do observe the traffic is evenly balanced across the 3 servers, around 30-40 queries per second sent to each server. Regards, Gaël De : Branham, Jeremy (Experis) Envoyé : mardi 15 janvier 2019 17:59:56 À : solr-user@lucene.apache.org Objet : Re: Re: Delayed/waiting requests Hi Gael – Could you share this information? Size of the index Server memory available Server CPU count JVM memory settings You mentioned a cloud configuration of 3 replicas. Does that mean you have 1 shard with a replication factor of 3? Do the pauses occur on all 3 servers? Is the traffic evenly balanced across those servers? Jeremy Branham jb...@allstate.com On 1/15/19, 9:50 AM, "Erick Erickson" wrote: Well, it was a nice theory anyway. "Other collections with the same settings" doesn't really mean much unless those other collections are very similar, especially in terms of numbers of docs. You should only see a new searcher opening when you do a hard-commit-with-opensearcher-true or soft commit. So what happens when you just try lowering the autowarm count? I'm assuming you're free to test in some non-prod system. Focusing on the hit ratio is something of a red herring. Remember that each entry in your filterCache is roughly maxDoc/8 + a little overhead, the increase in GC pressure has to be balanced against getting the hits from the cache. Now, all that said if there's no correlation, then you need to put a profiler on the system when you see this kind of thing and find out where the hotspots are, otherwise it's guesswork and I'm out of ideas. Best, Erick On Tue, Jan 15, 2019 at 12:06 AM Gael Jourdan-Weil wrote: > > Hi Erick, > > > Thank you for your detailed answer, I better understand autowarming. > > > We have an autowarming time of ~10s for filterCache (queryResultCache is not used at all, ratio = 0.02). > > We increased the size of the filterCache from 6k to 12k (and autowarming size set to same values) to have a better ratio which is _only_ around 0.85/0.90. > > > The thing I don't understand is I should see "Opening new searcher" in the logs everytime a new searcher is opened and thus an autowarming happens, right? > > But I don't see "Opening new searcher" very often, and I don't see it being correlated with the response time peaks. > > > Also, I didn't mention it earlier but, we have other SolrCloud clusters with similar settings and load (~10s filterCache autowarming, 10k entries) and we don't observe the same behavior. > > > Regards, > > > De : Erick Erickson > Envoyé : lundi 14 janvier 2019 17:44:38 > À : solr-user > Objet : Re: Delayed/waiting requests > > Gael: > > bq. Nevertheless, our filterCache is set to autowarm 12k entries which > is also the maxSize > > That is far, far, far too many. 
Let's assume you actually have 12K > entries in the filterCache. > Every time you open a new searcher, 12K queries are executed _before_ > the searcher > accepts any new requests. While being able to re-use a filterCache > entry is useful, one of > the primary purposes is to pre-load index data from disk into memory > which can be > the event that takes the most time. > > The queryResultCache has a similar function. I often find that this > cache doesn't have a > very high hit ratio, but again executing a _few_ of these queries > warms the index from > disk. > > I think of both caches as a map, where the key is the "thing", (fq > clause in the case > of filterCache, the whole query in the case of the queryResultCache). > Autowarming > replays the most recently executed N of these entries, essentially > just as though > they were submitted by a user. > > Hypothesis: You're massively over-warming, and when that kicks in you're seeing > increased CPU and GC pressure leading to the anomalies
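As a rough back-of-the-envelope check of those numbers against the maxDoc/8 estimate quoted earlier in the thread (real usage will be lower, since small filter results are stored as sorted int arrays rather than full bitsets):

  ~14,000,000 docs / 8 ≈ 1.75 MB per full-bitset filterCache entry
  12,000 entries x 1.75 MB ≈ 21 GB in the worst case, well beyond the 12 GB heap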
Can Solr 4.10 work with JDK11
I probably already know the answer to this, but I was still wondering.
Re: Re: Delayed/waiting requests
bq. If I get your point, having a big cache might cause more troubles than help if the cache hit ratio is not high enough because the cache is constantly evicting/inserting entries? Pretty much. Although there are nuances. Right now, you have a 12K autowarm count. That means your cache will eventually always contain 12K entries whether or not you ever use the last 11K! I'm simplifying a bit, but it grows like this. Let's say I start Solr. Initially it has no cache entries. Now I start both querying and indexing. For simplicity, say I have 100 _new_ fq clauses come in between each commit. The first commit will autowarm 100. The next will autowarm 200, then 300.. etc. Eventually this will grow to 12K. So your performance will start to vary depending on how long Solr has been running. Worse. it's not clear that you _ever_ re-use those clauses. One example: fq=date_field:[* TO NOW] NOW is really a Unix timestamp. So issuing the same fq 1 millisecond from the first one will not re-use the entry. In the worst case almost all of your autwarming is useless. It neither loads relevant index data into RAM nor is reusable. Even if you use "date math" to round to, say, a minute, if you run Solr long enough you'll still fill up with useless fq clauses. Best, Erick On Tue, Jan 15, 2019 at 9:33 AM Gael Jourdan-Weil wrote: > > @Erick: > > > We will try to lower the autowarm and run some tests to compare. > > If I get your point, having a big cache might cause more troubles than help > if the cache hit ratio is not high enough because the cache is constantly > evicting/inserting entries? > > > > @Jeremy: > > > Index size: ~20G and ~14M documents > > Server memory available: 256G from which ~30G used and ~100G system cache > > Server CPU count: 32, ~10% usage > > JVM memory settings: -Xms12G -Xmx12G > > > We have 3 servers and 3 clusters of 3 Solr instances. > > That is each server hosts 1 Solr instance for each cluster. > > And, indeed, each cluster only has 1 shard with replication factor 3. > > > Among all these Solr instances, the pauses are observed on only one single > cluster but on every server at different times (sometimes on all servers at > the same time but I would say it's very rare). > > We do observe the traffic is evenly balanced across the 3 servers, around > 30-40 queries per second sent to each server. > > > > Regards, > > Gaël > > > > De : Branham, Jeremy (Experis) > Envoyé : mardi 15 janvier 2019 17:59:56 > À : solr-user@lucene.apache.org > Objet : Re: Re: Delayed/waiting requests > > Hi Gael – > > Could you share this information? > Size of the index > Server memory available > Server CPU count > JVM memory settings > > You mentioned a cloud configuration of 3 replicas. > Does that mean you have 1 shard with a replication factor of 3? > Do the pauses occur on all 3 servers? > Is the traffic evenly balanced across those servers? > > > Jeremy Branham > jb...@allstate.com > > > On 1/15/19, 9:50 AM, "Erick Erickson" wrote: > > Well, it was a nice theory anyway. > > "Other collections with the same settings" > doesn't really mean much unless those other collections are very similar, > especially in terms of numbers of docs. > > You should only see a new searcher opening when you do a > hard-commit-with-opensearcher-true or soft commit. > > So what happens when you just try lowering the autowarm > count? I'm assuming you're free to test in some non-prod > system. > > Focusing on the hit ratio is something of a red herring. 
Remember > that each entry in your filterCache is roughly maxDoc/8 + a little > overhead, the increase in GC pressure has to be balanced > against getting the hits from the cache. > > Now, all that said if there's no correlation, then you need to put > a profiler on the system when you see this kind of thing and > find out where the hotspots are, otherwise it's guesswork and > I'm out of ideas. > > Best, > Erick > > On Tue, Jan 15, 2019 at 12:06 AM Gael Jourdan-Weil > wrote: > > > > Hi Erick, > > > > > > Thank you for your detailed answer, I better understand autowarming. > > > > > > We have an autowarming time of ~10s for filterCache (queryResultCache > is not used at all, ratio = 0.02). > > > > We increased the size of the filterCache from 6k to 12k (and > autowarming size set to same values) to have a better ratio which is _only_ > around 0.85/0.90. > > > > > > The thing I don't understand is I should see "Opening new searcher" in > the logs everytime a new searcher is opened and thus an autowarming happens, > right? > > > > But I don't see "Opening new searcher" very often, and I don't see it > being correlated with the response time peaks. > > > > > > Also, I didn't mention it earlier but, we have other SolrCloud clusters > with similar settings and load (~10s filterCache
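To make the date-math point concrete, a few illustrative fq variants (the field name is made up):

  fq=date_field:[* TO NOW]                 NOW resolves to the current millisecond, so this entry is essentially never re-used
  fq=date_field:[* TO NOW/DAY]             rounds down to midnight, so every query issued that day re-uses a single cache entry
  fq={!cache=false}date_field:[* TO NOW]   opts the clause out of the filterCache entirely when re-use is hopeless anyway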
RE: join query and new searcher on joined collection
I see, thank you very much! > -Original Message- > From: Mikhail Khludnev [mailto:m...@apache.org] > Sent: Tuesday, January 15, 2019 6:45 PM > To: solr-user > Subject: Re: join query and new searcher on joined collection > > It doesn't invalidate anything. It just doesn't matches to the join query > from older collection2 see > https://github.com/apache/lucene- > solr/blob/b7f99fe55a6fb6e7b38828676750b3512d6899a1/solr/core/src/java/o > rg/apache/solr/search/JoinQParserPlugin.java#L570 > So, after commit collection2 following join at collection1 just won't hit > filter cache, and will be cached as new entry and lately the old entry will > be evicted. > > On Tue, Jan 15, 2019 at 5:30 PM Vadim Ivanov < > vadim.iva...@spb.ntk-intourist.ru> wrote: > > > Thanx, Mikhail for reply > > > collection1 has no idea about new searcher in collection2. > > I suspected it. :) > > > > So, when "join" query arrives searcher on collection1 has no chance to use > > filter cache, stored before. > > I suppose it invalidates filter cache, am I right? > > > > &fq={!join score=none from=id fromIndex=collection2 to=field1}*:* > > > > > On Tue, Jan 15, 2019 at 1:18 PM Vadim Ivanov < > > > vadim.iva...@spb.ntk-intourist.ru> wrote: > > > > > > > Sory, I've sent unfinished message > > > > So, query on collection1 > > > > q=*:*{!join score=none from=id fromIndex=collection2 to=field1}*:* > > > > > > > > The question is what happened with autowarming and new searchers on > > > > collection1 when new searcher starts on collection2? > > > > IMHO when request with join comes it's impossible to use caches on > > > > collection1 and ... > > > > Does new searcher starts on collection1 as well? > > > > > > > > > > > > > -Original Message- > > > > > From: Vadim Ivanov [mailto:vadim.iva...@spb.ntk-intourist.ru] > > > > > Sent: Tuesday, January 15, 2019 1:00 PM > > > > > To: solr-user@lucene.apache.org > > > > > Subject: join query and new searcher on joined collection > > > > > > > > > > Solr 6.3 > > > > > > > > > > > > > > > > > > > > I have a query like this: > > > > > > > > > > q=*:*{!join score=none from=id fromIndex=hss_4 to=rpk_hdquotes > > > v=$qq}*:* > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Vadim > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > Sincerely yours > > > Mikhail Khludnev > > > > > > -- > Sincerely yours > Mikhail Khludnev
Re: Can Solr 4.10 work with JDK11
Or let me rephrase the question: what is the minimum Solr version that is JDK11-compatible? On Tue, Jan 15, 2019 at 10:27 AM Pushkar Raste wrote: > I probably already know the answer for this but was still wondering. >
Re: Delayed/waiting requests
On 1/15/2019 10:33 AM, Gael Jourdan-Weil wrote: Index size: ~20G and ~14M documents Server memory available: 256G from which ~30G used and ~100G system cache Server CPU count: 32, ~10% usage JVM memory settings: -Xms12G -Xmx12G Can you create a process listing screenshot as described at this URL? You'll need to use a file sharing website to provide us with a URL to access the file. When done properly, the screenshot provides a lot of useful information. https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue It would be best if the screenshot is gathered when you're experiencing the problem. Thanks, Shawn
Re: Logging fails when starting Solr in Windows using solr.cmd
I faced the same issue as Jakob with solr-7.6.0, eclipse-2018-12 (4.10.0), Java 1.8.0_191. Solution: in the Eclipse run configuration "run-solr", remove "file:" from the argument -Dlog4j.configurationFile="file:${workspace_loc:solr-7.6.0}/solr/server/resources/log4j2.xml"
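Spelled out, the fix is just dropping the "file:" prefix from that VM argument (paths as in the run configuration above):

  before: -Dlog4j.configurationFile="file:${workspace_loc:solr-7.6.0}/solr/server/resources/log4j2.xml"
  after:  -Dlog4j.configurationFile="${workspace_loc:solr-7.6.0}/solr/server/resources/log4j2.xml"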