Re: SolrCloud scaling/optimization for high request rate

Erick Erickson Mon, 29 Oct 2018 08:21:31 -0700

Speaking of your caches... Either it's a problem with the metrics
reporting or your warmup times are very, very long. 11 seconds and, er,
52 seconds! My guess is that you have your autowarm counts set to a
very high number and are consuming a lot of CPU time every time a
commit happens. Which will only happen when indexing. I usually start
autowarms for these caches at < 20.

Quick note on autowarm: These caches are a map with the key being the
query and the value being some representation of the docs that satisfy
it. Autowarming just replays the most recently used N of these.
documentCache can't be autowarmed, so we can ignore it.
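
To make the "replay the most recently used N" idea concrete, here is a
toy model in Python. It is only a sketch of the concept, not Solr's
implementation: ToyQueryCache, warm_new_cache and the searcher callables
are made-up names for illustration.

```python
from collections import OrderedDict


class ToyQueryCache:
    """Toy LRU cache keyed by query string (hypothetical, not Solr code)."""

    def __init__(self, max_size, autowarm_count):
        self.max_size = max_size
        self.autowarm_count = autowarm_count
        # Insertion order doubles as recency order: least recently used first.
        self.entries = OrderedDict()

    def get(self, query, searcher):
        """Return the cached result, or run the query and cache it."""
        if query in self.entries:
            self.entries.move_to_end(query)  # mark as most recently used
            return self.entries[query]
        result = searcher(query)
        self.entries[query] = result
        if len(self.entries) > self.max_size:
            self.entries.popitem(last=False)  # evict least recently used
        return result

    def warm_new_cache(self, new_searcher):
        """Build the cache for a new searcher by replaying recent keys.

        Every replayed key is a real query against the new searcher, so
        the cost of warming grows with autowarm_count.
        """
        new_cache = ToyQueryCache(self.max_size, self.autowarm_count)
        if self.autowarm_count > 0:
            recent = list(self.entries)[-self.autowarm_count:]
        else:
            recent = []  # autowarm disabled: start with an empty cache
        for query in recent:
            new_cache.entries[query] = new_searcher(query)
        return new_cache
```

The point of the sketch is the loop in warm_new_cache: each commit that
opens a new searcher pays for autowarm_count query executions before
the new searcher serves traffic.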

So in your case, the main value of autowarming the queryResultCache is
to read into memory all of the parts of the index needed to satisfy
those queries, including, say, the sort structures (docValues), the
index terms and, really, whatever is necessary. Ditto for the filterCache.

The queryResultCache was originally intended to support paging; it
only holds a few doc IDs per query. Memory-wise it's pretty
insignificant. Your hit ratio indicates you're not paging. All that
said, the autowarm bit is much more important, so I wouldn't disable it
entirely.

Each filterCache entry is bounded by maxDoc/8 size-wise (plus some
extra, but that's the number that usually counts). It may be smaller
for sparse result sets but we can ignore that for now. You usually
want this as small as possible and still get a decent hit ratio.
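
As a rough back-of-the-envelope check, the maxDoc/8 bound can be turned
into a worst-case memory estimate. This is a sketch only: the function
name is made up, it ignores the "plus some extra" overhead, and it
assumes every entry is a full bitset (sparse entries would be smaller).

```python
def filter_cache_upper_bound_bytes(max_doc, cache_size):
    """Worst-case filterCache memory: one bit per document per entry.

    max_doc    -- maxDoc of the core
    cache_size -- configured number of filterCache entries
    """
    bytes_per_entry = (max_doc + 7) // 8  # maxDoc/8, rounded up
    return bytes_per_entry * cache_size
```

For example, with hypothetical numbers of 8,000,000 docs and 512
entries, that is 1,000,000 bytes per entry and 512,000,000 bytes
(roughly 512 MB) if the cache fills up.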

The entire purpose of autowarm is so that the _first_ query that's run
after a commit (hard with openSearcher=true or soft) isn't noticeably
slower due to having to initially load parts of the index into memory.
As the autowarm count goes up you pretty quickly hit diminishing
returns.

Now, all that may not be the actual problem, but here's a quick way to test:

Turn your autowarm counts off. What you should see is a correlation
between when a commit happens and a small spike in response time for
the first few queries, but otherwise a better query response profile.
If that's true, try gradually increasing the autowarm count 10 at a
time. My bet: If this is germane, you'll pretty soon see no difference
in response times as you increase your autowarm count. I.e. there'll
be no noticeable difference between 20 and 30 for instance. And your
autowarm times will be drastically smaller. And most of the CPU you're
expending to autowarm will be freed up to actually satisfy user
queries.
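
If you record the numbers while stepping the autowarm count up, a small
helper like the one below can point at where the returns flatten out.
This is a hypothetical sketch: the function name, the shape of the
measurements dict (autowarm count mapped to the post-commit response
time you measured, in ms) and the tolerance are all assumptions, and
the measurements have to come from your own tests.

```python
def smallest_sufficient_autowarm(measurements, tolerance_ms):
    """Return the smallest autowarm count past which gains are negligible.

    measurements -- dict mapping autowarm count to the measured response
                    time (ms) of the first queries after a commit
    tolerance_ms -- improvement below this is treated as "no difference"
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    counts = sorted(measurements)
    for lower, higher in zip(counts, counts[1:]):
        improvement = measurements[lower] - measurements[higher]
        if improvement <= tolerance_ms:
            # Going from `lower` to `higher` bought nothing noticeable.
            return lower
    # Every step still helped; the largest tested count is the best so far.
    return counts[-1]
```

Whatever count it returns, also compare the warmup times reported for
the caches at that setting against what you see now.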

If any of that speculation is borne out, you have something that'll
help. Or you have another blind alley ;)

Best
Erick

On Mon, Oct 29, 2018 at 8:00 AM Sofiya Strochyk <s...@interlogic.com.ua> wrote:
>
> Hi Deepak and thanks for your reply,
>
>
> On 27.10.18 10:35, Deepak Goel wrote:
>
>
> Last, what is the nature of your requests? Are the queries the same? Or are
> they very random? Random queries would need more tuning than if the queries
> are the same.
>
> The search term (q) is different for each query, and filter query terms (fq) 
> are repeated very often. (so we have very little cache hit ratio for query 
> result cache, and very high hit ratio for filter cache)
>
> --
> Sofiia Strochyk
>
>
>
> s...@interlogic.com.ua
>
> www.interlogic.com.ua
>
>
