Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk Thu, 16 May 2019 08:27:37 -0700

Thanks to everyone for the suggestions. We managed to get the performance to a bearable level by splitting the index into ~20 separate collections (one collection per country) and spreading them between existing servers as evenly as possible. The largest country is also split into 2 shards. This means that:

1. QPS is lower for each instance, since it only receives requests for the corresponding country.

2. Index size is smaller for each instance, as it only contains documents for the corresponding country.

3. If one instance fails, most of the other instances keep running (possibly except the ones colocated with the failed one).

We didn't make any changes to the main query, but have added a few fields to facet on. This had a small negative impact on performance, but overall everything kept working nicely.
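
For readers who want to try the same layout, client-side routing to a per-country collection might look roughly like the sketch below. It is only an illustration: the `items_` collection prefix, the helper names, and the example host are assumptions and are not taken from the setup described above. The only Solr-specific part is the usual `/solr/<collection>/select` request path.

```python
from urllib.parse import urlencode

# Hypothetical naming scheme: one collection per country, e.g. "items_de".
COLLECTION_PREFIX = "items_"


def collection_for(country_code: str) -> str:
    """Map a country code to its (assumed) collection name."""
    code = country_code.strip().lower()
    if not code.isalpha():
        raise ValueError(f"unexpected country code: {country_code!r}")
    return COLLECTION_PREFIX + code


def select_url(base_url: str, country_code: str, params: dict) -> str:
    """Build a /select URL that targets only the given country's collection."""
    collection = collection_for(country_code)
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Example host and query are placeholders.
    print(select_url("http://localhost:8983/solr", "DE", {"q": "bike", "rows": 10}))
```

Because each request names exactly one collection, a query for one country never touches the indexes of the others, which is where the lower per-instance QPS and index size come from.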



On 14.11.18 12:18, Toke Eskildsen wrote:

On Mon, 2018-11-12 at 14:19 +0200, Sofiya Strochyk wrote:

I'll check if the filter queries or the main query tokenizers/filters
might have anything to do with this, but I'm afraid query
optimization can only get us so far.

Why do you think that? As you tried eliminating sorting and retrieval
previously, the queries are all that's left. There are multiple
performance traps when querying and a lot of them can be bypassed by
changing the index or querying in a different way.

Since we will have to add facets later, the queries will only become
heavier, and there has to be a way to scale this setup and deal with
both higher load and more complex queries.

There is of course a way. It is more a question of what you are willing
to pay.

If you have money, just buy more hardware: We know (with very high
probability) that it will work as your problem is search throughput,
which can be solved by adding more replicas on extra machines.

If you have more engineering hours, you can use them on some of the
things discussed previously:

* Pinpoint query bottlenecks
* Use fewer/more shards
* Apply https://issues.apache.org/jira/browse/LUCENE-8374
* Experiment with different amounts of concurrent requests to see what
gives the optimum throughput. This also tells you how much extra
hardware you need, if you decide you need to expand.


- Toke Eskildsen, Royal Danish Library
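
The last suggestion in the list above (trying different numbers of concurrent requests) can be sketched with a small harness like the one below. This is a minimal illustration, not a tool used in this thread: `fake_send` is a hypothetical stand-in that only sleeps, and would need to be replaced with a function that sends a real query to Solr, ideally replaying queries from production logs.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def measure_throughput(send: Callable[[str], object],
                       queries: List[str],
                       concurrency: int) -> float:
    """Run all queries with `concurrency` worker threads; return queries/second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() waits for completion and re-raises any exception from send().
        list(pool.map(send, queries))
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")


def sweep(send: Callable[[str], object],
          queries: List[str],
          levels: Iterable[int]) -> Dict[int, float]:
    """Measure throughput at each concurrency level, in the order given."""
    return {n: measure_throughput(send, queries, n) for n in levels}


def fake_send(query: str) -> None:
    """Hypothetical stand-in for an HTTP request to Solr; sleeps instead."""
    time.sleep(0.005)


if __name__ == "__main__":
    for n, qps in sweep(fake_send, ["q"] * 200, [1, 2, 4, 8, 16]).items():
        print(f"{n:>3} concurrent: {qps:8.1f} queries/s")
```

Against a real server, throughput would be expected to rise with concurrency up to some point and then flatten or drop. That knee is the optimum referred to above, and it gives a per-machine figure to multiply when estimating extra hardware.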

