On 11/3/2020 11:46 PM, raj.yadav wrote:
We have two parallel systems: one is Solr 8.5.2 and the other is Solr 5.4.
In Solr 5.4, commit time with openSearcher=true is 10 to 12 minutes, while in
Solr 8 it's around 25 minutes.

Commits on a properly configured and sized system should take a few seconds, not minutes. 10 to 12 minutes for a commit is an enormous red flag.

This is our current caching policy of solr_8

<filterCache class="solr.CaffeineCache"
                  size="32768"
                  initialSize="6000"
                  autowarmCount="6000"/>

This is probably the culprit. Do you know how many entries the filterCache actually ends up with? What you've said with this config is "every time I open a new searcher, I'm going to execute up to 6000 queries against the new index." If each query takes one second, running 6000 of them is going to take 100 minutes. I have seen these queries take a lot longer than one second.

Also, each entry in the filterCache can be enormous, depending on the number of docs in the index. Each entry is essentially a bitset, with one bit per document in the core. Let's say that you have five million documents in your core. With five million documents, each entry in the filterCache is going to be 625000 bytes. That means you need 20GB of heap memory for a full filterCache of 32768 entries -- 20GB of memory above and beyond everything else that Solr requires. Your message doesn't say how many documents you have; it only says the index is 11GB. From that, it is not possible for me to figure out how many documents you have.
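You can check that arithmetic with a quick sketch. The five-million-document count is the hypothetical number from the paragraph above, not your actual count:

```python
# filterCache memory math: each cache entry is a bitset, one bit per document.
num_docs = 5_000_000             # hypothetical document count
entry_bytes = num_docs / 8       # bytes per filterCache entry
cache_size = 32768               # max entries from the posted config

total_gb = entry_bytes * cache_size / 1e9
print(f"{entry_bytes:.0f} bytes per entry, {total_gb:.2f} GB for a full cache")

# autowarm cost: up to 6000 queries, assuming one second each
autowarm_minutes = 6000 / 60
print(f"{autowarm_minutes:.0f} minutes to autowarm")
```

Plug in your real document count and the numbers scale linearly.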

While debugging this we came across this page.
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Slowcommits

I wrote that wiki page.

Here one of the reasons for slow commit is mentioned as:
"Heap size issues. Problems from the heap being too big will tend to be
infrequent, while problems from the heap being too small will tend to happen
consistently."

Can anyone please help me understand the above point?

If your heap is a lot bigger than it needs to be, then what you'll see is slow garbage collections, but it won't happen very often. If the heap is too small, then there will be garbage collections that happen REALLY often, leaving few system resources for actually running the program. This applies to ANY Java program, not just Solr.

System config:
disk size: 250 GB
cpu: (8 vcpus, 64 GiB memory)
Index size: 11 GB
JVM heap size: 30 GB

That heap seems to be a lot larger than it needs to be. I have run systems with over 100GB of index, with tens of millions of documents, on an 8GB heap. My filterCache on each core had a max size of 64, with an autowarmCount of four ... and commits STILL would take 10 to 15 seconds, which I consider to be very slow. Most of that time was spent executing those four queries in order to autowarm the filterCache.

What I would recommend you start with is reducing the size of the filterCache. Try a size of 128 and an autowarmCount of 8, see what you get for a hit rate on the cache. Adjust from there as necessary. And I would reduce the heap size for Solr as well -- your heap requirements should drop dramatically with a reduced filterCache.
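As a concrete starting point, that suggestion would look something like this in solrconfig.xml (a sketch only; the initialSize value is my choice, and you should tune size and autowarmCount from the hit rate you actually observe):

```xml
<filterCache class="solr.CaffeineCache"
             size="128"
             initialSize="128"
             autowarmCount="8"/>
```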

Thanks,
Shawn
