On 11/3/2020 11:46 PM, raj.yadav wrote:
> We have two parallel systems, one running Solr 8.5.2 and the other Solr 5.4.
> In Solr 5.4, commit time with openSearcher=true is 10 to 12 minutes, while
> in Solr 8 it's around 25 minutes.
Commits on a properly configured and sized system should take a few
seconds, not minutes. 10 to 12 minutes for a commit is an enormous red
flag.
> This is our current caching policy in Solr 8:
>
>   <filterCache class="solr.CaffeineCache"
>                size="32768"
>                initialSize="6000"
>                autowarmCount="6000"/>
This is probably the culprit. Do you know how many entries the
filterCache actually ends up with? What you've said with this config is
"every time I open a new searcher, I'm going to execute up to 6000
queries against the new index." If each query takes one second, running
6000 of them is going to take 100 minutes. I have seen these queries
take a lot longer than one second.
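If you don't already know what the cache actually holds, Solr's stats
will tell you. As a quick sketch (substitute your own host and core
name), something like this will show size, hitratio, and warmupTime for
the filterCache:

  curl 'http://localhost:8983/solr/YOUR_CORE/admin/mbeans?stats=true&cat=CACHE&wt=json'

The same numbers are visible in the admin UI under Plugins / Stats.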
Also, each entry in the filterCache can be enormous, because each entry
is essentially a bitmap with one bit for every document in the index.
Let's say that you have five million documents in your core. Each
filterCache entry is then going to be 625000 bytes, so a full
filterCache of 32768 entries needs 20GB of heap memory -- 20GB of memory
above and beyond everything else that Solr requires. Your message
doesn't say how many documents you have, only that the index is 11GB,
and the document count cannot be deduced from the index size alone.
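Spelling the arithmetic out, with five million documents as the assumed
count:

  bytes per entry = maxDoc / 8 = 5,000,000 / 8 = 625,000 bytes
  full cache      = 625,000 bytes * 32768 entries ≈ 20.5 GB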
> While debugging this we came across this page:
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Slowcommits
I wrote that wiki page.
> Here, one of the reasons mentioned for slow commits is:
>
> "Heap size issues. Problems from the heap being too big will tend to be
> infrequent, while problems from the heap being too small will tend to
> happen consistently."
>
> Can anyone please help me understand the above point?
If your heap is a lot bigger than it needs to be, then what you'll see
is slow garbage collections, but they won't happen very often. If the
heap is too small, then garbage collections will happen REALLY often,
leaving few system resources for actually running the program. This
applies to ANY Java program, not just Solr.
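You can watch this happen with GC logging. As a sketch, assuming Java 9
or later (older JVMs use -XX:+PrintGCDetails instead), a flag like:

  -Xlog:gc*:file=gc.log

added to the JVM options (with Solr's startup scripts, GC_LOG_OPTS in
solr.in.sh is the usual place) will record how often collections run
and how long each pause lasts.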
> System config:
> disk size: 250 GB
> CPU: 8 vcpus
> memory: 64 GiB
> index size: 11 GB
> JVM heap size: 30 GB
That heap seems to be a lot larger than it needs to be. I have run
systems with over 100GB of index, with tens of millions of documents, on
an 8GB heap. My filterCache on each core had a max size of 64, with an
autowarmCount of four ... and commits STILL would take 10 to 15 seconds,
which I consider to be very slow. Most of that time was spent executing
those four queries in order to autowarm the filterCache.
What I would recommend you start with is reducing the size of the
filterCache. Try a size of 128 and an autowarmCount of 8 (a sketch of
that config is below), and see what you get for a hit rate on the
cache. Adjust from there as necessary. I would also reduce the heap
size for Solr -- your heap requirements should drop dramatically with a
smaller filterCache.
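As a starting point, the filterCache definition in solrconfig.xml would
look something like this -- the numbers are only the suggestion above
(initialSize is my own choice, set to match size), not a universal
answer, so tune them against the hit rate you observe:

  <filterCache class="solr.CaffeineCache"
               size="128"
               initialSize="128"
               autowarmCount="8"/>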
Thanks,
Shawn