We are using dedicated hosts running CentOS on EC2 r5.12xlarge instances (48 vCPUs, ~360 GB RAM), 2 nodes, with swappiness set to 1 and a 2 TB general purpose EBS SSD volume. The JVM heap is 18 GB, with the G1 garbage collector enabled. There are about 92 collections, with an average of 8 shards and 2 replicas each. Most updates arrive through daily batch updates.
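For reference, a heap and GC setup matching the description above would normally be configured in solr.in.sh; a minimal sketch (the pause-time target shown is an illustrative assumption, not a value taken from our setup) looks like:

    # bin/solr reads these at startup
    SOLR_HEAP="18g"
    GC_TUNE="-XX:+UseG1GC -XX:MaxGCPauseMillis=250"

The rest of the RAM (roughly 340 GB) is then left to the OS page cache for the memory-mapped index files.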
Our Solr disk utilization is about ~800 GB. Most of the collection space serves real-time GET (/get) calls. The issue we are having is with the few collections that have a query use case; one such collection has 32 replicas in total (16 shards with 2 replicas each). During performance testing, a few calls show high response times. This is noticeable when the test duration is short; response times improve when the test runs for a longer duration.

Hope this information helps.

Regards
Ganesh

On Sun, May 10, 2020, 8:14 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/10/2020 4:48 PM, Ganesh Sethuraman wrote:
> > The additional info is that when we execute the test for longer (20 mins)
> > we are seeing better response times; however, for a short test (5 mins),
> > if we rerun the test after an hour or so we are seeing slow response
> > times again. Note that we don't update the collection during the test or
> > in between the tests. Does this help to identify the issue?
>
> Assuming Solr is the only software that is running, most operating
> systems would not remove Solr data from the disk cache, so unless you
> have other software running on the machine, it's a little weird that
> performance drops back down after waiting an hour. Windows is an
> example of an OS that *does* proactively change data in the disk cache,
> and on that OS, I would not be surprised by such behavior. You haven't
> mentioned which OS you're running on.
>
> > 3. We have designed our test to mimic reality, where the filter cache is
> > not hit at all. From Solr, we are seeing ZERO filter cache hits. There is
> > about a 4% query and document cache hit rate in prod, and we are seeing
> > no filter cache hits in both QA and PROD.
>
> If you're getting zero cache hits, you should disable the cache that is
> getting zero hits. There is no reason to waste the memory that the
> cache uses, because there is no benefit.
>
> > Given that, could this be some warm-up related issue with keeping the
> > Solr / Lucene memory-mapped files in RAM? Is there any way to measure
> > which collection is using memory? We do have 350 GB RAM, but we see it
> > full with buffer cache, and we are not sure what is actually using this
> > memory.
>
> You would have to ask the OS which files are contained by the OS disk
> cache, and even if that information is available, it may be very
> difficult to get. There is no way Solr can report this.
>
> Thanks,
> Shawn
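(Following up on the last point above about asking the OS which files are in its disk cache: one rough way to check, assuming the vmtouch utility is installed and a stock Solr data layout, is to point it at a core's index directory, for example

    vmtouch -v /var/solr/data/<collection>_shard1_replica_n1/data/index

which reports how many pages of each index file are currently resident in the page cache. The path is only illustrative; the actual core directory names depend on how the collections were created.)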