Why so many shards?
> On May 10, 2020, at 9:09 PM, Ganesh Sethuraman <ganeshmail...@gmail.com> wrote:
>
> We are using dedicated hosts, CentOS, on EC2 r5.12xlarge (48 CPU, ~360GB RAM), 2 nodes. Swappiness is set to 1, with a general purpose 2TB EBS SSD volume. The JVM heap is 18GB, with G1 GC enabled. There are about 92 collections with an average of 8 shards and 2 replicas each. Most updates arrive through daily batch updates.
>
> Solr disk utilization is about ~800GB. Most of the collections are used for real-time GET (/get) calls. The issue we are having is with the few collections that have a query use case/need. These have 32 replicas (16 shards, 2 replicas each). During performance testing, a few calls show high response times; it is noticeable when the test duration is short, and response times improve when the test runs longer.
>
> Hope this information helps.
>
> Regards
> Ganesh
>
>> On Sun, May 10, 2020, 8:14 PM Shawn Heisey <apa...@elyograg.org> wrote:
>>
>>> On 5/10/2020 4:48 PM, Ganesh Sethuraman wrote:
>>> The additional info is that when we execute the test for longer (20 mins) we are seeing better response times; however, for a short test (5 mins), rerunning the test after an hour or so shows slow response times again. Note that we don't update the collection during the test or in between tests. Does this help to identify the issue?
>>
>> Assuming Solr is the only software that is running, most operating systems would not remove Solr data from the disk cache, so unless you have other software running on the machine, it's a little weird that performance drops back down after waiting an hour. Windows is an example of an OS that *does* proactively change data in the disk cache, and on that OS I would not be surprised by such behavior. You haven't mentioned which OS you're running on.
>>
>>> 3. We have designed our test to mimic reality, where the filter cache is not hit at all. From Solr, we are seeing ZERO filter cache hits. There is about a 4% query and document cache hit rate in prod, and we are seeing no filter cache hits in both QA and PROD.
>>
>> If you're getting zero cache hits, you should disable the cache that is getting zero hits. There is no reason to waste the memory that the cache uses, because there is no benefit.
>>
>>> Given that, could this be a warm-up issue related to keeping the Solr / Lucene memory-mapped files in RAM? Is there any way to measure which collection is using memory? We do have 350GB of RAM, and we see it full with buffer cache, but we are not really sure what is actually using this memory.
>>
>> You would have to ask the OS which files are contained in the OS disk cache, and it's possible that even if the information is available, it is very difficult to get. There is no way Solr can report this.
>>
>> Thanks,
>> Shawn
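
To confirm the "zero filter cache hits" observation before removing the cache, the per-core cache counters can be read from Solr's Metrics API. The following is a minimal Python sketch, not something posted in the thread: the node address (localhost:8983) and the metric field names (CACHE.searcher.filterCache, cumulative_lookups, cumulative_hits) are assumptions that should be checked against the Solr version in use.

import json
import urllib.request

SOLR = "http://localhost:8983/solr"   # assumed address of one Solr node

def cache_stats(cache="filterCache"):
    # Ask the Metrics API for the searcher cache entries of every core on this node.
    url = ("%s/admin/metrics?group=core&prefix=CACHE.searcher.%s&wt=json"
           % (SOLR, cache))
    with urllib.request.urlopen(url) as resp:
        metrics = json.load(resp)["metrics"]
    for registry, entries in sorted(metrics.items()):   # one registry per core
        stats = entries.get("CACHE.searcher." + cache, {})
        # Field names below are assumptions; releases may differ slightly.
        lookups = stats.get("cumulative_lookups", 0)
        hits = stats.get("cumulative_hits", 0)
        ratio = hits / lookups if lookups else 0.0
        print("%-55s lookups=%-10d hits=%-10d hitratio=%.2f"
              % (registry, lookups, hits, ratio))

if __name__ == "__main__":
    for cache in ("filterCache", "queryResultCache", "documentCache"):
        print("== " + cache)
        cache_stats(cache)

If every core reports zero hits for a cache across a realistic test, commenting that cache's element out of solrconfig.xml, as suggested above, releases the heap it would otherwise occupy.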
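
On the question of which files are sitting in the OS disk cache: on Linux this can be measured per file with the mincore(2) system call (the vmtouch utility wraps the same idea). The sketch below is a Linux-only illustration and is not from the thread; the index directory argument and the hard-coded PROT_READ/MAP_SHARED constants are assumptions to verify on your distribution.

import ctypes
import ctypes.util
import os
import sys

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

PAGE = os.sysconf("SC_PAGE_SIZE")
PROT_READ, MAP_SHARED = 0x1, 0x01            # Linux values (assumed)
MAP_FAILED = ctypes.c_void_p(-1).value

def resident_pages(path):
    """Return (resident, total) page counts for one file without reading it."""
    size = os.path.getsize(path)
    if size == 0:
        return 0, 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Map the file but do not touch it, so the check itself does not
        # pull pages into the cache.
        addr = libc.mmap(None, size, PROT_READ, MAP_SHARED, fd, 0)
        if addr in (None, MAP_FAILED):
            raise OSError(ctypes.get_errno(), "mmap failed for " + path)
        try:
            pages = (size + PAGE - 1) // PAGE
            vec = (ctypes.c_ubyte * pages)()
            if libc.mincore(ctypes.c_void_p(addr), ctypes.c_size_t(size), vec) != 0:
                raise OSError(ctypes.get_errno(), "mincore failed for " + path)
            return sum(b & 1 for b in vec), pages
        finally:
            libc.munmap(ctypes.c_void_p(addr), ctypes.c_size_t(size))
    finally:
        os.close(fd)

if __name__ == "__main__":
    index_dir = sys.argv[1]          # e.g. the data/index directory of one core
    for name in sorted(os.listdir(index_dir)):
        full = os.path.join(index_dir, name)
        if os.path.isfile(full):
            res, tot = resident_pages(full)
            pct = 100.0 * res / tot if tot else 0.0
            print("%6.1f%% cached  %s" % (pct, name))

Running it against a core's index directory right after a short test, and again after a longer run, would show whether the cached percentages differ enough to support the warm-up explanation discussed above.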