Why so many shards?

> On May 10, 2020, at 9:09 PM, Ganesh Sethuraman <ganeshmail...@gmail.com> 
> wrote:
> 
> We are using dedicated hosts running CentOS on EC2 r5.12xlarge instances
> (48 CPUs, ~360GB RAM), 2 nodes, with swappiness set to 1 and general-purpose
> 2TB EBS SSD volumes. The JVM heap is 18GB with G1 GC enabled. There are
> about 92 collections with an average of 8 shards and 2 replicas each. Most
> updates come in through daily batch updates.
> 
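To sanity-check the "why so many shards?" question: 92 collections with ~8
shards and 2 replicas each is on the order of 1,500 cores across two nodes.
Here is a rough sketch (Python; it assumes plain HTTP access to one node on
the default port, adjust to your setup) that tallies cores per node from
CLUSTERSTATUS:

    import json
    import urllib.request
    from collections import Counter

    SOLR = "http://localhost:8983/solr"   # assumption: adjust host/port

    url = SOLR + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urllib.request.urlopen(url) as resp:
        cluster = json.load(resp)["cluster"]

    # Count every replica (core) hosted on each node.
    cores_per_node = Counter()
    for coll in cluster["collections"].values():
        for shard in coll["shards"].values():
            for replica in shard["replicas"].values():
                cores_per_node[replica["node_name"]] += 1

    for node, cores in cores_per_node.most_common():
        print(node, cores, "cores")

That many cores per node sharing one 18GB heap and one EBS volume is a lot;
I'd check that number before anything else.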
> Solr disk utilization is about ~800GB. Most of that space belongs to
> collections served by real-time GET (/get) calls. The issue we are having
> is with a few collections that serve a query use case; one of these has 32
> replicas (16 shards with 2 replicas each). During performance testing, a
> few calls show high response times. This is noticeable when the test
> duration is short; response times improve when the test runs longer.
> 
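When comparing the short and long test runs, it may help to measure the
warm-up curve directly rather than eyeballing averages. A minimal sketch
(Python; the host, collection name and queries below are placeholders, not
your actual setup) that compares p95 latency for the first and second half
of a run:

    import json
    import time
    import urllib.parse
    import urllib.request

    SOLR = "http://localhost:8983/solr"   # assumption: adjust host/port
    COLLECTION = "querycollection"        # placeholder collection name

    def query_ms(q):
        """Run one query against /select and return its latency in ms."""
        params = urllib.parse.urlencode({"q": q, "rows": 10, "wt": "json"})
        url = "%s/%s/select?%s" % (SOLR, COLLECTION, params)
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            json.load(resp)
        return (time.perf_counter() - start) * 1000.0

    def p95(samples):
        ordered = sorted(samples)
        return ordered[int(len(ordered) * 0.95)]

    queries = ["field1:foo", "field1:bar", "field2:[10 TO 20]"]  # placeholders
    timings = [query_ms(q) for q in queries * 50]
    half = len(timings) // 2
    print("first half p95: %.1f ms   second half p95: %.1f ms"
          % (p95(timings[:half]), p95(timings[half:])))

If the second half is consistently much faster, that points at cold caches
(page cache and searcher warm-up) rather than steady-state capacity.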
> Hope this information helps.
> 
> Regards
> Ganesh
> 
>> On Sun, May 10, 2020, 8:14 PM Shawn Heisey <apa...@elyograg.org> wrote:
>> 
>>> On 5/10/2020 4:48 PM, Ganesh Sethuraman wrote:
>>> The additional info is that when we execute the test for a longer time
>>> (20 mins) we see better response times; however, with a short test
>>> (5 mins), when we rerun the test after an hour or so we see slow
>>> response times again. Note that we don't update the collection during
>>> the test or between tests. Does this help to identify the issue?
>> 
>> Assuming Solr is the only software running on the machine, most operating
>> systems would not evict Solr data from the disk cache, so it's a little
>> weird that performance drops back down after waiting an hour.  Windows is
>> an example of an OS that *does* proactively evict data from the disk
>> cache, and on that OS I would not be surprised by such behavior.  You
>> haven't mentioned which OS you're running on.
>> 
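On the "performance drops back down after an hour" point: one way to take
the page cache out of the equation for short tests is to pre-read the index
files before the run. A quick sketch under my assumptions (Linux, Python 3,
and that you point it at the data/index directory of each core you care
about):

    import os
    import sys

    def warm_file(path):
        size = os.path.getsize(path)
        with open(path, "rb") as f:
            if size:
                # Ask the kernel to prefetch the whole file (Linux, Python 3.3+).
                os.posix_fadvise(f.fileno(), 0, size, os.POSIX_FADV_WILLNEED)
            # Reading it also works, and blocks until the data is cached.
            while f.read(8 * 1024 * 1024):
                pass

    def warm_index(index_dir):
        for name in os.listdir(index_dir):
            path = os.path.join(index_dir, name)
            if os.path.isfile(path):
                warm_file(path)

    if __name__ == "__main__":
        # e.g. .../mycollection_shard1_replica_n1/data/index  (hypothetical path)
        for index_dir in sys.argv[1:]:
            warm_index(index_dir)

This only papers over the symptom, of course; it doesn't explain what is
evicting the data in the first place.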
>>> 3. We have designed our test to mimic reality, where the filter cache is
>>> not hit at all. From Solr, we are seeing ZERO filter cache hits. There
>>> are about 4% query and document cache hits in prod, and we see no filter
>>> cache hits in both QA and PROD.
>> 
>> If a cache is getting zero hits, you should disable it.  There is no
>> reason to waste the memory that the cache uses when there is no benefit.
>> 
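Before turning caches off, it may be worth confirming the hit ratios per
core straight from Solr. A small sketch (Python; host and core name are
placeholders, and the exact stat key names vary a bit between Solr
versions, so it just dumps whatever hit-ratio stats it finds):

    import json
    import urllib.request

    SOLR = "http://localhost:8983/solr"        # assumption: adjust host/port
    CORE = "mycollection_shard1_replica_n1"    # placeholder core name

    url = "%s/%s/admin/mbeans?cat=CACHE&stats=true&wt=json" % (SOLR, CORE)
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)

    # With wt=json the "solr-mbeans" section is a flat list alternating
    # category names and their beans, e.g. ["CACHE", {...}].
    beans = data["solr-mbeans"]
    caches = beans[beans.index("CACHE") + 1]
    for name, info in caches.items():
        stats = info.get("stats", {})
        ratios = {k: v for k, v in stats.items() if "hitratio" in k.lower()}
        print(name, ratios if ratios else stats)

A filter cache with zero hits usually means either the queries send no fq
clauses at all or every fq is unique; either way the memory is better spent
elsewhere, as Shawn says.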
>>> Given that, could this be a warm-up related issue with keeping the
>>> Solr/Lucene memory-mapped files in RAM? Is there any way to measure
>>> which collection is using memory? We do have 350GB of RAM and we see it
>>> full of buffer cache, but we're not really sure what is actually using
>>> that memory.
>> 
>> You would have to ask the OS which files are contained in its disk cache,
>> and even if that information is available, it may be very difficult to
>> get.  There is no way Solr can report this.
>> 
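On the "ask the OS" point: on Linux this is actually doable per file with
mincore(2); the vmtouch utility wraps it nicely if you can install it. A
rough sketch of doing it from Python via ctypes, under the assumption that
you run it on the node against one core's data/index directory:

    import ctypes
    import ctypes.util
    import mmap
    import os
    import sys

    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mmap.restype = ctypes.c_void_p
    libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                          ctypes.c_int, ctypes.c_int, ctypes.c_long]
    libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                             ctypes.POINTER(ctypes.c_ubyte)]

    def resident_fraction(path):
        """Fraction of a file's pages currently in the OS page cache."""
        size = os.path.getsize(path)
        if size == 0:
            return 1.0
        fd = os.open(path, os.O_RDONLY)
        try:
            addr = libc.mmap(None, size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
            if addr in (None, ctypes.c_void_p(-1).value):
                raise OSError(ctypes.get_errno(), "mmap failed: %s" % path)
            try:
                pages = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
                vec = (ctypes.c_ubyte * pages)()
                if libc.mincore(addr, size, vec) != 0:
                    raise OSError(ctypes.get_errno(), "mincore failed: %s" % path)
                # The low bit of each byte says whether that page is resident.
                return sum(b & 1 for b in vec) / pages
            finally:
                libc.munmap(addr, size)
        finally:
            os.close(fd)

    if __name__ == "__main__":
        index_dir = sys.argv[1]   # e.g. a core's data/index directory
        for name in sorted(os.listdir(index_dir)):
            path = os.path.join(index_dir, name)
            if os.path.isfile(path):
                print("%5.1f%%  %s" % (100 * resident_fraction(path), name))

Summing that per core directory would give a per-collection picture of what
is actually occupying those 350GB of buffer cache.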
>> Thanks,
>> Shawn
>> 
