Hello,

I have a 4-node cluster running SolrCloud 4.3.1. I have a few large collections sharded 8 ways across all 4 nodes (2 shards per node). Each shard of the large collections is around 600-700 MB and contains 250K+ documents.
Currently the size of the query cache is around 512. We have a few jobs that run tail queries on these collections. While these queries run, the cache hit ratio drops to 0 and CPU spikes at the same time. Latencies are on the order of seconds in this case. I verified that GC behavior is normal (it is not what is consuming the CPU).

The following are my questions:

1. Is it good practice to vary the Query Result Cache size based on the size of the collection (large collections get a large cache)?
2. If most of your queries are tail queries, what is a good way to make cache usage effective (higher hit ratio)?
3. If, let's say, all your queries miss the cache, is it OK for CPU to spike (to 90+%)?
4. Is there a recommended shard size (number of docs, size on disk) to use? A few of my collections are 100-200 MB and the large ones are on the order of 800 MB-1 GB.

Thanks a lot in advance,
Nitin