Hello

  I have a 4-node cluster running SolrCloud 4.3.1. I have a few large
collections, each split into 8 shards across the 4 nodes (2 shards per node).
Each shard of the large collections is around 600-700 MB and contains
250K+ documents.

Currently the size of the query cache is around 512 entries. We have a few
jobs that run tail queries on these collections. While these queries run,
the cache hit ratio drops to 0 and CPU usage spikes at the same time.
Latencies are on the order of seconds in this case. I verified that GC
behavior is normal (GC is not what is consuming the CPU).
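
To illustrate the pattern I am describing, here is a minimal Python sketch
(a toy model, not Solr code) that replays queries through a 512-entry cache.
The function name `hit_ratio` and the synthetic query strings are made up for
illustration, and LRU eviction is an assumption on my part.

```python
from collections import OrderedDict


def hit_ratio(queries, cache_size=512):
    """Replay queries through a toy LRU cache and return hits / lookups."""
    cache = OrderedDict()
    hits = 0
    lookups = 0
    for q in queries:
        lookups += 1
        if q in cache:
            hits += 1
            cache.move_to_end(q)  # mark as most recently used
        else:
            cache[q] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / lookups if lookups else 0.0


# Head traffic: 100 distinct queries, repeated, all fit in the cache.
head = [f"q{i % 100}" for i in range(5000)]
# Tail traffic: every query is distinct, so no entry is ever reused.
tail = [f"t{i}" for i in range(5000)]

print(hit_ratio(head))  # 0.98
print(hit_ratio(tail))  # 0.0
```

In this toy model the tail workload never hits the cache at any size, since
no query repeats, which matches the 0 hit ratio I see.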

The following are my questions:


   1. Is it good practice to size the Query Result Cache according to the
   size of the collection (a larger cache for larger collections)?
   2. If most of the queries are tail queries, what is a good way to make
   cache usage effective (a higher hit ratio)?
   3. If, say, all queries miss the cache, is it acceptable for CPU usage
   to spike (to 90%+)?
   4. Is there a recommended shard size (number of documents, size)? A few
   of my collections are 100-200 MB and the large ones are on the order of
   800 MB-1 GB.

Thanks a lot in advance
Nitin
