On 6/2/2014 8:24 AM, Jean-Sebastien Vachon wrote:
> We have yet to determine where the exact breaking point is.
>
> The two patterns we are seeing are:
>
> - less cache (around 20-30% hit/ratio), poor performance but
> overall good stability

When caches are too small, a low hit ratio is expected. Increasing them is a good idea, but only increase them a little bit at a time. The filterCache in particular should not be increased dramatically, and neither should its autowarmCount value. Filters can take a very long time to execute, so a high autowarmCount can result in commits taking forever.

Each filter entry can take up a lot of heap memory -- in terms of bytes, it is the number of documents in the core divided by 8. This means that if the core has 10 million documents, each filter entry (for JUST that core) will take over a megabyte of RAM.

> - more cache (over 90% hit/ratio), improved performance but
> almost no stability. In that case, we start seeing messages such as
> "No shards hosting shard X" or "cancelElection did not find election
> node to remove"

This would not be a direct result of increasing the cache size, unless perhaps you've made the caches so *REALLY* big that you're running out of RAM for the heap or the OS disk cache.

> Anyone, has any advice on what could cause this? I am beginning to
> suspect the JVM version, is there any minimal requirements regarding
> the JVM?

Oracle Java 7 is recommended for all releases, and required for Solr 4.8. You just need to stay away from 7u40, 7u45, and 7u51 because of bugs in Java itself. Right now, the latest release is recommended, which is 7u60. The 7u21 release that you are running should be perfectly fine.

With six 9.4GB cores per node, you have roughly 56GB of index data on disk, so you'll achieve the best performance if you have about 60GB of RAM left over for the OS disk cache to use. You did mention that you have 92GB of RAM per node, but you have not said how big your Java heap is, or whether there is other software on the machine that may be eating up RAM for its heap or data.

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
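
For reference, the filter-entry and disk-cache arithmetic above can be sketched in a few lines of Python. This is only a rough estimate that follows the one-bit-per-document rule of thumb from this message; the function names and the cache size of 512 entries are made up for illustration and are not taken from the poster's configuration.

```python
def filter_entry_bytes(num_docs: int) -> int:
    """Rule of thumb from the message: one bit per document in the core,
    so the entry size in bytes is num_docs / 8 (rounded up)."""
    return (num_docs + 7) // 8


def filter_cache_bytes(num_docs: int, cache_entries: int) -> int:
    """Rough heap estimate for a filterCache holding cache_entries
    entries at the rule-of-thumb size, for a single core."""
    return filter_entry_bytes(num_docs) * cache_entries


def index_size_gb(cores: int, gb_per_core: float) -> float:
    """Total index size on disk, which is the amount of RAM the OS disk
    cache would ideally have available."""
    return cores * gb_per_core


if __name__ == "__main__":
    docs = 10_000_000  # the 10-million-document example from the message
    entries = 512      # hypothetical filterCache size, for illustration only

    print(filter_entry_bytes(docs), "bytes per filter entry")
    print(filter_cache_bytes(docs, entries), "bytes for a full filterCache")
    print(round(index_size_gb(6, 9.4), 1), "GB of index data per node")
```

With these assumed numbers, a full filterCache on one core would be in the hundreds of megabytes, which is why the cache should be grown in small steps.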