After none of the JVM configuration options helped with GC, I took a heap dump of one of the misbehaving slaves, as Erick suggested, and analysis shows that the fieldcache is using a large share of the total heap.

Memory Analyzer output:
One instance of "org.apache.solr.uninverting.FieldCacheImpl" loaded by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x7f60f7b38658" occupies 61,234,712,560 (91.86%) bytes. The memory is accumulated in one instance of "java.util.HashMap$Node[]" loaded by "<system class loader>".

Hypothesis:
Without regular indexing, commits are not happening, so the searcher is not being reopened and the field cache is not being reset. Since there is only one instance of this field cache, it is a live object and is not being cleaned up by GC. But I also noticed that the fieldcache section of the Solr UI shows the same entries for all collections on that Solr instance.

Ques 1. Is the field cache reset on commit? If so, is it reset when any of the collections is committed? Or is it not reset at all, and I'm missing something here?

Ques 2. Is there a way to reset/delete this cache every x minutes (the current autocommit duration), irrespective of whether documents were added or not?

Other than this, I think the reason for the huge heap usage (as others have pointed out) is that we are not using docValues for any of the fields, and we use a large number of fields in sorting functions (between 15 and 20 over all queries combined). As the next step on this front, I will add new fields with docValues enabled and reindex the entire collection. Hopefully that will help.

We use quite a few dynamic fields in sorting. There is no mention of using docValues with dynamic fields in the official documentation (https://lucene.apache.org/solr/guide/6_6/docvalues.html).

Ques 3. Do docValues work with dynamic fields or not? If they do, is there anything in particular that I should look out for, like the cardinality of the field (i.e. the number of different x's in example_dynamic_field_x)?

Shawn, I've uploaded my configuration files for the two collections here: https://ufile.io/u6oe0 (tar -zxvf c1a_confs.tar.gz to decompress)

c1 collection is ~10GB when optimized, and has 2.5 million documents.
ca collection is ~2GB when optimized, and has 9.5 million documents.

Please let me know if you think there is something amiss in the configuration that I should fix.

Thanks
Yasoob
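
P.S. To put the Memory Analyzer figure in context, here is a quick back-of-the-envelope calculation of the heap size it implies. This is only a sketch: the two input numbers are copied from the report quoted earlier in this message, the function name is made up for illustration, and I am assuming the percentage is relative to the heap captured in the dump, which may not be the same as the configured maximum heap.

```python
# Figures copied from the Memory Analyzer report quoted in this message.
FIELD_CACHE_BYTES = 61_234_712_560  # retained by the single FieldCacheImpl instance
FIELD_CACHE_SHARE = 0.9186          # 91.86%, assumed to be a share of the dumped heap

GIB = 1024 ** 3


def implied_heap_bytes(retained: int, share: float) -> float:
    """Total heap size implied by a retained size and its share of the heap."""
    return retained / share


total = implied_heap_bytes(FIELD_CACHE_BYTES, FIELD_CACHE_SHARE)
other = total - FIELD_CACHE_BYTES  # everything that is not the field cache

print(f"FieldCache:      {FIELD_CACHE_BYTES / GIB:.2f} GiB")
print(f"Implied heap:    {total / GIB:.2f} GiB")
print(f"Everything else: {other / GIB:.2f} GiB")
```

In other words, the field cache accounts for roughly 57 GiB of a dumped heap of about 62 GiB, which leaves only around 5 GiB for everything else on that slave.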