Hello,

I have a cluster of 16 shards, with 3 replicas per shard. The cluster indexes nested documents and currently holds about 3 billion documents overall (parents and children). Each shard has around 200 million docs and is about 250GB in size. The cluster runs on 12 machines. Each machine has 4 SSD disks, 196GB of RAM, and 4 Solr processes with a 28GB heap each.
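To put the memory budget in numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the figures above, plus one assumption that I have not stated explicitly: that "3 replicas" means 3 copies of each shard in total, spread evenly so that each Solr process hosts one 250GB core. The function and key names are made up for illustration.

```python
# Figures taken from the cluster description above.
SHARDS = 16
REPLICAS_PER_SHARD = 3       # ASSUMPTION: 3 copies of each shard in total
MACHINES = 12
PROCESSES_PER_MACHINE = 4
HEAP_GB = 28                 # heap per Solr process
RAM_GB = 196                 # physical RAM per machine
SHARD_SIZE_GB = 250          # on-disk size of one shard replica


def memory_budget():
    """Return rough per-machine memory figures for the cluster."""
    cores = SHARDS * REPLICAS_PER_SHARD
    processes = MACHINES * PROCESSES_PER_MACHINE
    # ASSUMPTION: cores are spread evenly across all Solr processes.
    cores_per_process = cores / processes
    heap_per_machine = PROCESSES_PER_MACHINE * HEAP_GB
    # RAM not claimed by the JVM heaps; an upper bound on what the OS
    # page cache could use for index files.
    ram_left = RAM_GB - heap_per_machine
    index_per_machine = PROCESSES_PER_MACHINE * cores_per_process * SHARD_SIZE_GB
    return {
        "cores": cores,
        "processes": processes,
        "cores_per_process": cores_per_process,
        "heap_per_machine_gb": heap_per_machine,
        "ram_left_gb": ram_left,
        "index_per_machine_gb": index_per_machine,
        "cacheable_fraction": ram_left / index_per_machine,
    }


if __name__ == "__main__":
    for name, value in memory_budget().items():
        print(f"{name}: {value}")
```

Under that assumption, each machine gives 112GB to heaps and has at most 84GB left for the OS page cache, against roughly 1000GB of index on disk, so only about 8% of the index can be cached at any time.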
I perform periodic indexing throughout the day. Each indexing cycle adds around 1.5 million docs. I keep the indexing load light: 2 processes sending bulks of 20 docs. My use case demands that the documents of an indexing cycle become visible only when the whole cycle finishes.

I tried various ways of combining soft and hard commits:

1. Auto hard commit with time=10secs (openSearcher=false) during the indexing, and an explicit soft commit when the indexing finishes.
2. Auto soft commit with time=10/30/60secs during the indexing.
3. No soft commits at all: auto hard commit with time=10secs (openSearcher=false) during the indexing, and an explicit hard commit with openSearcher=true when the cycle finishes.

With all three methods I encounter pretty much the same problems:

1. Heavy GCs when a soft commit is performed (methods 1 and 2) or when a hard commit with openSearcher=true is performed (method 3). These GCs cause heavy latency: average latency is 3 secs, while latency during the problem is 80secs.
2. If indexing cycles come too often, so that soft commits or hard commits with openSearcher=true occur at short intervals one after another (around 5-10 minutes apart), I start getting many OOM exceptions.

Thank you.

--
View this message in context: http://lucene.472066.n3.nabble.com/severe-problems-with-soft-and-hard-commits-in-a-large-index-tp4204068.html
Sent from the Solr - User mailing list archive at Nabble.com.