I have a SolrCloud index with 1 shard (a leader and one replica) and a 3-node ZooKeeper ensemble. The Solr nodes are behind a load balancer, and a single CloudSolrServer client updates the index. The index schema includes 3 ExternalFileFields.

When the CloudSolrServer client issues a hard commit, I observe that the commits occur sequentially on the leader and the replica, not in parallel. Each commit takes about a minute, and most of that time is spent reloading the 3 ExternalFileField files. Because the commits run one after the other, there is a window of a minute or more during which the leader and replica searchers return different results, so users can see inconsistent results depending on which node the load balancer routes their query to. This will get worse as replicas are added for auto-scaling. The goal is to keep all replicas in sync with respect to user queries.
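For reference, the client-side commit is essentially the following (a minimal sketch; the ZK connect string and collection name are illustrative, not my exact configuration):

import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class CommitSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative ZK connect string and collection name.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("collection1");

        // ... document adds/updates happen here ...

        // Hard commit. This is the call that takes ~1 minute per node, and the
        // leader and replica appear to run it one after the other rather than
        // concurrently.
        server.commit();

        server.shutdown();
    }
}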
My questions:

1. Is there a reason that the distributed commits are done in sequence rather than in parallel? Is there a way to change this behavior?

2. If instead the commits were done in parallel by a separate client via a GET to each Solr instance, how would this client get the host/port values for each Solr instance from ZooKeeper? Are there any downsides to doing commits this way?

Thanks,
Peter