You're right, we're basically working around inherent problems. SolrCloud with a large number of cores is not a combination that yields reliable restarts. Even under the best of conditions (a completely silent environment, with no updates and no selects), if I restart two nodes, each containing ~800 replicas, I am not confident that all collections will spin up.
SOLR-5990 documents one such case, where both replicas in many collections end up in a recovery_failed state.
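For anyone who wants to check for this after a restart, here is a rough sketch of how one might scan the cluster state for stuck replicas. It assumes the JSON layout returned by the Collections API CLUSTERSTATUS action (cluster, then collections, shards, replicas, each replica carrying a "state" field). The function names and the sample data are made up for illustration, and fetching the JSON from a live node is left out.

```python
def find_failed_replicas(cluster_status, bad_states=("recovery_failed", "down")):
    """Return sorted (collection, shard, replica, state) tuples for replicas
    whose state is in bad_states. Assumes a CLUSTERSTATUS-style dict."""
    failed = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in sorted(collections.items()):
        for shard_name, shard in sorted(coll.get("shards", {}).items()):
            for rep_name, rep in sorted(shard.get("replicas", {}).items()):
                state = rep.get("state")
                if state in bad_states:
                    failed.append((coll_name, shard_name, rep_name, state))
    return failed


def shards_with_no_active_replica(cluster_status):
    """Return sorted (collection, shard) pairs in which no replica is 'active',
    i.e. the situation where every replica of a shard failed to come back."""
    dead = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in sorted(collections.items()):
        for shard_name, shard in sorted(coll.get("shards", {}).items()):
            states = [r.get("state") for r in shard.get("replicas", {}).values()]
            if "active" not in states:
                dead.append((coll_name, shard_name))
    return dead


# Hypothetical sample data, shaped like a CLUSTERSTATUS response.
SAMPLE = {
    "cluster": {
        "collections": {
            "coll_a": {"shards": {"shard1": {"replicas": {
                "core_node1": {"state": "recovery_failed"},
                "core_node2": {"state": "recovery_failed"},
            }}}},
            "coll_b": {"shards": {"shard1": {"replicas": {
                "core_node1": {"state": "active"},
                "core_node2": {"state": "down"},
            }}}},
        }
    }
}

if __name__ == "__main__":
    for row in find_failed_replicas(SAMPLE):
        print(row)
    print(shards_with_no_active_replica(SAMPLE))
```

With ~800 replicas per node, the second function is probably the more useful one, since it singles out the collections that did not spin up at all rather than those that merely lost one replica.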