Hi there,

I have a 6-node SolrCloud cluster with 50 collections. All collections are sharded across all 6 nodes. I am seeing a weird behavior where both replicas of a shard go down, or go into a "recovering" state, and never come back. There is no specific correlation with writes or reads.
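In case it helps anyone check for the same symptom, here is a minimal sketch of how the affected shards could be listed. It assumes the Collections API CLUSTERSTATUS action is available (it is not present in older Solr releases) and that the response has the usual cluster/collections/shards/replicas layout. The function names and the base URL are hypothetical, not something from my setup.

```python
import json
from urllib.request import urlopen


def fetch_cluster_status(base_url):
    """Fetch CLUSTERSTATUS as a dict.

    base_url is assumed to look like "http://<host>:8983/solr".
    """
    url = base_url + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url) as resp:
        return json.load(resp)


def shards_without_active_replica(status):
    """Return (collection, shard, replica_states) for every shard
    in which no replica reports the "active" state."""
    stuck = []
    collections = status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            replicas = shard.get("replicas", {})
            states = sorted(r.get("state", "unknown") for r in replicas.values())
            if "active" not in states:
                stuck.append((coll_name, shard_name, states))
    return stuck
```

Calling shards_without_active_replica(fetch_cluster_status("http://<host>:8983/solr")) periodically would show which shards have lost all of their active replicas, which is the condition described above.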
As a band-aid, I am manually unloading and recreating the cores. In the Solr logs I see this:

org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={coreNodeName=<ip>:8983_solr_testcollection_shard1_replica1&state=recovering&nodeName=<ip>:8983_solr&action=PREPRECOVERY&checkLive=true&core=solr_testcollection_shard1_replica2&wt=javabin&onlyIfLeader=true&version=2} status=0 QTime=99

Have any of you seen this issue before? Is it a known bug that can be fixed with an upgrade? Should I increase the ZooKeeper timeout, maybe?

Any pointers are much appreciated.

Thanks,
Veera