I'm running Solr 4.3 in SolrCloud mode and trying to test index recovery, but it's failing. I have one shard with two replicas:

Leader: 10.159.8.105
Replica: 10.159.6.73

To test, I stopped the replica, deleted the 'data' directory, and restarted Solr.

Here is the replica's logging:

INFO - 2013-10-25 12:19:40.773; org.apache.solr.cloud.ZkController; We are http://10.159.6.73:8983/solr/collection/ and leader is http://10.159.8.105:8983/solr/collection/
INFO - 2013-10-25 12:19:40.774; org.apache.solr.cloud.ZkController; No LogReplay needed for core=collection baseURL=http://10.159.6.73:8983/solr
INFO - 2013-10-25 12:19:40.774; org.apache.solr.cloud.ZkController; Core needs to recover:collection
INFO - 2013-10-25 12:19:40.774; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
INFO - 2013-10-25 12:19:40.778; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection recoveringAfterStartup=true
...
ERROR - 2013-10-25 12:20:25.281; org.apache.solr.common.SolrException; Error while trying to recover. core=collection:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was asked to wait on state recovering for 10.159.6.73:8983_solr but I still do not see the requested state. I see state: down live:true
...
ERROR - 2013-10-25 12:20:25.281; org.apache.solr.cloud.RecoveryStrategy; Recovery failed - trying again... (5) core=collection
ERROR - 2013-10-25 12:20:25.281; org.apache.solr.common.SolrException; Recovery failed - interrupted. core=collection
ERROR - 2013-10-25 12:20:25.282; org.apache.solr.common.SolrException; Recovery failed - I give up. core=collection
INFO - 2013-10-25 12:20:25.282; org.apache.solr.cloud.ZkController; publishing core=collection state=recovery_failed

Here is the leader's logging:

INFO - 2013-10-25 12:19:40.883; org.apache.solr.handler.admin.CoreAdminHandler; Going to wait for coreNodeName: 10.159.6.73:8983_solr_collection, state: recovering, checkLive: true, onlyIfLeader: true
INFO - 2013-10-25 12:19:55.886; org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from ZooKeeper...
ERROR - 2013-10-25 12:20:25.277; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: I was asked to wait on state recovering for 10.159.6.73:8983_solr but I still do not see the requested state. I see state: down live:true
(repeats every minute)

Is it valid to simply delete the 'data' directory, or does a znode have to be modified, too? What's the right way to reinitialize and re-sync a core?

Peter
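
P.S. For anyone comparing the two logs, here is a minimal Python sketch that pulls the expected and observed states out of the "asked to wait on state" error. It is not part of Solr: the function name parse_wait_error and the regular expression are my own assumptions, derived only from the message text quoted above, and other Solr versions may word the message differently.

```python
import re

# Pattern built only from the error text quoted in this post (assumption:
# the wording is stable for this Solr version; it may differ elsewhere).
WAIT_ERROR = re.compile(
    r"I was asked to wait on state (?P<expected>\w+) for (?P<node>\S+) "
    r"but I still do not see the requested state\. "
    r"I see state: (?P<seen>\w+) live:(?P<live>true|false)"
)


def parse_wait_error(line):
    """Return the expected/seen states from a wait-for-state error, or None."""
    match = WAIT_ERROR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["live"] = info["live"] == "true"  # convert the flag to a bool
    return info


# The leader's error line from the log above.
sample = (
    "ERROR - 2013-10-25 12:20:25.277; org.apache.solr.common.SolrException; "
    "org.apache.solr.common.SolrException: I was asked to wait on state "
    "recovering for 10.159.6.73:8983_solr but I still do not see the "
    "requested state. I see state: down live:true"
)
result = parse_wait_error(sample)
print(result)
```

Run against the leader's log, this makes the mismatch explicit: the leader waits for the replica to publish "recovering", but keeps seeing "down" while the node is live.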