We recently had a case where a ZooKeeper snapshot became corrupt and
ZooKeeper would not restart.
zkCli.sh (shipped with ZooKeeper) failed with an error: unable to connect to /

We have a SolrCloud cluster with two shards (keys are autosharded), running
Solr 4.10.1.

Unfortunately, we did not have a good snapshot to recover from. We are
planning to create a brand new ZooKeeper ensemble and have the Solr nodes
reconnect to it. We do not have a good clusterstate.json to upload to
ZooKeeper.

Our current state: all Solr nodes are operating in read-only mode. No
updates are possible.

This is what we are planning to do now:
1. Delete the snapshots and logs from the ZooKeeper servers
2. Create a brand new data folder
3. Upload the Solr configurations to ZooKeeper
4. With the Solr nodes still running, have them reconnect to ZooKeeper
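
Before step 4, it may be worth confirming that each server in the new ensemble is actually answering. A minimal sketch in Python, assuming ZooKeeper's four-letter commands are enabled on the client port (2181 is assumed here); the function name zk_is_ok is made up for illustration:

```python
import socket


def zk_is_ok(host, port=2181, timeout=5.0):
    """Send ZooKeeper's 'ruok' four-letter command to host:port.

    Returns True only if the server replies 'imok'. Any connection
    problem (refused, timed out, unresolvable host) returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        return False
    return reply == b"imok"
```

For example, zk_is_ok("zk1.example.com") would check one server (the hostname is hypothetical). A True result only means the process is up and serving; it says nothing about whether the Solr configuration or cluster state has been uploaded.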

What I am not clear on is whether each Solr node, as it reconnects, will
identify which shard it originally belonged to. Will clusterstate.json get
created? I don't know the hash ranges, since there is no clusterstate.json.
Or do I need to create a clusterstate.json manually and upload it to
ZooKeeper?
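
On the hash ranges: if the collection was created with two shards under the default router and never split, my understanding is that the ranges are an even split of the 32-bit hash space. The Python sketch below only illustrates that arithmetic; it is not Solr's own code, the function names are made up, and Solr's real partitioning may round the boundaries differently for larger shard counts.

```python
def even_hash_ranges(num_shards):
    """Split the signed 32-bit hash space into num_shards contiguous ranges."""
    lo, hi = -2**31, 2**31 - 1
    step = max(1, (hi - lo) // num_shards)
    ranges = []
    start = lo
    for i in range(num_shards):
        # The last range always ends exactly at the top of the hash space.
        end = hi if i == num_shards - 1 else start + step
        ranges.append((start, end))
        start = end + 1
    return ranges


def to_hex(value):
    """Render a signed 32-bit value as unsigned lowercase hex."""
    return format(value & 0xFFFFFFFF, "x")


def shard_ranges(num_shards):
    """Map assumed shard names (shard1, shard2, ...) to 'min-max' hex ranges."""
    return {
        "shard%d" % (i + 1): "%s-%s" % (to_hex(start), to_hex(end))
        for i, (start, end) in enumerate(even_hash_ranges(num_shards))
    }


print(shard_ranges(2))
# {'shard1': '80000000-ffffffff', 'shard2': '0-7fffffff'}
```

These values are an assumption based on an even split, so they should be checked against any surviving copy of the cluster state (a backup, a log, an old admin UI screenshot) before being put into a hand-written clusterstate.json.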

What is our best recourse now? Any help with disaster recovery is much
appreciated.

Thanks,
Pramod

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-with-recovering-shard-range-after-zookeeper-disaster-tp4284645.html
Sent from the Solr - User mailing list archive at Nabble.com.
