Hi,

I have a strange situation. I created a collection across 4 nodes (separate servers, numShards=4) and then proceeded to index data. All had seemingly been well until this morning, when I had to reboot one of the nodes.

After the reboot, that node went into recovery mode! This makes no sense to me, as there is 1 shard per node (no replicas).

What could possibly have happened to 1) trigger a recovery, and 2) make the node think it has a replica to recover from?

The graph on the Solr admin page shows that shard1 has disappeared and that the rebooted server now appears, in a recovering state, under the server that is home to shard2.

I then looked at clusterstate.json and it confirms that shard1 is completely missing and shard2 now has a replica. ... I'm baffled, confused, dismayed.
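
For anyone wanting to repeat that check, here is a minimal sketch of comparing a clusterstate.json dump against the expected shard list. It assumes the usual Solr 4.x layout (collection -> "shards" -> shard name -> "replicas"). The collection name, node names and sample data are hypothetical, not my actual cluster state.

```python
import json

def summarize_shards(clusterstate, collection, expected_shards):
    """Return (missing_shards, replica_counts) for one collection.

    Assumes the layout: {collection: {"shards": {name: {"replicas": {...}}}}}.
    """
    shards = clusterstate.get(collection, {}).get("shards", {})
    # Shards we expected to see but that are absent from the cluster state.
    missing = sorted(set(expected_shards) - set(shards))
    # Number of replica entries registered under each shard that is present.
    replica_counts = {
        name: len(info.get("replicas", {})) for name, info in shards.items()
    }
    return missing, replica_counts

# Hypothetical sample resembling the situation described: shard1 is gone
# and shard2 has picked up a second (recovering) replica.
sample = json.loads("""
{
  "collection1": {
    "shards": {
      "shard2": {"replicas": {
        "core_node2": {"state": "active", "node_name": "host2:8080_solr"},
        "core_node5": {"state": "recovering", "node_name": "host1:8080_solr"}
      }},
      "shard3": {"replicas": {
        "core_node3": {"state": "active", "node_name": "host3:8080_solr"}
      }},
      "shard4": {"replicas": {
        "core_node4": {"state": "active", "node_name": "host4:8080_solr"}
      }}
    }
  }
}
""")

expected = ["shard1", "shard2", "shard3", "shard4"]
missing, counts = summarize_shards(sample, "collection1", expected)
print("missing shards:", missing)
print("replicas per shard:", counts)
```

With one shard per node and no replicas, every expected shard should be present with a replica count of exactly 1, so anything else flags the kind of inconsistency described above.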

Versions:
Solr 4.4 (4 nodes, each running in a Tomcat container)
ZooKeeper 3.4.5 (5-node ensemble)

Oh, and I'm assuming shard1 is completely corrupt.

I'd really appreciate any insight.

David

PS: I have a backup copy of all the shards. Is there a way to rsync shard1 back into place and "fix" clusterstate.json manually?
