We've got a 15-shard cluster spread across 3 hosts. This morning our Puppet
software rebooted them all, and afterwards the 'range' for each shard has
become null in ZooKeeper. Is there any way to restore this value short of
rebuilding a fresh index?

I've read various questions from people with a similar problem, although in
those cases it is usually a single shard that has become null, which allows
them to infer what the value should be and fix it manually in ZK. In this
case I have no idea what the ranges should be. This is our test cluster, and
checking production I can see that the ranges don't appear to be
predictable based on the shard number.
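
For comparison, here is a minimal sketch of what a naive even split of the
signed 32-bit hash space into 15 contiguous ranges would look like. This is
only my assumption about how the ranges might be derived; I have not
confirmed that it matches what Solr actually assigned, and the real router
may round the boundaries differently. The function names are hypothetical,
and the unsigned hex "min-max" formatting is likewise my guess at the
notation stored in ZK.

```python
def even_hash_ranges(num_shards, lo=-2**31, hi=2**31 - 1):
    """Split the inclusive interval [lo, hi] into num_shards contiguous ranges.

    Assumption: a plain even split, not necessarily what Solr computed.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    total = hi - lo + 1
    ranges = []
    start = lo
    for i in range(num_shards):
        # Derive each boundary from the ideal position so that integer
        # rounding does not accumulate from one range to the next.
        end = lo + total * (i + 1) // num_shards - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def format_range(bounds):
    """Render a (min, max) pair as unsigned 32-bit hex, e.g. '80000000-ffffffff'."""
    low, high = bounds
    return f"{low & 0xFFFFFFFF:x}-{high & 0xFFFFFFFF:x}"


if __name__ == "__main__":
    # Print candidate ranges to compare by eye against a healthy cluster.
    for number, bounds in enumerate(even_hash_ranges(15), start=1):
        print(f"candidate {number}: {format_range(bounds)}")
```

If the printed candidates matched the ranges on a healthy cluster with the
same shard count, that would at least suggest which values to try; if they
don't match, this split tells me nothing about the lost values.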

I'm also not certain why it even occurred. Our test cluster only has a
single replica per shard, so when a JVM is rebooted the cluster is
unavailable... would that cause this? Production has 3 replicas so we can
do rolling reboots.
