Hi Olivier,

Can you look at the collections to see whether there are leader-initiated recovery nodes in the ZooKeeper tree? Go into the Solr Admin UI -> Cloud panel -> Tree view and drill into one of the collections that's not recovering:

/collections/<collection>/leader_initiated_recovery/
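
If clicking through the Tree view for many collections is impractical, the same check can be scripted. The following is only a rough sketch, not something tested against your cluster. It assumes a ZooKeeper client object that offers `exists(path)` and `get_children(path)` (kazoo's `KazooClient` is one such client), and it assumes the znodes under `leader_initiated_recovery` are grouped one level down by shard name. The host string and the collection name in the usage comment are hypothetical, and if Solr uses a ZooKeeper chroot, that prefix has to be included in the connection string.

```python
def lir_path(collection, shard=None):
    """Build the leader_initiated_recovery path for a collection (and shard)."""
    base = "/collections/{}/leader_initiated_recovery".format(collection)
    return base if shard is None else "{}/{}".format(base, shard)


def list_lir_znodes(zk, collection):
    """Return {shard: [child znode names]} for one collection.

    `zk` is any client exposing exists(path) and get_children(path).
    An empty dict means the collection has no leader_initiated_recovery node.
    The per-shard layout is an assumption, not confirmed by this thread.
    """
    base = lir_path(collection)
    if zk.exists(base) is None:
        return {}
    found = {}
    for shard in sorted(zk.get_children(base)):
        found[shard] = sorted(zk.get_children(lir_path(collection, shard)))
    return found


# Hypothetical usage with kazoo (host and collection name are placeholders):
#
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="localhost:2181")
#   zk.start()
#   print(list_lir_znodes(zk, "test339"))
#   zk.stop()
```

The sketch only reads from ZooKeeper, so it should be safe to run against a live cluster before deciding whether to remove anything.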

You could try deleting the znodes for one of the shards under that path and see if that shard recovers. Let me know what shakes out, as there may still be a bug in this area of the recovery logic.

On Wed, Jul 29, 2015 at 1:49 PM, Olivier Damiot <olivier.dam...@gmail.com> wrote:
> Hello everybody,
>
> I use Solr 5.2.1 and am having a big problem.
> I have about 1200 collections, 3 shards, replicationFactor = 3,
> maxShardsPerNode = 3.
> I have 3 boxes of 64G (32 JVM).
> I have no problems with the creation of collections or indexing, but when I
> lose a node (VMY full or kill) and I restart, all my collections are down.
> When I look in the logs I can see leader election problems, e.g.:
> - Checking if I (core = test339_shard1_replica1, coreNodeName =
> core_node5) shoulds try and be the leader.
> - Cloud says we are still state leader.
>
> I feel that all the servers pass the buck!
>
> I do not understand this error, especially since from reading the mailing list I
> have the impression that this bug was solved long ago.
>
> What should I do to start my collections properly?
>
> Could someone help me?
>
> Thanks a lot,
>
> Olivier