Hello,

I'm investigating an 8-node Solr 7.2.1 cluster because we have a lot of
problems. For example, when a node fails to import from a DB (maybe it
freezes), the entire cluster goes down. Another is that the leader won't
change even when it is down (all nodes detect that it is down, but no
leader election is triggered), and there are other similar problems. Every
few days we have to recover the cluster because it becomes unstable and
goes down.

The latest problem is three collections that have had nodes in the
"recovery" state for many hours, and the log shows an error saying that
"leader node is not the leader", so I'm trying to change the leader.
After shutting down the "leader" (the other nodes detected it as down, and
I waited about 20 minutes) and trying REBALANCELEADERS and FORCELEADER, I'm
still unable to change the leader on the cluster, which is why I started
looking at ZooKeeper. What I've seen in ZooKeeper is that the leaders are
different from those shown in the Solr admin cluster info, so maybe that's
why the nodes are unable to connect to the real leader and cannot finish
the recovery.
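
A minimal Python sketch of that comparison, under stated assumptions: the
Solr side is the parsed JSON of a Collections API CLUSTERSTATUS response,
and the ZooKeeper side is a mapping already read from the per-shard leader
znodes (assumed here to live under /collections/<name>/leaders/<shard>/leader;
fetching them is not shown). The function names and the sample node names
are hypothetical.

```python
from typing import Dict, Optional, Tuple

ShardKey = Tuple[str, str]  # (collection name, shard name)


def leaders_from_clusterstatus(status: dict) -> Dict[ShardKey, str]:
    """Map each (collection, shard) to the node_name of the replica
    flagged as leader in a parsed CLUSTERSTATUS response."""
    leaders: Dict[ShardKey, str] = {}
    collections = status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica in shard.get("replicas", {}).values():
                # The leader flag is reported as the string "true".
                if str(replica.get("leader", "")).lower() == "true":
                    leaders[(coll_name, shard_name)] = replica.get("node_name")
    return leaders


def find_leader_mismatches(
    solr_leaders: Dict[ShardKey, str],
    zk_leaders: Dict[ShardKey, str],
) -> Dict[ShardKey, Tuple[Optional[str], Optional[str]]]:
    """Return the shards whose leader differs between the two views,
    as (leader per Solr, leader per ZooKeeper); None means no leader."""
    mismatches = {}
    for key in sorted(set(solr_leaders) | set(zk_leaders)):
        solr_leader = solr_leaders.get(key)
        zk_leader = zk_leaders.get(key)
        if solr_leader != zk_leader:
            mismatches[key] = (solr_leader, zk_leader)
    return mismatches


# Hypothetical sample data, only to illustrate the shapes involved.
sample_status = {
    "cluster": {"collections": {"c1": {"shards": {
        "shard1": {"replicas": {
            "core_node1": {"node_name": "node_a:8983_solr",
                           "state": "active", "leader": "true"},
            "core_node2": {"node_name": "node_b:8983_solr",
                           "state": "recovering"},
        }},
        "shard2": {"replicas": {
            "core_node3": {"node_name": "node_c:8983_solr",
                           "state": "recovering"},
        }},
    }}}}
}
sample_zk_leaders = {
    ("c1", "shard1"): "node_b:8983_solr",
    ("c1", "shard2"): "node_c:8983_solr",
}

solr_view = leaders_from_clusterstatus(sample_status)
for shard, (solr_l, zk_l) in find_leader_mismatches(
        solr_view, sample_zk_leaders).items():
    print(shard, "Solr:", solr_l, "ZK:", zk_l)
```

Any shard listed by such a check is one where the two views disagree about
the leader, which is the situation described above.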

All traffic between the cluster nodes and ZK is open to avoid problems (the
VPC is private), so it is not a connection problem.

Is there any way to sync the leader info between Solr and ZK? Also, I'd
like to know if there is a way to force a leader change (FORCELEADER
doesn't work when Solr refuses to change the leader because it says that a
leader already exists).

Thanks!
-- 
_________________________________________

      Daniel Carrasco Marín
      Ingeniería para la Innovación i2TIC, S.L.
      Tlf:  +34 911 12 32 84 Ext: 223
      www.i2tic.com
_________________________________________
