Hi everyone, I ran into a situation in which a portion of the nodes in a Solr 
cluster were stuck in the GONE state (no ephemeral znode under /live_nodes) 
despite the Solr processes being alive and connected to the ZooKeeper tier. For 
context, this is a cluster running Solr 7.7.1 and ZooKeeper 3.4.13 (monitored 
by Exhibitor 1.7.1).

The events that led to this situation, in the order that they occurred:


  *   We have three ZooKeeper nodes: A, B, and C. A is the leader of the 
ensemble.
  *   ZooKeeper A becomes partitioned from ZooKeeper B and C and the Solr tier.
  *   Some Solr nodes log “zkclient has disconnected” warnings and ZooKeeper A 
expires some Solr client sessions due to timeouts. The partition between 
ZooKeeper A and the Solr tier ends, and Solr nodes that were connected to 
ZooKeeper A attempt to renew their sessions but are told their sessions have 
expired. [1]
     *   Note that I’m simplifying: some nodes that were connected to ZooKeeper 
A were able to move their sessions to ZooKeeper B/C before their sessions 
expired. [2]
  *   ZooKeeper A realizes it is not synced with ZooKeeper B and C and closes 
connections with Solr nodes and, apparently, remains partitioned from B/C.
  *   ZooKeeper B and C eventually elect ZooKeeper B as the leader and start 
accepting write requests as they form a quorum.
  *   Solr nodes previously connected to ZooKeeper A that had their sessions 
expire now connect to ZooKeeper B and C; they successfully publish their state 
as DOWN and then attempt to write to /live_nodes to signal that they’re 
reconnected to ZooKeeper.
  *   The writes of the ephemeral znodes to /live_nodes fail with NodeExists 
exceptions [3]. The failed writes are logged on ZooKeeper B. [4]
     *   It looks like a failure mode of “leader becomes partitioned and 
ephemeral znode deletions are not processed by followers” is documented on 
ZOOKEEPER-2348<https://jira.apache.org/jira/browse/ZOOKEEPER-2348>.
  *   ZooKeeper A eventually rejoins the ensemble, and the /live_nodes entries 
that expired after the initial partition are removed when session expirations 
are reprocessed on the new leader (ZooKeeper B). [5]
  *   The Solr nodes whose attempts at writing to /live_nodes failed never try 
again and remain in the GONE state for 6+ hours.
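To make sure I'm describing the failure mode clearly, here's a toy in-memory 
sketch of the last two bullets (purely illustrative Python; the class, method, 
and session names are all my own invention, not ZooKeeper's API). The stale 
ephemeral znode from the expired session is never deleted on the new leader, so 
the reconnected node's create fails with NodeExists, and since nothing retries 
or cleans up, the node stays GONE. The `register_live_node` helper at the end 
is one hypothetical defensive fix: compare the znode's ephemeral owner with the 
current session and delete the stale znode before creating.

```python
# Toy model of the /live_nodes failure mode. NOT the real ZooKeeper API.
class NodeExistsError(Exception):
    pass

class ToyZk:
    """Minimal stand-in for a ZooKeeper ensemble's ephemeral-znode state."""
    def __init__(self):
        self.ephemerals = {}  # path -> owning session id

    def create_ephemeral(self, path, session_id):
        if path in self.ephemerals:
            raise NodeExistsError(path)
        self.ephemerals[path] = session_id

    def ephemeral_owner(self, path):
        return self.ephemerals.get(path)

    def delete(self, path):
        self.ephemerals.pop(path, None)

    def expire_session(self, session_id):
        # What session expiry is supposed to do: drop that session's ephemerals.
        self.ephemerals = {p: s for p, s in self.ephemerals.items()
                           if s != session_id}

def register_live_node(zk, path, session_id):
    """Hypothetical defensive re-registration: delete-if-stale, then create."""
    owner = zk.ephemeral_owner(path)
    if owner is not None and owner != session_id:
        zk.delete(path)  # stale znode left over from an expired session
    zk.create_ephemeral(path, session_id)

zk = ToyZk()
old_session, new_session = 0x1, 0x2
zk.create_ephemeral("/live_nodes/solr1", old_session)

# Partition: the old leader expires old_session, but the resulting ephemeral
# deletion never takes effect on the new leader, so the znode lingers.

# Naive re-register (what I observed): fails once and is never retried.
try:
    zk.create_ephemeral("/live_nodes/solr1", new_session)
except NodeExistsError:
    print("NodeExists; node stays GONE")

# Defensive re-register: succeeds.
register_live_node(zk, "/live_nodes/solr1", new_session)
print(zk.ephemeral_owner("/live_nodes/solr1") == new_session)  # True
```

Again, this is just my mental model of the sequence, not a claim about what 
Solr's ZkController actually does internally.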


Restarting the GONE Solr nodes made them reconnect to ZooKeeper, and everything 
recovered just fine after that.
My questions are:

  *   Is the sequence of events I’ve laid out plausible? Is there anything that 
stands out as being “wrong” that I should look into?
  *   If not, is there anything I can/should be doing to prevent something like 
this from happening again, or at least to reduce the probability of it 
occurring? Or is this just a failure scenario that isn’t handled at the moment 
and merits a ticket?

I wasn’t able to find any mailing list thread or ticket that discussed a 
similar issue, though I’m also not confident my search was exhaustive. ☺ 
Apologies if this is documented somewhere.

Thanks!

Marko

[1] Log of ZooKeeper A expiring a Solr node session and not allowing the node 
to renew it:

2019-04-13 06:58:23.997 [SessionTracker] INFO  
org.apache.zookeeper.server.ZooKeeperServer  – Expiring session 
0x7000014eb60267d, timeout of
15000ms exceeded
…
2019-04-13 06:58:24.015 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO  
org.apache.zookeeper.server.ZooKeeperServer  – Client attempting to renew 
session 0x7000014eb60267d at /XXX.XXX.XXX.XXX:54792
2019-04-13 06:58:24.015 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO  
org.apache.zookeeper.server.ZooKeeperServer  – Invalid session 
0x7000014eb60267d for client /XXX.XXX.XXX.XXX:54792, probably expired
…
2019-04-13 06:58:24.017 [ProcessThread(sid:9 cport:-1):] INFO  
org.apache.zookeeper.server.PrepRequestProcessor  – Processed session 
termination for sessionid: 0x7000014eb60267d

[2] Solr node moving its session from ZooKeeper A to ZooKeeper B/C:

Solr:

2019-04-13 06:58:15.134000 [main-SendThread(zookeeper03.localdomain:2181)] WARN 
 org.apache.zookeeper.ClientCnxn [] – Client session timed out, have not 
heard from server in 10005ms for sessionid 0x900000676eb264b

ZooKeeper A:

2019-04-13 06:58:24.011 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] WARN  
org.apache.zookeeper.server.NIOServerCnxn  – Unable to read additional data 
from client sessionid 0x900000676eb264b, likely client has closed socket
2019-04-13 06:58:24.011 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO  
org.apache.zookeeper.server.NIOServerCnxn  – Closed socket connection for 
client /XXX.XXX.XXX.XXX:55750 which had sessionid 0x900000676eb264b
…
2019-04-13 06:58:24.018 [ProcessThread(sid:9 cport:-1):] INFO  
org.apache.zookeeper.server.PrepRequestProcessor  – Got user-level 
KeeperException when processing sessionid:0x900000676eb264b type:ping 
cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 
Error Path:null Error:KeeperErrorCode = Session moved

ZooKeeper B:

2019-04-13 06:58:16.068 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO  
org.apache.zookeeper.server.ZooKeeperServer  – Client attempting to renew 
session 0x900000676eb264b at /XXX.XXX.XXX.XXX:32836
2019-04-13 06:58:16.068 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO  
org.apache.zookeeper.server.quorum.Learner  – Revalidating client: 
0x900000676eb264b


[3] Example log lines + stack trace of /live_nodes write failure:

2019-04-13 06:58:24.022000 [main-SendThread(zookeeper03.localdomain:2181)] WARN 
 org.apache.zookeeper.ClientCnxn [] – Unable to reconnect to ZooKeeper 
service, session … has expired
2019-04-13 06:59:01.082000 [zkConnectionManagerCallback-6-thread-1] INFO  
org.apache.solr.cloud.ZkController [] – ZooKeeper session re-connected ... 
refreshing core states after session expiration.
2019-04-13 06:59:01.101000 [zkConnectionManagerCallback-6-thread-1] INFO  
org.apache.solr.cloud.ZkController [] – Publish 
node=XXX.XXX.XXX.XXX:8983_solr as DOWN
2019-04-13 06:59:01.105000 [zkConnectionManagerCallback-6-thread-1] INFO  
org.apache.solr.cloud.ZkController [] – Register node as live in 
ZooKeeper:/live_nodes/XXX.XXX.XXX.XXX:8983_solr
2019-04-13 06:59:01.108000 [zkConnectionManagerCallback-6-thread-1] ERROR 
org.apache.solr.cloud.ZkController [] – 
:org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
NodeExists
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:122)
        at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1015)
        at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:919)
        at 
org.apache.solr.common.cloud.SolrZkClient.lambda$multi$9(SolrZkClient.java:604)
        at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:71)
        at 
org.apache.solr.common.cloud.SolrZkClient.multi(SolrZkClient.java:604)
        at 
org.apache.solr.cloud.ZkController.createEphemeralLiveNode(ZkController.java:1089)
        at org.apache.solr.cloud.ZkController.access$500(ZkController.java:149)
        at org.apache.solr.cloud.ZkController$1.command(ZkController.java:376)
        at 
org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:177)
        at 
org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:58)
        at 
org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:151)
        at 
org.apache.solr.common.cloud.SolrZkClient$3.lambda$process$0(SolrZkClient.java:285)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

[4] Example log lines + stack trace of /live_nodes write failure in zk logs:

2019-04-13 06:59:01.405 [ProcessThread(sid:8 cport:-1):] INFO  
org.apache.zookeeper.server.PrepRequestProcessor  – Got user-level 
KeeperException when processing sessionid:0x7006dfdb0d00004 type:multi 
cxid:0x1a zxid:0x2700000035 txntype:-1 reqpath:n/a aborting remaining multi 
ops. Error Path:/live_nodes/XXX.XXX.XXX.XXX:8983_solr Error:KeeperErrorCode = 
NodeExists for /live_nodes/XXX.XXX.XXX.XXX:8983_solr

[5] ZooKeeper B (new leader after ensemble partitions) processing old session 
expirations. Note that sessionId 0x7000014eb60267d is the same session that 
ZooKeeper A expired in [1].

2019-04-13 06:59:16.071 [SessionTracker] INFO  
org.apache.zookeeper.server.ZooKeeperServer  – Expiring session 
0x7000014eb60267d, timeout of 15000ms exceeded
…
2019-04-13 06:59:16.074 [ProcessThread(sid:8 cport:-1):] INFO  
org.apache.zookeeper.server.PrepRequestProcessor  – Processed session 
termination for sessionid: 0x7000014eb60267d

