jonmv commented on code in PR #1925:
URL: https://github.com/apache/zookeeper/pull/1925#discussion_r998888168
##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java:
##########
@@ -756,13 +760,21 @@ protected void syncWithLeader(long newLeaderZxid) throws Exception {
     zk.startupWithoutServing();
     if (zk instanceof FollowerZooKeeperServer) {
         FollowerZooKeeperServer fzk = (FollowerZooKeeperServer) zk;
-        for (PacketInFlight p : packetsNotCommitted) {
+        fzk.syncProcessor.setDelayForwarding(true);
+        for (PacketInFlight p : packetsNotLogged) {
             fzk.logRequest(p.hdr, p.rec, p.digest);
         }
-        packetsNotCommitted.clear();
+        packetsNotLogged.clear();
Review Comment:
This was the bug that caused the learner to crash during sync: on receiving
NEWLEADER it "forgot" any earlier PROPOSAL, and then failed to match that
PROPOSAL with a later COMMIT if one was sent during the sync.
In turn, the crash forced the learner to re-sync, which could trigger the same
crash again under heavy concurrent write traffic. It also left duplicate series
of transactions in the transaction logs, resulting in a transaction digest
mismatch on that server (though the data view was otherwise consistent).
So we need to separate what has not yet been written to the log from what has
not yet been matched with a COMMIT, which is what these two queues are for.
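
To make the separation concrete, here is a minimal, simplified sketch of the idea. It is not the actual `Learner` code: the class `TwoQueueSketch`, its methods, and the use of bare zxids instead of `PacketInFlight` objects are all hypothetical, chosen only to show why clearing the "not logged" queue on NEWLEADER must leave the "not committed" queue untouched.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the two queues; names and methods are illustrative only.
class TwoQueueSketch {
    // Proposals received but not yet written to the transaction log.
    private final Deque<Long> packetsNotLogged = new ArrayDeque<>();
    // Proposals received but not yet matched with a COMMIT.
    private final Deque<Long> packetsNotCommitted = new ArrayDeque<>();
    private final List<Long> txnLog = new ArrayList<>();
    private final List<Long> committed = new ArrayList<>();

    // A PROPOSAL is pending in both senses until each condition is resolved.
    void onProposal(long zxid) {
        packetsNotLogged.add(zxid);
        packetsNotCommitted.add(zxid);
    }

    // On NEWLEADER, log everything pending and clear ONLY the log queue.
    // Clearing packetsNotCommitted here would reproduce the "forgotten" PROPOSAL.
    void onNewLeader() {
        txnLog.addAll(packetsNotLogged);
        packetsNotLogged.clear();
    }

    // A COMMIT must match the oldest proposal still awaiting a commit.
    void onCommit(long zxid) {
        Long head = packetsNotCommitted.peekFirst();
        if (head == null || head != zxid) {
            throw new IllegalStateException("COMMIT " + zxid + " has no matching PROPOSAL");
        }
        packetsNotCommitted.removeFirst();
        committed.add(zxid);
    }

    List<Long> txnLog() { return txnLog; }
    List<Long> committed() { return committed; }

    public static void main(String[] args) {
        TwoQueueSketch learner = new TwoQueueSketch();
        learner.onProposal(1L);   // PROPOSAL arrives during sync
        learner.onNewLeader();    // logged, but still awaiting COMMIT
        learner.onCommit(1L);     // COMMIT arrives later and still matches
        System.out.println("logged=" + learner.txnLog() + " committed=" + learner.committed());
    }
}
```

With a single queue cleared in `onNewLeader()`, the later `onCommit(1L)` would find nothing to match, which corresponds to the crash described above.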