[ https://issues.apache.org/jira/browse/ZOOKEEPER-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149943#comment-13149943 ]
Patrick Hunt commented on ZOOKEEPER-1277:
-----------------------------------------
I thought about that, but it seemed like a bad idea for two reasons:
1) It would cause all of the clients to disconnect and reconnect unnecessarily,
perhaps introducing instability in the process.
2) Can we guarantee that the leader will give up leadership? I.e., how would we
effect this: exit the JVM on the leader?
In talking with Ben about it in the past (perhaps he's since changed his mind),
he seemed to think that rolling over to a new epoch number (with no leader
re-election) was OK.
> servers stop serving when lower 32bits of zxid roll over
> --------------------------------------------------------
>
> Key: ZOOKEEPER-1277
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1277
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.3.3
> Reporter: Patrick Hunt
> Assignee: Patrick Hunt
> Priority: Blocker
> Fix For: 3.3.4
>
> Attachments: ZOOKEEPER-1277_br33.patch
>
>
> When the lower 32 bits of a zxid "roll over" (a zxid is a 64-bit number whose
> upper 32 bits are treated as the epoch number), the epoch number (upper 32
> bits) is incremented and the lower 32 bits start at 0 again.
> This should work fine, however in the current 3.3 branch the followers see
> this as a NEWLEADER message, which it's not, and effectively stop serving
> clients. Attached clients seem to eventually time out given that heartbeats
> (or any operation) are no longer processed. The follower doesn't recover from
> this.
> I've tested this out on 3.3 branch and confirmed this problem, however I
> haven't tried it on 3.4/3.5. It may not happen on the newer branches due to
> ZOOKEEPER-335, however there is certainly an issue with updating the
> "acceptedEpoch" files contained in the datadir. (I'll enter a separate jira
> for that)
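
To make the rollover described above concrete, here is a minimal Java sketch
of the zxid layout: a 64-bit value with the epoch in the upper 32 bits and a
counter in the lower 32 bits. The class and method names (ZxidSketch, epochOf,
counterOf, makeZxid, next) are hypothetical and chosen for illustration only;
this is not the server's actual code, and it assumes the next zxid is produced
by a plain 64-bit increment.

```java
public final class ZxidSketch {
    private ZxidSketch() {}

    /** Upper 32 bits of the zxid: the epoch number. */
    public static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    /** Lower 32 bits of the zxid: the counter within the epoch. */
    public static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    /** Packs an epoch and a counter into a single 64-bit zxid. */
    public static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    /**
     * Assumption: the next zxid is a plain 64-bit increment, so when the
     * counter is at 0xffffffff the carry bumps the epoch and resets the
     * counter to 0.
     */
    public static long next(long zxid) {
        return zxid + 1;
    }

    public static void main(String[] args) {
        long last = makeZxid(1, 0xffffffffL); // last zxid of epoch 1
        long rolled = next(last);             // carry moves into the upper 32 bits
        System.out.printf("before: epoch=%d counter=0x%x%n", epochOf(last), counterOf(last));
        System.out.printf("after:  epoch=%d counter=0x%x%n", epochOf(rolled), counterOf(rolled));
    }
}
```

Running main shows the epoch going from 1 to 2 while the counter wraps from
0xffffffff to 0x0, with no election involved. Under this sketch's assumption,
that epoch change without a NEWLEADER is the case the followers misread.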