[
https://issues.apache.org/jira/browse/KAFKA-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849121#comment-17849121
]
Alyssa Huang commented on KAFKA-16530:
--------------------------------------
If the leader is removed from the voter set and then tries to update its log
end offset (`updateLocalState`), for instance because of a new removeNode
record, it will first update its own ReplicaState (`getOrCreateReplicaState`),
which will return a _new_ Observer state if its id is no longer in the
`voterStates` map. The endOffset will be updated, and then we'll consider
whether the high watermark can be updated (`maybeUpdateHighWatermark`).
When updating the high watermark, we only look at the `voterStates` map, which
means we won't count the leader's offset as part of the HW calculation. This
_does_ mean it's possible for the HW to drop though. Here's a scenario:
{code:java}
# Before node 1 removal, voterStates contains Nodes 1, 2, 3
Node 1: Leader, LEO 100
Node 2: Follower, LEO 90 <- HW
Node 3: Follower, LEO 85
# Leader processes removeNode record, voterStates contains Nodes 2, 3
Node 1: Leader, LEO 101
Node 2: Follower, LEO 90
Node 3: Follower, LEO 85 <- new HW{code}
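The drop in this scenario can be reproduced with a small standalone sketch. This is not the actual `LeaderState` code: the class `VoterOnlyHwSketch` and its method are hypothetical, and offsets are plain longs rather than `LogOffsetMetadata`. It only mirrors the arithmetic of taking index `voterStates.size() / 2` over the voters' end offsets in descending order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the voter-only calculation; not the real LeaderState.
class VoterOnlyHwSketch {
    // Sort the voters' end offsets in descending order and pick index size / 2.
    // Assumes a non-empty voter map (node id -> log end offset).
    static long highWatermark(Map<Integer, Long> voterEndOffsets) {
        List<Long> offsets = new ArrayList<>(voterEndOffsets.values());
        offsets.sort(Collections.reverseOrder());
        return offsets.get(offsets.size() / 2);
    }

    public static void main(String[] args) {
        // Before removal: voterStates contains Nodes 1, 2, 3.
        System.out.println(highWatermark(Map.of(1, 100L, 2, 90L, 3, 85L))); // prints 90
        // After the leader (Node 1) is removed: voterStates contains Nodes 2, 3.
        System.out.println(highWatermark(Map.of(2, 90L, 3, 85L))); // prints 85
    }
}
```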
We want to make sure the HW does not decrease in this scenario. Perhaps we
could revise `maybeUpdateHighWatermark` to keep factoring the leader's offset
into the HW calculation, regardless of whether it is in the voter set,
e.g.
{code:java}
private boolean maybeUpdateHighWatermark() {
    // Find the largest offset which is replicated to a majority of replicas (the leader counts)
-   List<ReplicaState> followersByDescendingFetchOffset = followersByDescendingFetchOffset();
+   List<ReplicaState> followersAndLeaderByDescFetchOffset = followersAndLeadersByDescFetchOffset();
-   int indexOfHw = voterStates.size() / 2;
+   int indexOfHw = followersAndLeaderByDescFetchOffset.size() / 2;
    Optional<LogOffsetMetadata> highWatermarkUpdateOpt =
-       followersByDescendingFetchOffset.get(indexOfHw).endOffset;
+       followersAndLeaderByDescFetchOffset.get(indexOfHw).endOffset;{code}
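As a rough check of the leader-inclusive idea, here is a minimal sketch under stated assumptions. The class `LeaderInclusiveHwSketch` and its parameters are hypothetical, not Kafka APIs. The leader's end offset is always added to the list, and the leader's own entry in the voter map (if any) is skipped so it is not counted twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of counting the leader even when it is not a voter.
class LeaderInclusiveHwSketch {
    static long highWatermark(int leaderId, long leaderEndOffset,
                              Map<Integer, Long> voterEndOffsets) {
        List<Long> offsets = new ArrayList<>();
        offsets.add(leaderEndOffset); // the leader always counts
        voterEndOffsets.forEach((id, offset) -> {
            if (id != leaderId) {     // avoid counting the leader twice
                offsets.add(offset);
            }
        });
        offsets.sort(Collections.reverseOrder());
        return offsets.get(offsets.size() / 2);
    }

    public static void main(String[] args) {
        // Before removal: leader is Node 1, voters are Nodes 1, 2, 3.
        System.out.println(highWatermark(1, 100L, Map.of(1, 100L, 2, 90L, 3, 85L))); // prints 90
        // After the leader is removed: voters are Nodes 2, 3, leader LEO is 101.
        System.out.println(highWatermark(1, 101L, Map.of(2, 90L, 3, 85L))); // prints 90
    }
}
```

With the leader counted, the list after removal is 101, 90, 85, so index 1 still yields 90 in the leader-removal scenario.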
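However, this does not cover the case where a follower is removed from the
voter set: the removed follower's offset no longer counts, so the HW can still
drop even with the leader's offset included.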
{code:java}
# Before node 2 removal, voterStates contains Nodes 1, 2, 3
Node 1: Leader, LEO 100
Node 2: Follower, LEO 90 <- HW
Node 3: Follower, LEO 85
# Leader processes removeNode record, voterStates contains Nodes 1, 3
Node 1: Leader, LEO 101
Node 2: Follower, LEO 90
Node 3: Follower, LEO 85 <- new HW{code}
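One possible way to keep the HW from decreasing in this follower-removal scenario is to treat the computed value as a candidate and only ever advance the stored HW. The sketch below is an assumption for illustration, not something this comment proposes and not the actual `LeaderState` logic; `MonotonicHwSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a high watermark that never moves backwards.
class MonotonicHwSketch {
    private long highWatermark = -1L; // -1 means "not yet known" in this sketch

    // Candidate HW: element at index size / 2 of the offsets in descending order.
    static long candidate(List<Long> endOffsets) {
        List<Long> sorted = new ArrayList<>(endOffsets);
        sorted.sort(Collections.reverseOrder());
        return sorted.get(sorted.size() / 2);
    }

    // Applies the candidate only if it advances the HW; returns true on change.
    boolean maybeUpdate(List<Long> endOffsets) {
        long next = candidate(endOffsets);
        if (next <= highWatermark) {
            return false;
        }
        highWatermark = next;
        return true;
    }

    long highWatermark() {
        return highWatermark;
    }

    public static void main(String[] args) {
        MonotonicHwSketch hw = new MonotonicHwSketch();
        hw.maybeUpdate(List.of(100L, 90L, 85L)); // Nodes 1, 2, 3 before removal
        System.out.println(hw.highWatermark());   // prints 90
        hw.maybeUpdate(List.of(101L, 85L));       // Nodes 1, 3 after Node 2 is removed
        System.out.println(hw.highWatermark());   // prints 90 (candidate 85 is ignored)
    }
}
```

A guard like this only hides the lower candidate. Whether a lower computed value should instead be logged or treated as an error is a separate design question.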
> Fix high-watermark calculation to not assume the leader is in the voter set
> ---------------------------------------------------------------------------
>
> Key: KAFKA-16530
> URL: https://issues.apache.org/jira/browse/KAFKA-16530
> Project: Kafka
> Issue Type: Sub-task
> Components: kraft
> Reporter: José Armando García Sancio
> Assignee: Alyssa Huang
> Priority: Major
> Fix For: 3.8.0
>
>
> When the leader is being removed from the voter set, the leader may not be in
> the voter set. This means that kraft should not assume that the leader is
> part of the high-watermark calculation.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)