[
https://issues.apache.org/jira/browse/KAFKA-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lianet Magrans resolved KAFKA-17170.
------------------------------------
Fix Version/s: 4.0.0
Reviewer: Lianet Magrans
Resolution: Fixed
> Add test to ensure new consumer acks reconciled assignment even if first HB
> with ack lost
> -----------------------------------------------------------------------------------------
>
> Key: KAFKA-17170
> URL: https://issues.apache.org/jira/browse/KAFKA-17170
> Project: Kafka
> Issue Type: Task
> Components: clients, consumer
> Reporter: Lianet Magrans
> Assignee: 黃竣陽
> Priority: Minor
> Labels: kip-848-client-support, newbie
> Fix For: 4.0.0
>
>
> When a consumer reconciles an assignment, it transitions to ACKNOWLEDGING, so
> that a HB is sent on the next manager poll, without waiting for the interval.
> The consumer transitions out of this ack state as soon as it sends the
> heartbeat, without waiting for a response. This is based on the expectation
> that following heartbeats (sent on the interval) will act as ack, including
> the set of partitions even in case the first ack is lost. This is the
> expected flow:
> # complete reconciliation and send HB1 to ack assignment tp0
> # HB1 times out (or fails in any way) => heartbeat request manager resets
> the sentFields to null (HeartbeatState.reset(), triggered if the request
> fails, or if it gets a response with an Error)
> # following HB will include tp0 (and act as ack), because it will notice
> that tp0 != null (last value sent)
> This seems not to be covered by any test, so we should add a unit test to the
> HeartbeatRequestManager, to ensure that the HB generated in step 3 above
> includes tp0 as I expect :), considering both cases of error: request fails
> (no response) and request gets a response with an Error in it.
> This flow is important because if the member fails to send the reconciled
> partitions in a HB, the broker would remain waiting for an ack that the
> member considers already sent (the broker would wait for the rebalance
> timeout before re-assigning those partitions).
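
The flow described above can be sketched with a small toy model. This is
only an illustration under stated assumptions, not the Kafka client code:
AckHeartbeatSketch, SentFieldsModel, buildRequest() and secondHeartbeat()
are hypothetical names, and the model tracks nothing but the last set of
partitions sent. It is meant to show why resetting the sent fields after a
failed HB1 makes the next HB carry tp0 again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Toy model only: these are NOT the real Kafka client classes.
final class AckHeartbeatSketch {

    /** Remembers the last partitions sent so unchanged values can be omitted. */
    static final class SentFieldsModel {
        private List<String> lastSentPartitions = null;

        /** Returns the partitions to include in the next HB, or null if unchanged. */
        List<String> buildRequest(List<String> currentAssignment) {
            if (Objects.equals(currentAssignment, lastSentPartitions)) {
                return null; // same as last value sent, nothing to include
            }
            lastSentPartitions = new ArrayList<>(currentAssignment);
            return new ArrayList<>(currentAssignment);
        }

        /** Assumed to be called when the request fails or the response has an error. */
        void reset() {
            lastSentPartitions = null;
        }
    }

    /** Sends HB1, applies its outcome, and returns what HB2 would carry. */
    static List<String> secondHeartbeat(boolean firstHeartbeatFailed) {
        SentFieldsModel state = new SentFieldsModel();
        List<String> assignment = List.of("tp0");

        state.buildRequest(assignment); // HB1 carries tp0 as the ack
        if (firstHeartbeatFailed) {
            state.reset();              // timeout or error response
        }
        return state.buildRequest(assignment); // HB2
    }

    public static void main(String[] args) {
        System.out.println("HB2 after failed HB1: " + secondHeartbeat(true));
        System.out.println("HB2 after successful HB1: " + secondHeartbeat(false));
    }
}
```

In this sketch both error cases from the description (no response, or a
response with an Error) collapse into the single firstHeartbeatFailed flag,
since both are described as triggering the same reset. A real unit test
would need to exercise them separately against HeartbeatRequestManager.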
--
This message was sent by Atlassian Jira
(v8.20.10#820010)