[ https://issues.apache.org/jira/browse/KAFKA-16984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lianet Magrans updated KAFKA-16984:
-----------------------------------
Description:
When the new consumer attempts to leave a group, it sends a leave group request
in a fire-and-forget mode, so, as soon as the request is generated, it will:
1. transition to UNSUBSCRIBED
2. complete the leaveGroup operation future
This task focuses on point 2, which has the undesired side effect that whatever
was waiting for the leave to complete before doing something else (e.g.
consumer close) will carry on, leading to responses being sent to disconnected
clients, which we've seen when running stress tests.
When leaving a group while closing a consumer, the member sends the leave
request and moves on to the next operation, which is closing the network
thread, so we end up with a disconnected client receiving responses from the
server. We should send the leave group heartbeat and transition to
UNSUBSCRIBED, but only complete the leave operation when we get a response for
it, which is a much more accurate confirmation that the consumer left the group
and can move on with other operations.
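As an illustration only (not the actual client code), the difference between
the two completion strategies can be sketched in Java. All names here
(LeaveGroupSketch, Member, leaveFireAndForget, leaveAwaitingResponse) are
hypothetical, and the leave heartbeat response is modelled as a plain
CompletableFuture. Completing the leave on an error response as well as on a
successful one is an assumption of the sketch, not something this task defines.

```java
import java.util.concurrent.CompletableFuture;

public class LeaveGroupSketch {
    public enum State { STABLE, UNSUBSCRIBED }

    // Hypothetical stand-in for the consumer's group membership logic.
    public static final class Member {
        public State state = State.STABLE;

        // Fire-and-forget: the leave future is already complete when the
        // request has only been generated; the response is ignored.
        public CompletableFuture<Void> leaveFireAndForget(CompletableFuture<Void> response) {
            state = State.UNSUBSCRIBED;
            return CompletableFuture.completedFuture(null);
        }

        // Proposed behaviour: still transition right away, but complete the
        // leave future only once a response (success or error) arrives.
        public CompletableFuture<Void> leaveAwaitingResponse(CompletableFuture<Void> response) {
            state = State.UNSUBSCRIBED;
            CompletableFuture<Void> leave = new CompletableFuture<>();
            response.whenComplete((result, error) -> leave.complete(null));
            return leave;
        }
    }

    public static void main(String[] args) {
        Member member = new Member();
        CompletableFuture<Void> response = new CompletableFuture<>();
        CompletableFuture<Void> leave = member.leaveAwaitingResponse(response);
        System.out.println(member.state + " leaveDone=" + leave.isDone());
        response.complete(null); // simulate the broker's response arriving
        System.out.println("leaveDone=" + leave.isDone());
    }
}
```

With the second variant, a caller such as consumer close can chain the network
thread shutdown on the returned future, so the shutdown only starts after the
response has been received.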
Note that the legacy consumer does wait for a leave response before closing
down the coordinator (see
[AbstractCoordinator|https://github.com/apache/kafka/blob/25230b538841a5e7256b1b51725361dd59435901/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L1135-L1140]),
and we are looking to have the same behaviour on the new consumer.
Note that with this task we'll only focus on changing the behaviour for the
leave operation completion (point 2 above) to tidy up the close flow. We are
not changing the transition to UNSUBSCRIBED, as it would require further
consideration if ever needed.
This is also a building block for future improvements around error handling for
the leave request, which we don't have at the moment (related Jira linked).
was:
When the new consumer attempts to leave a group, it sends a leave group request
in a fire-and-forget mode, so, as soon as the request is generated, it will:
1. transition to UNSUBSCRIBED
2. complete the leaveGroup operation future
This task focuses on point 2, which has the undesired side effect that whatever
was waiting for the leave to complete before doing something else (e.g.
consumer close) will carry on, leading to responses being sent to disconnected
clients, which we've seen when running stress tests.
When leaving a group while closing a consumer, the member sends the leave
request and moves on to the next operation, which is closing the network
thread, so we end up with a disconnected client receiving responses from the
server. We should send the leave group heartbeat and transition to
UNSUBSCRIBED, but only complete the leave operation when we get a response for
it, which is a much more accurate confirmation that the consumer left the group
and can move on with other operations.
Note that the legacy consumer does wait for a leave response before closing
down the coordinator (see
[AbstractCoordinator|https://github.com/apache/kafka/blob/25230b538841a5e7256b1b51725361dd59435901/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L1135-L1140]),
and we are looking to have the same behaviour on the new consumer.
Note that with this task we'll only focus on changing the behaviour for the
leave operation completion (point 2 above) to tidy up the close flow. We are
not changing the transition to UNSUBSCRIBED, as it would require further
consideration if ever needed.
> New consumer should not complete leave operation until it gets a response
> -------------------------------------------------------------------------
>
> Key: KAFKA-16984
> URL: https://issues.apache.org/jira/browse/KAFKA-16984
> Project: Kafka
> Issue Type: Bug
> Components: clients, consumer
> Affects Versions: 3.8.0
> Reporter: Lianet Magrans
> Assignee: Lianet Magrans
> Priority: Major
> Labels: kip-848
> Fix For: 3.9.0
>
>
> When the new consumer attempts to leave a group, it sends a leave group
> request in a fire-and-forget mode, so, as soon as the request is generated,
> it will:
> 1. transition to UNSUBSCRIBED
> 2. complete the leaveGroup operation future
> This task focuses on point 2, which has the undesired side effect that
> whatever was waiting for the leave to complete before doing something else
> (e.g. consumer close) will carry on, leading to responses being sent to
> disconnected clients, which we've seen when running stress tests.
> When leaving a group while closing a consumer, the member sends the leave
> request and moves on to the next operation, which is closing the network
> thread, so we end up with a disconnected client receiving responses from the
> server. We should send the leave group heartbeat and transition to
> UNSUBSCRIBED, but only complete the leave operation when we get a response
> for it, which is a much more accurate confirmation that the consumer left the
> group and can move on with other operations.
> Note that the legacy consumer does wait for a leave response before closing
> down the coordinator (see
> [AbstractCoordinator|https://github.com/apache/kafka/blob/25230b538841a5e7256b1b51725361dd59435901/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L1135-L1140]),
> and we are looking to have the same behaviour on the new consumer.
> Note that with this task we'll only focus on changing the behaviour for the
> leave operation completion (point 2 above) to tidy up the close flow. We are
> not changing the transition to UNSUBSCRIBED, as it would require further
> consideration if ever needed.
>
> This is also a building block for future improvements around error handling
> for the leave request, which we don't have at the moment (related Jira linked).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)