[ https://issues.apache.org/jira/browse/KAFKA-19242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sanghyeok An updated KAFKA-19242:
---------------------------------
Description:
Events have been skipped during consumer group rebalancing
([spring-projects/spring-kafka#3703|https://github.com/spring-projects/spring-kafka/issues/3703]).
At first, I thought the problem was caused by Spring Kafka.
However, after debugging the problem, I concluded that it is a
race condition in Kafka.
A race condition exists between the main thread and the consumer
coordinator's heartbeat thread when the main thread attempts to commit via
{{commitSync(...)}} while the consumer coordinator thread is handling a
consumer group rebalance.
In particular, this race condition frequently causes events to be skipped
when {{CooperativeSticky}} is in use.
For more details, please refer to the sequence diagram below.
!image-2025-05-05-19-19-06-147.png|width=592,height=350!
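For anyone trying to reproduce the setup described above, a minimal sketch of the relevant consumer settings might look like the following. It uses only {{java.util.Properties}} with the standard Kafka consumer configuration key names. The class name, method name, broker address, group id, and the choice to disable auto commit (on the assumption that offsets are committed manually via {{commitSync(...)}}) are illustrative assumptions and are not taken from the report.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only: builds consumer settings that
// select the cooperative sticky assignor and turn off auto commit.
class ConsumerConfigSketch {

    // bootstrapServers and groupId are placeholder inputs, not values from the report.
    static Properties cooperativeStickyProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Assumption: offsets are committed manually with commitSync(...),
        // so auto commit is disabled.
        props.setProperty("enable.auto.commit", "false");
        // The assignor the description refers to as CooperativeSticky.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = cooperativeStickyProps("localhost:9092", "demo-group");
        // Print each setting so the configuration can be inspected.
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These properties could be passed to a {{KafkaConsumer}} constructor. Reproducing the skipped events would additionally require a broker and a rebalance that occurs while a commit is in flight, which this sketch does not attempt.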
> Fix commit bugs caused by race condition during rebalancing.
> ------------------------------------------------------------
>
> Key: KAFKA-19242
> URL: https://issues.apache.org/jira/browse/KAFKA-19242
> Project: Kafka
> Issue Type: Bug
> Reporter: sanghyeok An
> Assignee: sanghyeok An
> Priority: Major
> Attachments: image-2025-05-05-19-19-06-147.png
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)