sanghyeok An created KAFKA-16670:
------------------------------------
Summary: KIP-848 : Consumer will not receive assignment forever
because of concurrent issue.
Key: KAFKA-16670
URL: https://issues.apache.org/jira/browse/KAFKA-16670
Project: Kafka
Issue Type: Bug
Reporter: sanghyeok An
*Related Code*
* Consumer get assignment Successfully :
**
[https://github.com/chickenchickenlove/new-consumer-error/blob/8c1d74db1ec60350c28f5ed25f595559180dc603/src/test/java/com/example/MyTest.java#L35-L57]
* Consumer get be stuck Forever because of concurrent issue:
**
https://github.com/chickenchickenlove/new-consumer-error/blob/8c1d74db1ec60350c28f5ed25f595559180dc603/src/test/java/com/example/MyTest.java#L61-L79
*Unexpected behaviour*
[|https://github.com/chickenchickenlove/new-consumer-error#unexpected-behaviour]
* Broker is sufficiently slow.
* When a KafkaConsumer is created and immediately subscribes to a topic
If both conditions are met, {{Consumer}} can potentially never receive
{{TopicPartition}} assignments and become stuck indefinitely.
In case of new broker and new consumer, when consumer are created, consumer
background thread send a request to broker. (I guess groupCoordinator Heartbeat
request). In that time, if broker does not load metadata from
{{{}__consumer_offset{}}}, broker will start to schedule load metadata. After
broker load metadata completely, consumer background thread think 'this broker
is valid group coordinator'.
However, consumer can send {{subscribe}} request to broker before {{broker}}
reply about {{{}groupCoordinator HeartBeat Request{}}}. In that case, consumer
seems to be stuck.
If both conditions are met, the {{Consumer}} can potentially never receive
{{TopicPartition}} assignments and may become indefinitely stuck. In the case
of a new {{broker}} and new {{{}consumer{}}}, when the consumer is created,
{{consumer background thread}} start to send a request to the broker. (I
believe this is a {{{}GroupCoordinator Heartbeat request{}}}) During this time,
if the {{broker}} has not yet loaded metadata from {{{}__consumer_offsets{}}},
it will begin to schedule metadata loading. Once the broker has completely
loaded the metadata, the {{consumer background thread}} recognizes this broker
as a valid group coordinator. However, there is a possibility that the
{{consumer}} can send a {{subscribe request}} to the {{broker}} before the
{{broker}} has replied to the {{{}GroupCoordinator Heartbeat Request{}}}. In
such a scenario, the {{consumer}} appears to be stuck.
You can check this scenario, in the
{{{}src/test/java/com/example/MyTest#should_fail_because_consumer_try_to_poll_before_background_thread_get_valid_coordinator{}}}.
If there is no sleep time to wait {{{}GroupCoordinator Heartbeat Request{}}},
{{consumer}} will be always stuck. If there is a little sleep time,
{{consumer}} will always receive assignment.
README :
https://github.com/chickenchickenlove/new-consumer-error/blob/main/README.md
--
This message was sent by Atlassian Jira
(v8.20.10#820010)