[
https://issues.apache.org/jira/browse/KAFKA-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896841#comment-17896841
]
Kirk True commented on KAFKA-16460:
-----------------------------------
Looking at this in depth, there's an interesting race condition with the test
itself because it creates multiple consumers all vying for a single partition.
The consumers all use auto-commit to record their progress.
In the example test failure I inspected, this was the basic order of events:
# Consumer A is assigned the partition
# Consumer A reads 500 records
# Consumer A records a {{records_consumed}} value of 500 (see
[^verifiable_consumer_worker_A.stdout])
# Consumer A's assignment is revoked before it can (auto-)commit its offsets
# Consumer B is assigned the partition
# Consumer B reads multiple batches of records up to the full number of
records produced (see [^verifiable_consumer_worker_B.stdout]), outputs them to
the JSON, and commits the offsets
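To make the ordering above concrete, here is a minimal toy model of the race. It is only a sketch and not the actual VerifiableConsumer or system test code: the {{Partition}} and {{Consumer}} classes, the {{simulate}} function, and the 500-record poll size are hypothetical, and the total of 9891 produced records is an assumption inferred from consumer B's count.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """Toy partition: an end offset plus the group's last committed offset."""
    end_offset: int
    committed_offset: int = 0


@dataclass
class Consumer:
    """Toy consumer that tracks its position and how many records it reported."""
    name: str
    position: int = 0
    records_consumed: int = 0

    def assign(self, partition: Partition) -> None:
        # A newly assigned consumer resumes from the last committed offset.
        self.position = partition.committed_offset

    def poll(self, partition: Partition, max_records: int) -> int:
        # Read up to max_records and report them as consumed right away.
        n = min(max_records, partition.end_offset - self.position)
        self.position += n
        self.records_consumed += n
        return n

    def commit(self, partition: Partition) -> None:
        partition.committed_offset = self.position


def simulate(total_records: int = 9891, first_batch: int = 500):
    partition = Partition(end_offset=total_records)
    a, b = Consumer("A"), Consumer("B")

    a.assign(partition)
    a.poll(partition, first_batch)
    # Consumer A is revoked here, before any (auto-)commit: no a.commit().

    b.assign(partition)  # starts from offset 0, since nothing was committed
    while b.poll(partition, 500):
        pass
    b.commit(partition)
    return a, b, partition


a, b, partition = simulate()
print(a.records_consumed, b.records_consumed, partition.committed_offset)
```

In this model consumer A's 500 records are reported but never committed, so consumer B starts again from the beginning of the partition.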
So from the test runner's standpoint, it sees:
* Consumer A's {{records_consumed}}: 500
* Consumer B's {{records_consumed}}: 9891
In the error message, the value for "consumed records" is the sum of all
the {{records_consumed}} values the test received from the workers (consumers A
and B). The value for "consumed position" is the latest {{minOffset}} value
the test received from the workers (consumer B, in this example).
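Adding the consumed-record counts from both workers (500 + 9891) gives 10391.
Because consumer A never committed its offsets, consumer B appears to have
re-read the same 500 records, so they are counted twice in that total.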
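The arithmetic can be checked with a short sketch of how a test runner might tally the per-worker counts. The {{summarize}} helper and the dictionary layout are hypothetical and do not reflect the real system test code. The produced-record count of 9891 is likewise an assumption, taken from consumer B having read everything that was produced.

```python
def summarize(records_consumed_by_worker: dict, records_produced: int) -> dict:
    """Sum the per-worker counts and compare the total against what was produced."""
    total = sum(records_consumed_by_worker.values())
    return {
        "consumed_records": total,
        # Anything above the produced count was reported more than once.
        "overcount": max(0, total - records_produced),
    }


# Values from the failure described in this comment.
result = summarize({"A": 500, "B": 9891}, records_produced=9891)
print(result)
```

Under these assumptions the overcount equals consumer A's 500 uncommitted records.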
> New consumer times out consuming records in multiple consumer_test.py system
> tests
> ----------------------------------------------------------------------------------
>
> Key: KAFKA-16460
> URL: https://issues.apache.org/jira/browse/KAFKA-16460
> Project: Kafka
> Issue Type: Bug
> Components: clients, consumer, system tests
> Affects Versions: 3.7.0
> Reporter: Kirk True
> Assignee: PoAn Yang
> Priority: Critical
> Labels: kip-848-client-support, system-tests
> Fix For: 4.0.0
>
> Attachments: verifiable_consumer_worker_A.log,
> verifiable_consumer_worker_A.stdout, verifiable_consumer_worker_B.log,
> verifiable_consumer_worker_B.stdout
>
>
> The {{consumer_test.py}} system test fails with the following errors:
> {quote}
> * Timed out waiting for consumption
> {quote}
> Affected tests:
> * {{test_broker_failure}}
> * {{test_consumer_bounce}}
> * {{test_static_consumer_bounce}}