kirktrue commented on PR #17035: URL: https://github.com/apache/kafka/pull/17035#issuecomment-2327491994
Hi @AndrewJSchofield! Thanks for the review ๐ > First, I think it better not to overload the PollEvent because that's already used in the share consumer. Agreed. I've introduced a `FetchEvent` so that the two separate mechanisms won't step on each others' toes. > Second, it seems to me that there is still the potential for over-fetching, and this will still cause churn of the fetch session cache. Agreed. It aims to _lessen_ the churn. _Preventing_ the churn completely is a future task. > In the case where the consumer is only fetching a single partition, I think it works pretty well. The set of fetchable partitions will be empty if there's buffered data, and contain the only partition in the fetch session if there is not. So, you'll only send a Fetch request when there's a need for more data and the fetch session will not churn. Correct. > In the case where the consumer is fetching more than one partition on a particular node, if a subset of the partitions is fetchable, then the fetch session will be modified by sending a Fetch request and that seems to have the potential for a lot of churn. Correct again! Any partition with buffered data at the point where the fetch request is being generated will be marked as "removed" from the broker's fetch session cache. That's the crux of the problem ๐ Something that I tend to lose sight of is the fact that it's not a foregone conclusion that a fetch session will be evicted when it has partitions removed. Of course, it will increase its eligibility for eviction if the broker hosting the fetch session is resource-constrained and invokes the eviction process. > Of course, all of this code is in common between the legacy consumer and the async consumer. I'm not sure I follow. This code is all specific to the `AsyncKafkaConsumer`. While the `ClassicKafkaConsumer` has a similar race condition, it is 2-4 orders of magnitude less likely to happen. > The async consumer is still very keen on fetching so I don't properly grasp why this PR would make the fetch session behaviour better. Yepโthe design of the `AsyncKafkaConsumer` fetching continuously in the background makes it _very_ keen to cause this problem. With this change, the application thread now signals when to fetch, which results in the background thread creating and issuing the fetch requests _much_ less often. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
