tashoyan opened a new pull request, #12777:
URL: https://github.com/apache/camel/pull/12777

   Use a single Kafka consumer with an explicit `assign()` call to consume all 
cached entries from the beginning of the Kafka topic.
   
   # Description
   
   The existing implementation of `KafkaIdempotentRepository` may fail to 
restore the cached entries on application restart. It relies on 
`KafkaConsumer.subscribe()` and consumer group rebalancing, and it seeks the 
consumer to the beginning of the Kafka topic in the 
`ConsumerRebalanceListener.onPartitionsAssigned()` callback. However, 
rebalancing completes asynchronously, so the first few calls to 
`KafkaConsumer.poll()` may happen before the consumer has been moved to the 
beginning of the topic. In that case the consumer gets zero records and 
`KafkaIdempotentRepository` is initialized with an empty cache. With an empty 
cache, no deduplication takes place and the entire input is consumed again.
   
   The proposed implementation does not rely on Kafka consumer groups at all. 
It uses a single KafkaConsumer and explicitly assigns the topic partitions to 
it with `KafkaConsumer.assign()`. `KafkaIdempotentRepository` needs only one 
KafkaConsumer, so there is no load to balance across multiple consumers.
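
   The restore flow described above can be sketched roughly as follows. This 
is a toy model, not the code in this pull request: `ToyConsumer` is a 
hypothetical in-memory stand-in whose methods only mirror the idea of 
`KafkaConsumer.assign()`, `seekToBeginning()`, `endOffsets()`, `position()` 
and `poll()`. The stop condition (read until the end offsets captured at 
startup are reached) is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class AssignRestoreSketch {

    /** Hypothetical in-memory stand-in for a consumer; NOT the real Kafka client API. */
    static final class ToyConsumer {
        private final List<List<String>> log;               // log.get(p) = keys in partition p
        private final List<Integer> assigned = new ArrayList<>();
        private final long[] positions;

        ToyConsumer(List<List<String>> log) {
            this.log = log;
            this.positions = new long[log.size()];
        }

        int partitionCount() {
            return log.size();
        }

        // Mirrors the idea of assign(): partitions are taken directly, with no rebalance.
        void assign(List<Integer> partitions) {
            assigned.clear();
            assigned.addAll(partitions);
            for (int p : assigned) {
                positions[p] = log.get(p).size();           // toy default: start at the end
            }
        }

        // Mirrors the idea of seekToBeginning(): rewind every assigned partition.
        void seekToBeginning() {
            for (int p : assigned) {
                positions[p] = 0;
            }
        }

        // Mirrors the idea of endOffsets(): the offset one past the last record.
        long endOffset(int partition) {
            return log.get(partition).size();
        }

        long position(int partition) {
            return positions[partition];
        }

        // Returns at most maxRecords keys from the assigned partitions.
        List<String> poll(int maxRecords) {
            List<String> batch = new ArrayList<>();
            for (int p : assigned) {
                while (batch.size() < maxRecords && positions[p] < log.get(p).size()) {
                    batch.add(log.get(p).get((int) positions[p]));
                    positions[p]++;
                }
            }
            return batch;
        }
    }

    /** Reads every partition from offset 0 up to the end offsets seen at startup. */
    static Set<String> restoreCache(ToyConsumer consumer) {
        List<Integer> all = new ArrayList<>();
        for (int p = 0; p < consumer.partitionCount(); p++) {
            all.add(p);
        }
        consumer.assign(all);          // synchronous, so there is no callback to wait for
        consumer.seekToBeginning();

        long[] end = new long[all.size()];
        for (int p : all) {
            end[p] = consumer.endOffset(p);
        }

        Set<String> cache = new HashSet<>();
        while (!caughtUp(consumer, all, end)) {
            cache.addAll(consumer.poll(2));   // small batch size to force several polls
        }
        return cache;
    }

    private static boolean caughtUp(ToyConsumer consumer, List<Integer> partitions, long[] end) {
        for (int p : partitions) {
            if (consumer.position(p) < end[p]) {
                return false;
            }
        }
        return true;
    }
}
```

   Because the assignment and the seek both complete before the first poll, 
the loop cannot observe an "empty" topic merely because a rebalance has not 
finished yet.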
   
   The proposed implementation does not need a Kafka consumer `groupId`. The 
`groupId` parameter is still accepted by the public API but is ignored. All 
constructors and methods dealing with `groupId` are marked as deprecated.
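
   The compatibility pattern described above might look like the sketch 
below. `GroupIdCompatSketch` and its members are hypothetical names used only 
for illustration. They are not the actual `KafkaIdempotentRepository` 
signatures.

```java
final class GroupIdCompatSketch {

    private final String topic;
    private final String bootstrapServers;

    GroupIdCompatSketch(String topic, String bootstrapServers) {
        this.topic = topic;
        this.bootstrapServers = bootstrapServers;
    }

    /**
     * @deprecated groupId is accepted for source compatibility but ignored.
     */
    @Deprecated
    GroupIdCompatSketch(String topic, String bootstrapServers, String groupId) {
        this(topic, bootstrapServers);   // groupId is intentionally dropped
    }

    /**
     * @deprecated groupId is no longer used; this setter does nothing.
     */
    @Deprecated
    void setGroupId(String groupId) {
        // Intentionally empty: kept only so that existing callers still compile.
    }

    String describe() {
        return topic + "@" + bootstrapServers;
    }
}
```

   Existing callers that pass a `groupId` keep compiling and get a 
deprecation warning, while the resulting object behaves the same as one 
created without it.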
   
   There is no automated test for this change, because the problem can be 
reproduced only by restarting an application while the Kafka topic already 
holds some cached entries. I have tested this fix manually in a real 
environment. The existing automated tests pass.
   
   # Target
   
   main
   
   # Tracking
   https://issues.apache.org/jira/browse/CAMEL-20218
   
   # Apache Camel coding standards and style
   
   - [x] I checked that each commit in the pull request has a meaningful 
subject line and body.
   
   - [x] I have run `mvn clean install -DskipTests` locally and I have 
committed all auto-generated changes.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
