Lianet Magrans created KAFKA-20165:
--------------------------------------

             Summary: Consumer poll may fail with unrecoverable KafkaException 
if topic not in metadata when fetching committed offsets
                 Key: KAFKA-20165
                 URL: https://issues.apache.org/jira/browse/KAFKA-20165
             Project: Kafka
          Issue Type: Bug
          Components: clients, consumer
    Affects Versions: 4.2.0
            Reporter: Lianet Magrans
            Assignee: Lianet Magrans
             Fix For: 4.2.1


The consumer wraps unknown topic errors in a KafkaException when handling 
responses to OffsetFetch requests. As a result, if the topic is unavailable 
while initializing positions from committed offsets on poll, the poll loop 
breaks. 

The expectation is that the consumer keeps retrying (these are retriable 
exceptions) until the poll timeout expires.
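
The expected behavior can be illustrated with a minimal, standalone sketch. 
It does not use the Kafka client code: RetriableError, fetchWithRetry, and the 
backoff parameter are hypothetical names chosen for illustration only. It 
shows a retriable failure being retried until a deadline instead of being 
rethrown to the caller.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Supplier;

public class RetryUntilTimeoutSketch {

    /** Hypothetical stand-in for a retriable error such as an unknown topic ID. */
    static class RetriableError extends RuntimeException {
        RetriableError(String message) {
            super(message);
        }
    }

    /**
     * Calls fetchCommitted until it succeeds or the timeout expires.
     * Retriable errors are retried after a short backoff; any other
     * exception propagates to the caller. Returns empty if the timeout
     * expired (or the thread was interrupted) before a successful fetch.
     */
    static Optional<Long> fetchWithRetry(Supplier<Long> fetchCommitted,
                                         Duration timeout,
                                         Duration backoff) {
        long deadlineNanos = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return Optional.of(fetchCommitted.get());
            } catch (RetriableError e) {
                // Give up quietly once the deadline has passed; the caller
                // can try again on its next poll.
                if (System.nanoTime() - deadlineNanos >= 0) {
                    return Optional.empty();
                }
                try {
                    Thread.sleep(backoff.toMillis());
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
    }
}
```

In this sketch a timeout yields an empty result rather than an exception, so 
a later call can pick up where the previous one left off.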

It seems to me that, at some point, unknown topic errors were also used when 
authorization was missing, so at that time it made sense to treat an unknown 
topic error as fatal. But auth errors have their own path (the broker 
returns topic_auth_exception), already handled as fatal on the client side. 

Even though the logic that treats unknown topics as fatal is in both consumers, 
it seems the issue would happen only on the new consumer when using topic IDs 
(AK 4.2 only), because:
 * the classic consumer fails fatally only on UNKNOWN_TOPIC_OR_PARTITION, and 
the broker doesn't seem to return this error on the OffsetFetch path, so the 
classic consumer's fatal handling of this retriable error is never triggered
 * the async consumer using topic IDs handles both 
UNKNOWN_TOPIC_ID and UNKNOWN_TOPIC_OR_PARTITION, and the broker does return 
UNKNOWN_TOPIC_ID on OffsetFetch if the requested topic ID is not available in 
metadata 
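
The difference between the reported and the expected handling in the async 
consumer can be summarized in a small standalone sketch. The ErrorCode and 
Action enums and both methods are hypothetical and are not the client's 
actual code; TOPIC_AUTHORIZATION_FAILED is assumed here to be the error code 
behind the auth path mentioned above.

```java
public class OffsetFetchErrorSketch {

    /** Hypothetical subset of error codes relevant to this issue. */
    enum ErrorCode { NONE, UNKNOWN_TOPIC_OR_PARTITION, UNKNOWN_TOPIC_ID, TOPIC_AUTHORIZATION_FAILED }

    /** What the client does with an OffsetFetch response carrying that code. */
    enum Action { COMPLETE, RETRY, FAIL }

    /** Handling as described in the report: unknown topic errors are fatal. */
    static Action reportedHandling(ErrorCode error) {
        switch (error) {
            case NONE:
                return Action.COMPLETE;
            case UNKNOWN_TOPIC_ID:
            case UNKNOWN_TOPIC_OR_PARTITION:
            case TOPIC_AUTHORIZATION_FAILED:
            default:
                return Action.FAIL;
        }
    }

    /** Expected handling: unknown topic errors are retried, auth errors stay fatal. */
    static Action expectedHandling(ErrorCode error) {
        switch (error) {
            case NONE:
                return Action.COMPLETE;
            case UNKNOWN_TOPIC_ID:
            case UNKNOWN_TOPIC_OR_PARTITION:
                return Action.RETRY;
            case TOPIC_AUTHORIZATION_FAILED:
            default:
                return Action.FAIL;
        }
    }
}
```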

Classic consumer handling unknown topic on fetch offsets as fatal: 
[https://github.com/apache/kafka/blob/ee72f90742088b401af7572c7cb499394c0521f7/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L1555]
 

Same handling in the new consumer: 
[https://github.com/apache/kafka/blob/ee72f90742088b401af7572c7cb499394c0521f7/clients/src/main/java/org/apache/kafka/clients/consumer/internals/CommitRequestManager.java#L1156]
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
