[ 
https://issues.apache.org/jira/browse/KAFKA-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895107#comment-16895107
 ] 

Mattia Barbon commented on KAFKA-8726:
--------------------------------------

An excerpt of producer logs around the time of the issue:


{{WARN  [2019-07-29 09:42:02.209] [51263:kafka-producer-network-thread | 
com-prd-1] org.apache.kafka.clients.producer.internals.Sender: [Producer 
clientId=com-prd-1, transactionalId=com-prd-1] Got error produce response in 
correlation id 814422 on topic-partition topic-32, splitting and retrying 
(2147483647 attempts left). Error: MESSAGE_TOO_LARGE}}
{{WARN  [2019-07-29 09:42:02.350] [51263:kafka-producer-network-thread | 
com-prd-1] org.apache.kafka.clients.producer.internals.Sender: [Producer 
clientId=com-prd-1, transactionalId=com-prd-1] Got error produce response with 
correlation id 814424 on topic-partition topic-32, retrying (2147483646 
attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER}}
{{WARN  [2019-07-29 09:42:05.093] [51263:kafka-producer-network-thread | 
com-prd-1] processor.SubstreamWriterBase: Send exception (from callback)}}
{{ERROR [2019-07-29 09:42:05.093] [51263:kafka-producer-network-thread | 
com-prd-1] org.apache.kafka.clients.producer.internals.Sender: [Producer 
clientId=com-prd-1, transactionalId=com-prd-1] Uncaught error in kafka producer 
I/O thread: }}
{{WARN  [2019-07-29 09:42:05.103] [51263:kafka-producer-network-thread | 
com-prd-1] processor.SubstreamWriterBase: Send exception (from callback)}}
{{! org.apache.kafka.common.errors.TimeoutException: Expiring 590 record(s) for 
topic-32:3035 ms has passed since batch creation}}

{{<removed a few hundred similar errors>}}

{{WARN  [2019-07-29 09:42:09.212] [51263:kafka-producer-network-thread | 
com-prd-1] org.apache.kafka.clients.producer.internals.Sender: [Producer 
clientId=com-prd-1, transactionalId=com-prd-1] Got error produce response with 
correlation id 814535 on topic-partition topic2-32, retrying (2147483646 
attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER}}

{{<snipped a few hundred OUT_OF_ORDER_SEQUENCE_NUMBER errors>}}

{{WARN  [2019-07-29 09:42:12.072] [51263:kafka-producer-network-thread | 
com-prd-1] org.apache.kafka.clients.producer.internals.Sender: [Producer 
clientId=com-prd-1, transactionalId=com-prd-1] Got error produce response with 
correlation id 814817 on topic-partition topic3-32, retrying (2147483618 
attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER}}
{{WARN  [2019-07-29 09:42:12.080] [51263:kafka-producer-network-thread | 
com-prd-1] processor.SubstreamWriterBase: Send exception (from callback)}}
{{! org.apache.kafka.common.errors.TimeoutException: Expiring 2297 record(s) 
for topic5-32:3000 ms has passed since batch creation}}

{{<removed a few hundred similar errors>}}

{{ERROR [2019-07-29 09:42:12.082] [51263:kafka-producer-network-thread | 
com-prd-1] org.apache.kafka.clients.producer.internals.Sender: [Producer 
clientId=com-prd-1, transactionalId=com-prd-1] Aborting producer batches due to 
fatal error}}
{{! org.apache.kafka.common.KafkaException: The client hasn't received 
acknowledgment for some previously sent messages and can no longer retry them. 
It isn't safe to continue.}}

{{<removed a few hundred similar errors>}}

> Producer can't abort a transaction after some send errors
> --------------------------------------------------------
>
>                 Key: KAFKA-8726
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8726
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, producer 
>    Affects Versions: 2.3.0
>            Reporter: Mattia Barbon
>            Priority: Major
>
> I am following the transactional producer example in 
> [https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html],
> and on KafkaException I call abortTransaction and retry.
>  
> In some cases, abortTransaction fails, with:
> ```
> org.apache.kafka.common.KafkaException: Cannot execute transactional method 
> because we are in an error state
> ```
> As far as I can tell, this is caused by:
> ```
> org.apache.kafka.common.KafkaException: The client hasn't received 
> acknowledgment for some previously sent messages and can no longer retry 
> them. It isn't safe to continue.
> ```
>  
> Since both are KafkaException, the example seems to imply they are retriable, 
> but they seem not to be. Ideally, I would expect abortTransaction to succeed 
> in this case (the broker will abort the transaction anyway because it can't 
> be committed), but at the very least, I would expect to have a way to 
> determine that the producer is unusable and it can't recover.
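
The retry pattern described above, with a failed abortTransaction treated as a sign that the producer is unusable, could be sketched roughly as follows. This is only an illustration under stated assumptions, not a confirmed fix for this issue: TxProducer is a hypothetical stand-in for the transactional methods of KafkaProducer so that the sketch compiles without the Kafka client library, and runWithRetry, factory, work and maxAttempts are names made up for this example.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

final class TransactionalRetry {

    /** Hypothetical stand-in for the transactional subset of KafkaProducer. */
    public interface TxProducer {
        void beginTransaction();
        void commitTransaction();
        void abortTransaction();
        void close();
    }

    /**
     * Runs `work` inside a transaction, retrying up to maxAttempts times.
     * If abortTransaction itself throws, the producer is assumed to be in a
     * fatal error state: it is closed and replaced via `factory`.
     * Returns the producer that committed successfully.
     */
    public static <P extends TxProducer> P runWithRetry(
            Supplier<P> factory, Consumer<P> work, int maxAttempts) {
        P producer = factory.get();
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                producer.beginTransaction();
                work.accept(producer);          // e.g. a series of send() calls
                producer.commitTransaction();
                return producer;
            } catch (RuntimeException sendOrCommitError) {
                last = sendOrCommitError;
                try {
                    producer.abortTransaction();
                } catch (RuntimeException abortError) {
                    // Assumption: a failed abort means the producer cannot recover.
                    producer.close();
                    producer = factory.get();
                }
            }
        }
        producer.close();
        throw new IllegalStateException(
                "transaction failed after " + maxAttempts + " attempts", last);
    }
}
```

With the real client, factory would be expected to construct a KafkaProducer and call initTransactions() on it before returning. The javadoc example also lists ProducerFencedException, OutOfOrderSequenceException and AuthorizationException as cases where the producer should be closed rather than retried; the sketch does not distinguish those.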



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
