mjsax commented on code in PR #18587:
URL: https://github.com/apache/kafka/pull/18587#discussion_r1920894860
##########
clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java:
##########
@@ -1134,6 +1129,13 @@ private ClusterAndWaitTime waitOnMetadata(String topic,
Integer partition, long
return new ClusterAndWaitTime(cluster, elapsed);
}
+ private String getErrorMessage(Integer partitionsCount, String topic,
Integer partition, long maxWaitMs) {
+ return partitionsCount == null ?
+ String.format("Topic %s not present in metadata after %d ms.",
+ topic, maxWaitMs) :
+ String.format("Partition %d of topic %s with partition count %d is
not present in metadata after %d ms.",
+ partition, topic, partitionsCount, maxWaitMs);
+ }
Review Comment:
> Is there any value in including the partition value in the error message
if it's non-null?
Yes. `partition` is an input parameter to the method, and it's only `null`
if the caller does not care about the metadata of a specific partitions (as
pointed out by Andrew). So it's if not-null, the user cares and we should log
it.
If `partition != null`, it helps to identify the case of a non-existing
partition the user wants to write into. Assume a topic with 4 partitions, and
user specific partition 5 as explicit argument when calling `send()`. The
producer would try to learn about the leader broker for partition 5, but it
does not exist, and it would loop until `max.block.ms` is exhausted and finally
fail with: `Partition 5 of topic <FOO> with partition count 4 is not present in
metadata after XXX ms.` -- Then we can suspect that the topic may really only
have 4 partitions... W/o `Partition 5` it's totally unclear why the metadata
could not be found and why we hit `max.block.ms` on `send()`; the generic error
`Topic %s not present in metadata after %d ms.` does not help us to identify
this case.
Or did you mean, if `partition == null`? For this case, I agree it's not
really helpful to log `partition` (as the user does not care anyway), but it
think we don't log it anyway given the current code... (cf below)
> Seems to me that 3 messages would be best so that we never get a "null" as
an ugly insert. wdyt?
I believe the code is correct as-is. There is only two cases:
1. `partition == null`. If we get metadata successfully, it implies that
`partitionCount != null` and we just move on and don't throw
`TimeoutException`. If we cannot get any metadata for the topic, we stay with
`partitionCount == null` and use `Topic %s not present in metadata after %d
ms.` for the `TimeoutException`.
2. `partition != null`. For this case, if we get metadata successfully, we
have a `partitionCount`, but we might still fail, if we don't get the metadata
for the specified partition. For this case, we would use `Partition %d of topic
%s with partition count %d is not present in metadata after %d ms` as error
message. The other failure case is, that we don't get any metadata for the
topic at all, and thus `partitionCount == null` and we still log the right
thing. The fact that `partition != null` does not matter if `partitionCount ==
null`.
Hence, case (3) is not an error case... If we did get metadata
(`partitionCount != null`) and if the user does not care (`partition == null`),
we will never throw `TimeoutException`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]