C0urante opened a new pull request, #16757:
URL: https://github.com/apache/kafka/pull/16757

   We're still seeing some flaky test failures for the 
`OffsetsApiIntegrationTest` suite, but in a much smaller subset of cases:
   - testAlterSinkConnectorOffsetsDifferentKafkaClusterTargeted: [1](https://ge.apache.org/s/ydrj6vsi2t5mw/tests/task/:connect:runtime:test/details/org.apache.kafka.connect.integration.OffsetsApiIntegrationTest/testAlterSinkConnectorOffsetsDifferentKafkaClusterTargeted()?top-execution=1), [2](https://ge.apache.org/s/ndddgm7c6ma3c/tests/task/:connect:runtime:test/details/org.apache.kafka.connect.integration.OffsetsApiIntegrationTest/testAlterSinkConnectorOffsetsDifferentKafkaClusterTargeted()?top-execution=1)
   - testResetSinkConnectorOffsetsDifferentKafkaClusterTargeted: [1](https://ge.apache.org/s/ydrj6vsi2t5mw/tests/task/:connect:runtime:test/details/org.apache.kafka.connect.integration.OffsetsApiIntegrationTest/testResetSinkConnectorOffsetsDifferentKafkaClusterTargeted()?top-execution=1), [2](https://ge.apache.org/s/ndddgm7c6ma3c/tests/task/:connect:runtime:test/details/org.apache.kafka.connect.integration.OffsetsApiIntegrationTest/testResetSinkConnectorOffsetsDifferentKafkaClusterTargeted()?top-execution=1)
   - testGetSinkConnectorOffsetsDifferentKafkaClusterTargeted: [1](https://ge.apache.org/s/ydrj6vsi2t5mw/tests/task/:connect:runtime:test/details/org.apache.kafka.connect.integration.OffsetsApiIntegrationTest/testGetSinkConnectorOffsetsDifferentKafkaClusterTargeted()?top-execution=1)
   
   After examining log files, it looks like this is a genuine case of timeouts 
being too low; in the three failures I examined, the sink connector's consumer 
group was never able to form or handle offset commits because the separate 
Kafka cluster it targeted didn't have a group coordinator and was still 
creating the internal offsets topic.
   
   Instead of just increasing timeouts, I'd like to add an enhanced cluster 
readiness check for our `EmbeddedKafkaCluster` class that's automatically 
performed on startup. This way, not only do we give the test cases above more 
time to run (assuming that a significant amount of their runtime is currently 
taken up by bringing up the separate Kafka cluster), we also have better 
insight into whether future failures are caused by broker startup issues or 
something else.
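
As a rough illustration of the idea (not the code in this PR), a startup readiness check could poll a set of named conditions until they all pass or a deadline expires, and report which condition was still failing. The class `ClusterReadiness`, the method `awaitReady`, and the probe names used with it are hypothetical; a real check would presumably back each probe with a query against the brokers, such as whether a group coordinator is available and whether the internal offsets topic has been created.

```java
import java.time.Duration;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical helper, for illustration only; not part of EmbeddedKafkaCluster.
final class ClusterReadiness {

    private ClusterReadiness() {
    }

    /**
     * Repeatedly evaluates every probe until all of them pass or the timeout elapses.
     *
     * @return null if every probe passed, otherwise the name of the first probe
     *         that was still failing on the last attempt
     */
    static String awaitReady(Map<String, BooleanSupplier> probes,
                             Duration timeout,
                             Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            String failing = null;
            for (Map.Entry<String, BooleanSupplier> probe : probes.entrySet()) {
                if (!probe.getValue().getAsBoolean()) {
                    failing = probe.getKey();
                    break;
                }
            }
            if (failing == null) {
                return null; // all probes passed
            }
            if (System.nanoTime() - deadline >= 0) {
                return failing; // timed out; tell the caller what was not ready
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for readiness", e);
            }
        }
    }
}
```

Returning the name of the failing probe is what would let a test failure say "the group coordinator never became available" rather than surfacing as a generic timeout later in the test.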
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   

