[ https://issues.apache.org/jira/browse/KAFKA-16225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853823#comment-17853823 ]
Greg Harris commented on KAFKA-16225:
-------------------------------------
The PR for KAFKA-16234, [https://github.com/apache/kafka/pull/15335], appears
to have substantially decreased the flakiness in this test suite:
!Screenshot 2024-06-10 at 2.39.54 PM.png!
Thanks [~omnia_h_ibrahim] and all of the reviewers on that fix!
Some minor flakiness remains, with the following 4 errors in the past 28 days
on trunk:
{noformat}
org.opentest4j.AssertionFailedError: Timeout waiting for controller metadata propagating to brokers
at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
at kafka.utils.TestUtils$.ensureConsistentKRaftMetadata(TestUtils.scala:927)
at kafka.integration.KafkaServerTestHarness.ensureConsistentKRaftMetadata(KafkaServerTestHarness.scala:422)
at kafka.server.LogDirFailureTest.setUp(LogDirFailureTest.scala:60)
{noformat}
{noformat}
org.opentest4j.AssertionFailedError: Consumed 0 records before timeout instead of the expected 1 records
at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
at kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:927)
at kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:200)
at kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:72)
{noformat}
{noformat}
java.nio.file.FileAlreadyExistsException: /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/core/data/kafka-1608887280612041858
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.Files.newByteChannel(Files.java:361)
at java.nio.file.Files.createFile(Files.java:632)
at kafka.utils.TestUtils$.causeLogDirFailure(TestUtils.scala:1360)
at kafka.server.LogDirFailureTest.testProduceErrorsFromLogDirFailureOnLeader(LogDirFailureTest.scala:191)
at kafka.server.LogDirFailureTest.testProduceErrorFromFailureOnCheckpoint(LogDirFailureTest.scala:137)
{noformat}
{noformat}
java.nio.file.FileAlreadyExistsException: /home/jenkins/workspace/Kafka_kafka_trunk/core/data/kafka-13669229279446172937
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:94)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
at java.nio.file.Files.newByteChannel(Files.java:371)
at java.nio.file.Files.createFile(Files.java:648)
at kafka.utils.TestUtils$.causeLogDirFailure(TestUtils.scala:1360)
at kafka.server.LogDirFailureTest.testLogDirNotificationTimeout(LogDirFailureTest.scala:89)
{noformat}
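For reference, the last two traces both end in Files.createFile throwing FileAlreadyExistsException from inside causeLogDirFailure. The sketch below only illustrates that JDK behavior on a Unix-like file system; it is not the code of the Kafka helper, and it assumes (the traces do not confirm this) that the reported kafka-... path was still occupied by a directory when createFile ran. The class name, method name, and temp-directory prefix are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Kafka.
class CreateFileOverExistingDir {

    // Returns true if Files.createFile refuses a path that is already
    // occupied by a directory, false if the call unexpectedly succeeds.
    public static boolean createFileFailsOnExistingDir() {
        try {
            // Stand-in for a log directory that has not been removed yet
            // (assumption for illustration only).
            Path logDir = Files.createTempDirectory("kafka-logdir-demo");
            try {
                // createFile is create-new-or-fail: it never replaces an
                // existing entry, whether that entry is a file or a directory.
                Files.createFile(logDir);
                return false;
            } catch (FileAlreadyExistsException e) {
                return true;
            } finally {
                // Clean up whatever is left at the path.
                Files.deleteIfExists(logDir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("createFile rejected existing directory: "
                + createFileFailsOnExistingDir());
    }
}
```

If that assumption holds, the remaining flakiness could come from the ordering between removing the directory and creating the file, but that would need to be confirmed against the helper itself.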
> Flaky test suite LogDirFailureTest
> ----------------------------------
>
> Key: KAFKA-16225
> URL: https://issues.apache.org/jira/browse/KAFKA-16225
> Project: Kafka
> Issue Type: Bug
> Components: core, unit tests
> Reporter: Greg Harris
> Assignee: Omnia Ibrahim
> Priority: Major
> Labels: flaky-test
> Attachments: Screenshot 2024-06-10 at 2.39.54 PM.png
>
>
> I see this failure on trunk and in PR builds for multiple methods in this
> test suite:
> {noformat}
> org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
> at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
> at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
> at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
> at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
> at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:179)
> at kafka.utils.TestUtils$.causeLogDirFailure(TestUtils.scala:1715)
> at kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:186)
> at kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:70)
> {noformat}
> It appears that this assertion is failing:
> [https://github.com/apache/kafka/blob/f54975c33135140351c50370282e86c49c81bbdd/core/src/test/scala/unit/kafka/utils/TestUtils.scala#L1715]
> The other error that appears is this:
> {noformat}
> org.opentest4j.AssertionFailedError: Unexpected exception type thrown, expected: <java.util.concurrent.ExecutionException> but was: <java.lang.IllegalStateException>
> at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:67)
> at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:35)
> at org.junit.jupiter.api.Assertions.assertThrows(Assertions.java:3111)
> at kafka.server.LogDirFailureTest.testProduceErrorsFromLogDirFailureOnLeader(LogDirFailureTest.scala:164)
> at kafka.server.LogDirFailureTest.testProduceErrorFromFailureOnLogRoll(LogDirFailureTest.scala:64)
> {noformat}
> Failures appear to have started in this commit, but this does not indicate
> that this commit is at fault:
> [https://github.com/apache/kafka/tree/3d95a69a28c2d16e96618cfa9a1eb69180fb66ea]
>