[
https://issues.apache.org/jira/browse/KAFKA-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884601#comment-17884601
]
Yu-Lin Chen commented on KAFKA-17515:
-------------------------------------
Reopening this issue because slow stream task initialization has still
occurred over the last 5 days. ([Report
Link|https://ge.apache.org/scans/tests?search.startTimeMax=1727263640181&search.startTimeMin=1726761600000&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=org.apache.kafka.streams.integration.RestoreIntegrationTest&tests.test=shouldInvokeUserDefinedGlobalStateRestoreListener()],
5 of 115 runs on trunk took longer than 60s)
Flaky tests:
# Sep 25 2024 at 09:04:01 CST
([Link|https://ge.apache.org/s/g5hopxjvfwdey/tests/task/:streams:test/details/org.apache.kafka.streams.integration.RestoreIntegrationTest/shouldInvokeUserDefinedGlobalStateRestoreListener()?top-execution=1]):
62 sec for ks1 stream tasks initialization (44 sec for rebalancing)
# Sep 24 2024 at 21:46:08 CST
([Link|https://ge.apache.org/s/jphpbv3mlmvca/tests/task/:streams:test/details/org.apache.kafka.streams.integration.RestoreIntegrationTest/shouldInvokeUserDefinedGlobalStateRestoreListener()?top-execution=1]):
59 sec for ks1 stream tasks initialization (43 sec for rebalancing)
# Sep 23 2024 at 21:53:23 CST
([Link|https://ge.apache.org/s/bngb54dybplky/tests/task/:streams:test/details/org.apache.kafka.streams.integration.RestoreIntegrationTest/shouldInvokeUserDefinedGlobalStateRestoreListener()?top-execution=1]):
70 sec for ks1 stream tasks initialization (45 sec for rebalancing)
# Sep 23 2024 at 21:37:09 CST
([Link|https://ge.apache.org/s/crnlwfthyofpw/tests/task/:streams:test/details/org.apache.kafka.streams.integration.RestoreIntegrationTest/shouldInvokeUserDefinedGlobalStateRestoreListener()?top-execution=1]):
61 sec for ks1 stream tasks initialization (45 sec for rebalancing)
# Sep 23 2024 at 12:24:00 CST
([Link|https://ge.apache.org/s/2fe6eo4xauoac/tests/task/:streams:test/details/org.apache.kafka.streams.integration.RestoreIntegrationTest/shouldInvokeUserDefinedGlobalStateRestoreListener()?top-execution=1]):
63 sec for ks1 stream tasks initialization (49 sec for rebalancing)
I will create another PR to increase the timeout (currently 60 sec).
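For context, the timeout in question bounds a polling wait. The following is only a minimal sketch of such a wait, assuming a plain poll-and-sleep loop; the class name WaitSketch, the method waitUntil, and the 90-second value in the usage note are hypothetical illustrations and are not taken from RestoreIntegrationTest or from the planned PR.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, for illustration only: polls a condition until it holds or a timeout expires. */
public final class WaitSketch {
    private WaitSketch() { }

    /**
     * Returns true as soon as the condition holds, or false once timeoutMs has elapsed.
     * The condition is always checked at least once, even with a zero timeout.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        if (timeoutMs < 0 || pollMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be >= 0 and pollMs must be > 0");
        }
        // Use the monotonic clock so wall-clock adjustments cannot shorten or extend the wait.
        final long deadlineNanos = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            final long remainingMs = (deadlineNanos - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(pollMs, remainingMs));
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up instead of swallowing the interrupt.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for condition", e);
            }
        }
    }
}
```

With the runs listed above taking 59 to 70 sec, a 60 sec limit leaves no headroom. A call such as WaitSketch.waitUntil(check, 90_000L, 100L), where check is whatever condition the test polls, would cover all five observed durations; 90 sec is an assumed example value, not a proposal.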
> Fix flaky RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener
> ----------------------------------------------------------------------------------
>
> Key: KAFKA-17515
> URL: https://issues.apache.org/jira/browse/KAFKA-17515
> Project: Kafka
> Issue Type: Bug
> Components: streams, unit tests
> Reporter: Chia-Ping Tsai
> Assignee: Yu-Lin Chen
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: Reproduced screenshoot in my env (Loop 7).png
>
>
> {code:java}
> Stacktrace
> java.nio.file.DirectoryNotEmptyException: /tmp/shouldInvokeUserDefinedGlobalStateRestoreListenerH0u0n9foRY_peZu4FqeGHQ10111145955704739924-ks1/shouldInvokeUserDefinedGlobalStateRestoreListenerH0u0n9foRY_peZu4FqeGHQ/0_0
> at java.base/sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:289)
> at java.base/sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:109)
> at java.base/java.nio.file.Files.deleteIfExists(Files.java:1191)
> at org.apache.kafka.common.utils.Utils$1.postVisitDirectory(Utils.java:898)
> at org.apache.kafka.common.utils.Utils$1.postVisitDirectory(Utils.java:870)
> at java.base/java.nio.file.Files.walkFileTree(Files.java:2803)
> at java.base/java.nio.file.Files.walkFileTree(Files.java:2857)
> at org.apache.kafka.common.utils.Utils.delete(Utils.java:870)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.purgeLocalStreamsState(IntegrationTestUtils.java:266)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.purgeLocalStreamsState(IntegrationTestUtils.java:278)
> at org.apache.kafka.streams.integration.RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener(RestoreIntegrationTest.java:583)
> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> {code}