[ https://issues.apache.org/jira/browse/KAFKA-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hong Yi Zhang updated KAFKA-13674:
----------------------------------
Description:
This appears to be similar to
[KAFKA-13391|https://issues.apache.org/jira/browse/KAFKA-13391]: on z/OS, the
fsync of the parent directory fails. Kafka 3.0.0 fails to start on z/OS with
the following error:
{panel}
ERROR Error while writing to checkpoint file /xxxx/recovery-point-offset-checkpoint (kafka.server.LogDirFailureChannel)
java.io.IOException: EDC5121I Invalid argument.
at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:110)
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:400)
at org.apache.kafka.common.utils.Utils.flushDir(Utils.java:959)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:944)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:918)
at org.apache.kafka.server.common.CheckpointFile.write(CheckpointFile.java:98)
at kafka.server.checkpoints.CheckpointFileWithFailureHandler.write(CheckpointFileWithFailureHandler.scala:37)
at kafka.server.checkpoints.OffsetCheckpointFile.write(OffsetCheckpointFile.scala:68)
at kafka.log.LogManager.$anonfun$checkpointRecoveryOffsetsInDir$1(LogManager.scala:679)
at kafka.log.LogManager.$anonfun$checkpointRecoveryOffsetsInDir$1$adapted(LogManager.scala:675)
at kafka.log.LogManager$$Lambda$553/0x00000000067b3f30.apply(Unknown Source)
at scala.Option.foreach(Option.scala:437)
at kafka.log.LogManager.checkpointRecoveryOffsetsInDir(LogManager.scala:675)
at kafka.log.LogManager.$anonfun$shutdown$10(LogManager.scala:546)
at kafka.log.LogManager.$anonfun$shutdown$10$adapted(LogManager.scala:539)
at kafka.log.LogManager$$Lambda$548/0x00000000067b1230.apply(Unknown Source)
at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
at kafka.log.LogManager$$Lambda$549/0x00000000067b1b30.apply(Unknown Source)
at scala.collection.mutable.HashMap$Node.foreachEntry(HashMap.scala:633)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:499)
at kafka.log.LogManager.shutdown(LogManager.scala:539)
at kafka.server.KafkaServer.$anonfun$shutdown$19(KafkaServer.scala:765)
at kafka.server.KafkaServer$$Lambda$528/0x00000000c23c3630.apply$mcV$sp(Unknown Source)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68)
at kafka.server.KafkaServer.shutdown(KafkaServer.scala:765)
at kafka.server.KafkaServer.startup(KafkaServer.scala:466)
at kafka.Kafka$.main(Kafka.scala:109)
at kafka.Kafka.main(Kafka.scala)
{panel}
This problem is a blocker for Kafka 3.0.0 and above on z/OS. We suspect that
the attempt to fsync the directory should simply be skipped on z/OS. When we
skipped the fsync code, Kafka 3.0.0 started successfully.
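To illustrate the kind of workaround we tried, here is a minimal sketch of a
directory flush that is skipped on z/OS. It is not the actual Kafka code or a
proposed patch: the class DirFlushSketch and the helpers isZos and
flushDirIfSupported are hypothetical names, and detecting the platform through
the "os.name" system property (assumed to contain "z/OS" on that platform) is
an assumption on our side.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Locale;

// Hypothetical sketch only; names below do not exist in Kafka.
class DirFlushSketch {

    // Assumption: the JVM on z/OS reports an os.name containing "z/OS".
    static boolean isZos(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).contains("z/os");
    }

    // Returns true if the directory was fsynced, false if the flush was skipped.
    static boolean flushDirIfSupported(Path dir) throws IOException {
        if (dir == null || isZos(System.getProperty("os.name"))) {
            // On z/OS, force() on a directory channel fails with
            // "EDC5121I Invalid argument", so the flush is skipped.
            return false;
        }
        // Open the directory read-only and force its metadata to disk.
        try (FileChannel channel = FileChannel.open(dir, StandardOpenOption.READ)) {
            channel.force(true);
        }
        return true;
    }
}
```

With a guard of this kind, the atomic move of the checkpoint file still
happens, and only the following fsync of the parent directory is omitted
on z/OS.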
> Failure on ZOS due to IOException when attempting to fsync the parent
> directory
> -------------------------------------------------------------------------------
>
> Key: KAFKA-13674
> URL: https://issues.apache.org/jira/browse/KAFKA-13674
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 3.0.0
> Reporter: Hong Yi Zhang
> Priority: Major
>