zstraw commented on issue #4550: URL: https://github.com/apache/iceberg/issues/4550#issuecomment-1340655807
> I have solved this problem. Thank you. My problem mainly occurred when InMemoryLockManager released the lock's heartbeat and reported a NullPointerException; I rewrote InMemoryLockManager to solve it.
>
> On December 6, 2022, ***@***.*** wrote:
>
> > After digging into the Iceberg code and the logs, I can reproduce it locally in a debugger. The scenario may happen during Flink cancellation. IcebergFileCommitter is about to commit files. In the step that renames metadata.json (org.apache.iceberg.hadoop.HadoopTableOperations#renameToFinal), org.apache.hadoop.ipc.Client.call encounters an InterruptedIOException. I suspect it comes from Flink task cancellation. Meanwhile, HDFS has renamed the metadata.json file successfully. After the rename fails, it is supposed to retry, but the thread encounters an InterruptedException while sleeping (org.apache.iceberg.util.Tasks#runTaskWithRetry). It then throws a RuntimeException, and the version-hint is not updated. The RuntimeException leads to a rollback in org.apache.iceberg.BaseTransaction (#cleanUpOnCommitFailure), which deletes the manifest list (snap-XXX).

But I think this problem is still a bug in Iceberg's commit procedure. Several tasks in our environment have run into lost snap-XXX.avro files.
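
To make the quoted failure sequence easier to follow, here is a minimal, self-contained Java sketch of it. It is only a simulation under stated assumptions and does not call any Iceberg, Hadoop, or Flink API: the class `InterruptedCommitSketch`, its fields, and its methods are hypothetical stand-ins, and setting the interrupt flag inside `renameToFinal` is an assumption used to model a pending task cancellation.

```java
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the sequence described above; not Iceberg code.
class InterruptedCommitSketch {
    boolean metadataRenamed = false;     // stands in for the renamed metadata.json
    boolean versionHintUpdated = false;  // stands in for the version-hint update
    boolean manifestListExists = true;   // stands in for snap-XXX.avro
    final List<String> events = new ArrayList<>();

    // The rename takes effect on the file system, but the client call is
    // interrupted, so the caller sees a failure.
    void renameToFinal() throws InterruptedIOException {
        metadataRenamed = true;
        events.add("rename applied on the file system");
        // Assumption: the cancellation leaves the thread's interrupt flag set.
        Thread.currentThread().interrupt();
        throw new InterruptedIOException("client call interrupted");
    }

    // Retry loop: the back-off sleep is itself interrupted, which turns the
    // retry into a RuntimeException before a second attempt can run.
    void commitWithRetry(int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                renameToFinal();
                versionHintUpdated = true;
                return;
            } catch (InterruptedIOException e) {
                events.add("rename reported as failed");
                try {
                    Thread.sleep(10); // throws at once because the flag is set
                } catch (InterruptedException ie) {
                    events.add("retry sleep interrupted");
                    throw new RuntimeException(ie);
                }
            }
        }
        throw new RuntimeException("retries exhausted");
    }

    // The caller treats the RuntimeException as a failed commit and cleans up,
    // deleting a manifest list that the renamed metadata still references.
    void commitTransaction() {
        try {
            commitWithRetry(3);
        } catch (RuntimeException e) {
            manifestListExists = false;
            events.add("cleanup deleted manifest list");
        }
    }

    static InterruptedCommitSketch simulate() {
        InterruptedCommitSketch sketch = new InterruptedCommitSketch();
        sketch.commitTransaction();
        return sketch;
    }

    public static void main(String[] args) {
        InterruptedCommitSketch sketch = simulate();
        sketch.events.forEach(System.out::println);
        System.out.println("metadataRenamed=" + sketch.metadataRenamed
            + " versionHintUpdated=" + sketch.versionHintUpdated
            + " manifestListExists=" + sketch.manifestListExists);
    }
}
```

In this sketch the run ends with the metadata file renamed, the version hint not updated, and the manifest list deleted, which matches the inconsistent state reported above. Whether the real code path sets the interrupt flag in exactly this way is not confirmed here.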