fengguangyuan commented on issue #9741: URL: https://github.com/apache/iceberg/issues/9741#issuecomment-1951823635
Hi there. I believe this is a safeguard for the correctness of the existing data rather than a bug.

> Basic logic of parallel writes: writers may read the same data, but they must never commit metadata based on the same snapshot.

From the line `at org.apache.iceberg.BaseOverwriteFiles.apply(BaseOverwriteFiles.java:31) ~[iceberg-spark3-runtime.jar:?]` in the stack trace, we can tell that the thread was overwriting files and failed because delete files it relied on were missing.

Consider Compact and Overwrite tasks running in parallel (the Expire task does nothing here). They may both hold the same snapshot, including the same view of the delete files, while doing their work. If a Compact task commits before an Overwrite task calls the internal method to commit its metadata, the Overwrite task is validated against the latest snapshot, where the old delete files are no longer visible, and it fails with a `ValidationException` during the commit.

I hope this explains the key points of how parallel commits work. :)
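To make the ordering concrete, here is a minimal toy model of the scenario described above. It is only a sketch, not Iceberg's real code: `ToyTable`, `Snapshot`, `commit`, `run_scenario`, and the file name `delete-1` are all hypothetical, and the `ValidationException` class is a stand-in for the exception named in the stack trace. The one assumption it encodes is that a commit is validated against the latest snapshot rather than the snapshot the writer started from.

```python
from dataclasses import dataclass


class ValidationException(Exception):
    """Toy stand-in for the exception type named in the stack trace."""


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    delete_files: frozenset


class ToyTable:
    """Hypothetical in-memory table that only tracks its current snapshot."""

    def __init__(self, delete_files):
        self.current = Snapshot(1, frozenset(delete_files))

    def commit(self, referenced_deletes, resulting_deletes):
        # Assumption: validation runs against the latest snapshot,
        # not the snapshot the writer planned from.
        latest = self.current
        missing = frozenset(referenced_deletes) - latest.delete_files
        if missing:
            raise ValidationException(
                f"missing delete files in snapshot {latest.snapshot_id}: {sorted(missing)}"
            )
        self.current = Snapshot(latest.snapshot_id + 1, frozenset(resulting_deletes))
        return self.current


def run_scenario():
    table = ToyTable({"delete-1"})

    # Both tasks plan their work from the same snapshot.
    compact_view = table.current
    overwrite_view = table.current

    # The Compact task commits first; its new snapshot no longer has delete-1.
    table.commit(compact_view.delete_files, resulting_deletes=set())

    # The Overwrite task still references delete-1 from its stale view.
    try:
        table.commit(overwrite_view.delete_files, resulting_deletes=set())
    except ValidationException as exc:
        return table.current.snapshot_id, str(exc)
    return table.current.snapshot_id, None
```

In this toy model the second commit is rejected and the table stays on the snapshot written by the Compact task. If the Overwrite task re-planned from `table.current` before committing, it would no longer reference `delete-1` and the validation would pass.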