sqd opened a new issue, #13568: URL: https://github.com/apache/iceberg/issues/13568
### Apache Iceberg version

1.9.1 (latest release)

### Query engine

None

### Please describe the bug 🐞

When expiring snapshots through the standard Iceberg API, some data files can be erroneously deleted, leaving the table in a corrupted state.

The bug is caused by a combination of flawed logic:

1. [This line in org.apache.iceberg.IncrementalFileCleanup](https://github.com/apache/iceberg/blob/apache-iceberg-1.9.1/core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java#L84). When there are multiple refs, it is not guaranteed to return the latest ref.
2. [These lines in org.apache.iceberg.RemoveSnapshots](https://github.com/apache/iceberg/blob/apache-iceberg-1.9.1/core/src/main/java/org/apache/iceberg/RemoveSnapshots.java#L364-L389). The IncrementalFileCleanup strategy is chosen when the current snapshot **AFTER** expiration (which has already been committed at this point) has only one ref, even though the snapshot **BEFORE** expiration could have had multiple refs.
3. When point 1 returns an old snapshot (a tag or branch), an expired manifest created after that snapshot that (a) added files and (b) has not been carried over to the latest snapshot (for example, because one of the referenced files was deleted) [will be erroneously classified as a manifest to revert](https://github.com/apache/iceberg/blob/apache-iceberg-1.9.1/core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java#L231-L243). This happens because [the ancestor snapshot set is calculated incorrectly](https://github.com/apache/iceberg/blob/apache-iceberg-1.9.1/core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java#L89-L96), as a consequence of points 1 and 2.
4. [All the data files referenced in that manifest are then deleted](https://github.com/apache/iceberg/blob/apache-iceberg-1.9.1/core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java#L305-L320), which is wrong.

I have attached a code snippet to reproduce this bug.
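
To illustrate why the choice of ref matters for the ancestor set, here is a minimal toy model in plain Java. It is not Iceberg code and is separate from the attached reproduction: the class `RefSelectionSketch`, its methods, the snapshot IDs, and the ref names are all hypothetical, and the "revert" rule is a deliberate simplification of the linked logic. It assumes a linear history 1 <- 2 <- 3 <- 4 with `main` at snapshot 4 and a tag at snapshot 1.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are Iceberg classes or methods.
class RefSelectionSketch {

  // Linear history 1 <- 2 <- 3 <- 4, stored as child ID -> parent ID.
  static Map<Long, Long> lineage() {
    Map<Long, Long> parentOf = new LinkedHashMap<>();
    parentOf.put(2L, 1L);
    parentOf.put(3L, 2L);
    parentOf.put(4L, 3L);
    return parentOf;
  }

  // Walks parent pointers from startId and collects every snapshot ID on that lineage.
  static Set<Long> ancestorIds(long startId, Map<Long, Long> parentOf) {
    Set<Long> ids = new LinkedHashSet<>();
    Long current = startId;
    while (current != null) {
      ids.add(current);
      current = parentOf.get(current); // null once the root snapshot is reached
    }
    return ids;
  }

  // Hypothetical stand-in for "take the first ref name from the key set".
  // The result depends entirely on the map's iteration order.
  static String firstRef(Map<String, Long> refs) {
    return refs.keySet().iterator().next();
  }

  // Simplified rule: a manifest written by a snapshot outside the ancestor set
  // is treated as belonging to a rolled-back commit, so its added files get deleted.
  static boolean classifiedForRevert(long manifestSnapshotId, Set<Long> ancestors) {
    return !ancestors.contains(manifestSnapshotId);
  }

  public static void main(String[] args) {
    // Insertion order is chosen so that the tag happens to come first.
    Map<String, Long> refs = new LinkedHashMap<>();
    refs.put("old-tag", 1L);
    refs.put("main", 4L);

    long manifestSnapshotId = 2L; // manifest added by expired snapshot 2

    Set<Long> fromFirstRef = ancestorIds(refs.get(firstRef(refs)), lineage());
    Set<Long> fromMain = ancestorIds(refs.get("main"), lineage());

    System.out.println("first ref: " + fromFirstRef
        + " -> revert=" + classifiedForRevert(manifestSnapshotId, fromFirstRef));
    System.out.println("main:      " + fromMain
        + " -> revert=" + classifiedForRevert(manifestSnapshotId, fromMain));
  }
}
```

In this sketch, starting from the tag yields an ancestor set containing only snapshot 1, so the manifest written by snapshot 2 is flagged for revert; starting from `main` yields all four snapshots and the manifest is left alone. A `LinkedHashMap` is used here only to make the ordering reproducible; with a map whose iteration order is unspecified, which ref comes first can vary between runs or setups.
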
I was able to reproduce this on Iceberg 1.6.1 and 1.9.1, running the x64 builds of Temurin 11.0.27, 17.0.15, and 21.0.7 on an M1 MacBook. However, note that point 1 is nondeterministic, so the bug is not guaranteed to reproduce on every setup.

[reproduce.txt](https://github.com/user-attachments/files/21259782/reproduce.txt)

### Willingness to contribute

- [ ] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time