ggadon opened a new issue, #13651: URL: https://github.com/apache/iceberg/issues/13651
### Apache Iceberg version

None

### Query engine

Spark

### Please describe the bug 🐞

Hey team,

It seems that running snapshot expiration and `CREATE OR REPLACE TABLE` at the same time can corrupt table history. The following happens:

1. The base metadata, X, contains snapshots that are due for expiration.
2. The replace operation loads metadata X and uses it as its base.
3. Snapshot expiration runs, creates metadata X+1, and commits it successfully, deleting the old snapshots' data.
4. The replace operation finishes and commits successfully, with metadata that appears to be built on X rather than X+1.

It seems the replace operation ignores commit conflicts.

## Initial Observations

If I'm reading this correctly, the code that commits the replace transaction appears to ignore changes made to the base metadata, here:

https://github.com/apache/iceberg/blob/6bd6887db1f90674ca5e20e88cc95c5f92dcb050/core/src/main/java/org/apache/iceberg/BaseTransaction.java#L376-L380

In our scenario, `current` seems to include the history from before the cleanup, while `base`, after the lines above, is the clean metadata without the snapshots that were just expired. When the commit happens a few lines below, it seems to commit the new metadata with the old history in it, which causes the corruption.

Am I reading this correctly? What was the reasoning behind this logic? Is it only for efficiency, on the assumption that commit conflicts do not need to be handled in these cases because all data files are replaced?

## On a sidenote

I know there are active mailing list discussions around this ([thread](https://lists.apache.org/thread/d4hzd4cfvopvckcfw50orqksjzymd4lm)), and also a recently closed issue on this subject (with @RussellSpitzer's [comment](https://github.com/apache/iceberg/issues/12738#issuecomment-3009087235) about corrupted metadata).

Would love to hear your thoughts about possible next steps here.

Thanks!

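
To make the interleaving in steps 1 to 4 concrete, here is a minimal toy model of it. This is a sketch under assumptions, not Iceberg code: `ToyMetadata`, `ToyTable`, and `simulate` are hypothetical names, the metadata pointer is modeled as a plain compare-and-swap reference, and the "swap the base after refresh" step only mirrors the behavior as I understand it from the linked lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class ReplaceVsExpireSketch {

    /** Hypothetical, immutable stand-in for one table metadata file. */
    static final class ToyMetadata {
        final List<Long> snapshotIds;

        ToyMetadata(List<Long> snapshotIds) {
            this.snapshotIds = Collections.unmodifiableList(new ArrayList<>(snapshotIds));
        }
    }

    /** Hypothetical stand-in for the catalog's pointer to the current metadata. */
    static final class ToyTable {
        private final AtomicReference<ToyMetadata> pointer;

        ToyTable(ToyMetadata initial) {
            this.pointer = new AtomicReference<>(initial);
        }

        ToyMetadata refresh() {
            return pointer.get();
        }

        /** Atomic swap: succeeds only if {@code base} is still the current metadata. */
        boolean commit(ToyMetadata base, ToyMetadata updated) {
            return pointer.compareAndSet(base, updated);
        }
    }

    /** Runs the four steps in order and returns the snapshot IDs of the final metadata. */
    static List<Long> simulate() {
        // Step 1: metadata X holds snapshot 1 (due for expiration) and snapshot 2.
        ToyTable table = new ToyTable(new ToyMetadata(Arrays.asList(1L, 2L)));

        // Step 2: the replace loads X and stages new metadata that keeps X's history.
        ToyMetadata base = table.refresh();
        List<Long> stagedIds = new ArrayList<>(base.snapshotIds);
        stagedIds.add(3L);
        ToyMetadata staged = new ToyMetadata(stagedIds);

        // Step 3: expiration commits X+1 without snapshot 1 (its files are now gone).
        List<Long> keptIds = new ArrayList<>(base.snapshotIds);
        keptIds.remove(Long.valueOf(1L));
        if (!table.commit(base, new ToyMetadata(keptIds))) {
            throw new IllegalStateException("expiration should commit cleanly");
        }

        // Step 4: the replace refreshes, swaps its base to X+1, and commits the
        // metadata it staged from X without rebuilding it.
        ToyMetadata refreshed = table.refresh();
        if (base != refreshed) {
            base = refreshed;
        }
        if (!table.commit(base, staged)) {
            throw new IllegalStateException("replace commit was rejected");
        }

        return table.refresh().snapshotIds;
    }

    public static void main(String[] args) {
        // Prints [1, 2, 3]: snapshot 1 is back in the history after being expired.
        System.out.println(simulate());
    }
}
```

In this model the compare-and-swap itself never reports a conflict, because the base is swapped to the refreshed metadata right before the commit. The staged metadata still lists snapshot 1, which is the corruption described above.
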
### Willingness to contribute

- [x] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
