amogh-jahagirdar commented on issue #6800: URL: https://github.com/apache/iceberg/issues/6800#issuecomment-1426951648
Right, I remember seeing this during testing while working on some changes to the expire snapshots behavior. Ultimately, though, I think this is expected as the default behavior: the metadata file is the root of the metadata tree and the source of truth for snapshots and for the associated manifest list and manifest files that get cleaned up during expire snapshots. From this angle, the `ExpireSnapshots` API should focus only on cleaning up snapshots plus manifest list/manifest files based on a single metadata file. `RemoveOrphanFiles` should likewise focus only on removing the invalid files themselves and should not get into the business of rewriting metadata.

**For streaming use cases, would it make more sense to have some kind of "retain last n" metadata log entries table property?** That way the number of metadata log entries is always bounded, and you can configure it as needed.

cc @stevenzwu, who's more familiar with the Flink side.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
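To make the "retain last n" idea from the comment concrete, here is a minimal sketch of a bounded metadata log. It is a toy model, not Iceberg code: the `MetadataLog` class, its `retain_last` parameter, and the file names are all hypothetical and exist only for illustration.

```python
from collections import deque
from typing import Deque, List, Optional


class MetadataLog:
    """Toy model (hypothetical) of a metadata log bounded to the last n entries."""

    def __init__(self, retain_last: int) -> None:
        if retain_last < 1:
            raise ValueError("retain_last must be at least 1")
        self.retain_last = retain_last
        self.current: Optional[str] = None
        # Previous metadata files, oldest first.
        self.previous: Deque[str] = deque()

    def commit(self, new_metadata_file: str) -> List[str]:
        """Record a new metadata file and return the entries that fell out of the log."""
        if self.current is not None:
            self.previous.append(self.current)
        self.current = new_metadata_file

        expired: List[str] = []
        # Trim from the oldest end until the log is within the bound.
        while len(self.previous) > self.retain_last:
            expired.append(self.previous.popleft())
        return expired


log = MetadataLog(retain_last=2)
dropped: List[str] = []
for version in range(1, 6):
    dropped.extend(log.commit(f"v{version}.metadata.json"))
```

With this bound, each commit past the limit drops exactly one old entry, so the log never grows regardless of how often a streaming job commits. Iceberg's documented table properties `write.metadata.previous-versions-max` and `write.metadata.delete-after-commit.enabled` look like they address a similar need; check the configuration docs for your version before relying on them.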