dramaticlly opened a new pull request, #14287: URL: https://github.com/apache/iceberg/pull/14287
This patch adds `retainOrphanedDataFiles` to the expire snapshots interface so that orphaned data files can be retained, and updates both cleanup strategies accordingly. Most of the time, we want to remove such data files as part of the file cleanup that follows snapshot expiration, since those files are no longer referenced by any active snapshot. However, when the underlying data files are shared with or shallow-copied to another table, or when Parquet files were added by the `add_files` procedure (https://iceberg.apache.org/docs/latest/spark-procedures/#add_files), we might want to keep the data files. This option is disabled by default.

Can you help take a look? @stevenzwu @amogh-jahagirdar
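To illustrate the intended behavior, here is a minimal toy model in Python. It is not the Iceberg implementation or API: the `Snapshot` class, the `files_to_delete` function, and the `retain_orphaned_data_files` parameter are hypothetical names chosen for this sketch. It also assumes that the option only affects data files, so metadata files that are no longer referenced are still cleaned up.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Snapshot:
    """Toy stand-in for a table snapshot (hypothetical, not an Iceberg class)."""
    snapshot_id: int
    data_files: FrozenSet[str]
    metadata_files: FrozenSet[str]  # e.g. manifests and manifest lists


def files_to_delete(
    expired: Iterable[Snapshot],
    retained: Iterable[Snapshot],
    retain_orphaned_data_files: bool = False,  # disabled by default, as in the PR
) -> Set[str]:
    """Return files referenced only by expired snapshots.

    Assumption for this sketch: when retain_orphaned_data_files is True,
    orphaned data files are kept and only orphaned metadata files are removed.
    """
    expired = list(expired)
    retained = list(retained)

    # Files still reachable from a snapshot that survives expiration.
    live_data = set().union(*(s.data_files for s in retained))
    live_meta = set().union(*(s.metadata_files for s in retained))

    # Files reachable only from expired snapshots are orphaned.
    orphaned_data = set().union(*(s.data_files for s in expired)) - live_data
    orphaned_meta = set().union(*(s.metadata_files for s in expired)) - live_meta

    if retain_orphaned_data_files:
        return orphaned_meta
    return orphaned_data | orphaned_meta


if __name__ == "__main__":
    expired = [Snapshot(1, frozenset({"a.parquet", "b.parquet"}), frozenset({"m1.avro"}))]
    retained = [Snapshot(2, frozenset({"b.parquet", "c.parquet"}), frozenset({"m2.avro"}))]

    # Default: the orphaned data file a.parquet is removed with the metadata.
    print(sorted(files_to_delete(expired, retained)))
    # With the option on: a.parquet is kept, e.g. because another table shares it.
    print(sorted(files_to_delete(expired, retained, retain_orphaned_data_files=True)))
```

In this model `b.parquet` is never deleted because a retained snapshot still references it. The option only changes what happens to `a.parquet`, which is the shared or externally added file case described above.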
