xianyouQ commented on issue #10824: URL: https://github.com/apache/iceberg/issues/10824#issuecomment-2264412217
"increase the chances of rewriting the same dataset multiple times"

As [RussellSpitzer](https://github.com/RussellSpitzer) said, the rewrite command with `from-snapshot` would ignore files that have already been compacted. I think some repeated rewriting is unavoidable for frequently modified tables: even after a dataset has been compacted, later updates to that data make the associated delete files grow larger and larger, so the dataset eventually needs to be merged again. In our use case, we would set `delete-file-threshold` to a suitable value.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org