shanielh commented on issue #12838: URL: https://github.com/apache/iceberg/issues/12838#issuecomment-2873680943
> Amoro tried to resolve this issue in [another way](https://github.com/apache/amoro/blob/master/amoro-ams/src/main/java/org/apache/amoro/server/utils/IcebergTableUtil.java#L148)
>
> It determines which delete files are dangling by comparing the delete files involved in the table scan with the delete files returned in the table entries table.
>
> It will incur higher resource consumption, but it can precisely identify all dangling delete files.

Using a scan is slow because, by the nature of Iceberg, it returns the same delete files over and over.

You should probably emulate the scan by going over the manifest files instead: index the minimum data sequence number per partition, then, for each partition, remove the delete files whose sequence number is lower than that minimum. It is a bit trickier than that, since partitioning in Iceberg can evolve.
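A minimal sketch of the manifest-based idea in the reply above, not Iceberg's or Amoro's implementation. `Entry` and `find_dangling_deletes` are hypothetical names standing in for live manifest entries; a real version would read them from the table's manifests. Keying the index by `(spec_id, partition)` is one assumed way to cope with partition evolution. The strict `<` comparison follows the comment and is conservative for equality deletes. The sketch does not handle equality deletes written under an unpartitioned spec, which apply across partitions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical, simplified stand-in for one live manifest entry."""
    path: str
    is_delete: bool        # True for a delete file, False for a data file
    spec_id: int           # partition spec the file was written under
    partition: Tuple       # partition values under that spec
    sequence_number: int   # data sequence number of the file


def find_dangling_deletes(entries: Iterable[Entry]) -> List[str]:
    """Return paths of delete files that can no longer apply to any data file."""
    entries = list(entries)

    # Pass 1: minimum data sequence number per (spec, partition).
    min_data_seq: Dict[Tuple[int, Tuple], int] = {}
    for e in entries:
        if not e.is_delete:
            key = (e.spec_id, e.partition)
            current = min_data_seq.get(key, e.sequence_number)
            min_data_seq[key] = min(current, e.sequence_number)

    # Pass 2: a delete file is dangling if its partition has no data files,
    # or if it is older than every data file in that partition.
    dangling: List[str] = []
    for e in entries:
        if e.is_delete:
            floor: Optional[int] = min_data_seq.get((e.spec_id, e.partition))
            if floor is None or e.sequence_number < floor:
                dangling.append(e.path)
    return dangling
```

Each manifest entry is visited twice and each delete file is examined once, in contrast to a scan, where the same delete file shows up again for every data file it applies to.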