shanielh commented on issue #12838:
URL: https://github.com/apache/iceberg/issues/12838#issuecomment-2873680943

   > Amoro tried to resolve this issue in [another 
way](https://github.com/apache/amoro/blob/master/amoro-ams/src/main/java/org/apache/amoro/server/utils/IcebergTableUtil.java#L148)
   > 
   > It determines which delete files are dangling by comparing the delete 
files involved in the table scan with the delete files returned by the table's 
entries metadata table.
   > 
   > It will incur higher resource consumption, but it can precisely identify 
all dangling delete files.
   
   Using a table scan is slow, because a scan returns the same delete files 
over and over: each delete file is reported again for every data file it 
applies to. You should probably do the "scan" yourself by going over the 
manifest files and recording, per partition, the minimum data sequence number. 
Then, in each partition, remove the delete files whose sequence number is lower 
than that minimum. It's a bit trickier than that, since partitioning in Iceberg 
can evolve.
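
   The idea above can be sketched roughly as follows. This is only a minimal 
illustration in plain Python, not the Iceberg or Amoro API: the `Entry` class 
and both function names are hypothetical stand-ins for live manifest entries. 
It assumes that keying on `(spec_id, partition)` is enough to cope with 
partition evolution, that position deletes are dangling when their sequence 
number is strictly lower than the partition minimum, and that equality deletes 
are dangling when it is lower than or equal to it. Global equality deletes and 
partitions with no live data files are deliberately left alone here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


# Hypothetical, simplified stand-in for a live manifest entry (not the Iceberg API).
@dataclass(frozen=True)
class Entry:
    path: str
    content: str              # "data", "position_deletes" or "equality_deletes"
    spec_id: int              # partition spec the file was written with
    partition: Tuple          # partition tuple under that spec
    data_sequence_number: int


# Assumption: (spec_id, partition) identifies a partition across spec evolution.
PartitionKey = Tuple[int, Tuple]


def min_data_sequence_numbers(entries: Iterable[Entry]) -> Dict[PartitionKey, int]:
    """Index the minimum data sequence number of live data files per partition."""
    mins: Dict[PartitionKey, int] = {}
    for e in entries:
        if e.content != "data":
            continue
        key = (e.spec_id, e.partition)
        mins[key] = min(mins.get(key, e.data_sequence_number), e.data_sequence_number)
    return mins


def find_dangling_deletes(entries: Iterable[Entry]) -> List[str]:
    """Return paths of delete files that cannot apply to any live data file."""
    entries = list(entries)
    mins = min_data_sequence_numbers(entries)
    dangling: List[str] = []
    for e in entries:
        if e.content == "data":
            continue
        min_seq = mins.get((e.spec_id, e.partition))
        if min_seq is None:
            # No live data in this partition: skipped in this sketch (conservative).
            continue
        if e.content == "position_deletes":
            # Position deletes apply to data files with sequence number <= their own.
            is_dangling = e.data_sequence_number < min_seq
        else:
            # Equality deletes apply only to data files with a strictly lower number.
            is_dangling = e.data_sequence_number <= min_seq
        if is_dangling:
            dangling.append(e.path)
    return dangling
```

   Each manifest entry is visited a constant number of times, so the cost 
grows with the number of files rather than with the number of (data file, 
delete file) pairs that a table scan would produce.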


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

