dramaticlly commented on issue #8702: URL: https://github.com/apache/iceberg/issues/8702#issuecomment-1760455077
Data compaction only changes the physical file layout, not the data visible to users. Suppose you originally have 1000 records, 10 of which are duplicates: after deduplication you would have 990 records, and the file layout would change as well. I think deduplication (which needs a way to identify rows by a primary key or unique row identifier) probably needs its own action/procedure instead of relying on data compaction.
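
To make the distinction concrete, here is a minimal sketch in plain Python. It does not use any Iceberg API: the in-memory "files" (lists of rows) and the helpers `compact`, `deduplicate`, and `all_rows` are hypothetical names invented for illustration, and keeping the first row seen per key is just one assumed deduplication policy.

```python
from collections import Counter

def compact(files, target_size):
    """Rewrite many small files into fewer larger ones; rows are untouched."""
    rows = [row for f in files for row in f]
    return [rows[i:i + target_size] for i in range(0, len(rows), target_size)]

def deduplicate(files, key):
    """Keep the first row seen for each key; this changes the visible data."""
    seen = set()
    out = []
    for f in files:
        kept = []
        for row in f:
            k = key(row)
            if k not in seen:
                seen.add(k)
                kept.append(row)
        if kept:
            out.append(kept)
    return out

def all_rows(files):
    """Multiset of every row across all files (what a reader would see)."""
    return Counter(row for f in files for row in f)

# 990 unique ids plus 10 repeated ids = 1000 records, stored 50 rows per file.
records = [(i, f"v{i}") for i in range(990)] + [(i, f"v{i}") for i in range(10)]
files = [records[i:i + 50] for i in range(0, len(records), 50)]

compacted = compact(files, target_size=500)
deduped = deduplicate(files, key=lambda row: row[0])

# Compaction: fewer files, identical rows.
print(len(files), len(compacted), all_rows(files) == all_rows(compacted))
# Deduplication: the row count itself changes.
print(sum(len(f) for f in files), sum(len(f) for f in deduped))
```

In this sketch compaction only regroups the same rows into fewer files, while deduplication needs a key function and returns a different set of rows, which is why it reads more naturally as a separate operation.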