Peter Rozsa created IMPALA-14908:
------------------------------------

             Summary: OPTIMIZE statement leaves equality-delete files in metadata
                 Key: IMPALA-14908
                 URL: https://issues.apache.org/jira/browse/IMPALA-14908
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
            Reporter: Peter Rozsa
            Assignee: Noémi Pap-Takács


OPTIMIZE uses planFiles during the catalog finalization phase to collect all 
data files together with their associated delete files. Iceberg's planFiles 
uses column-range statistics to prune equality-delete files from scan tasks: if 
the value range of a delete file does not overlap a data file's column bounds, 
the delete file is excluded from that file's FileScanTask.deletes(). An 
equality-delete file that overlaps no data file therefore appears in no scan 
task, so the rewrite operation never sees it and never passes it to 
rewrite.deleteFile(). As a result, the new snapshot still contains these 
equality-delete files after OPTIMIZE completes.
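
The effect described above can be illustrated with a small toy model. This is 
only a sketch and not Impala or Iceberg code: DataFile, EqDeleteFile, 
plan_files and optimize are hypothetical names, bounds are modelled as a single 
integer column, and partitioning and sequence numbers are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    lower: int  # lower bound of the equality column in this data file
    upper: int  # upper bound of the equality column in this data file


@dataclass(frozen=True)
class EqDeleteFile:
    path: str
    lower: int  # smallest deleted value recorded in the delete file
    upper: int  # largest deleted value recorded in the delete file


def overlaps(data: DataFile, delete: EqDeleteFile) -> bool:
    """True if the delete file's value range intersects the data file's bounds."""
    return delete.lower <= data.upper and delete.upper >= data.lower


def plan_files(data_files, delete_files):
    """Toy stand-in for scan planning: attach only overlapping deletes to each task."""
    return [(d, [e for e in delete_files if overlaps(d, e)]) for d in data_files]


def optimize(data_files, delete_files):
    """Toy rewrite: removes only the delete files that showed up in some scan task.

    Returns the delete files that remain in the (modelled) new snapshot.
    """
    tasks = plan_files(data_files, delete_files)
    seen = {e for _, deletes in tasks for e in deletes}
    return [e for e in delete_files if e not in seen]


data_files = [
    DataFile("data-1.parquet", 1, 10),
    DataFile("data-2.parquet", 20, 30),
]
delete_files = [
    EqDeleteFile("eq-del-a.parquet", 5, 5),      # overlaps data-1, so it is seen
    EqDeleteFile("eq-del-b.parquet", 100, 100),  # overlaps nothing, so it is pruned
]

leftover = optimize(data_files, delete_files)
print([e.path for e in leftover])  # ['eq-del-b.parquet']
```

In this model eq-del-b.parquet matches no data file, is pruned from every scan 
task, and so survives the rewrite. Under the same assumptions, removing every 
equality-delete file in delete_files rather than only those in seen would leave 
none behind.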

Steps to reproduce (rollback required after execution):
OPTIMIZE TABLE functional_parquet.iceberg_v2_delete_equality;


