cccs-jc commented on PR #8980: URL: https://github.com/apache/iceberg/pull/8980#issuecomment-1853964957
So I did more digging. On our production tables I searched for all manifests that have both `existing_data_files_count > 0` and `added_data_files_count > 0`, and I found none. This leads me to believe that a commit will either be an append with `added_data_files_count` **or** a rewrite with `existing_data_files_count`.

This query returns no results:

```sql
select distinct added_snapshot_id
from catalog1.schema1.table1.manifests
where existing_data_files_count > 0
  and added_data_files_count > 0
```

I can search for manifests that have `existing_data_files_count > 0` and join those results to the snapshots:

```sql
select *
from catalog1.schema1.table1.snapshots
where snapshot_id in (
  select distinct added_snapshot_id
  from catalog1.schema1.table1.manifests
  where existing_data_files_count > 0
)
```

This gives the manifests together with the snapshot_id they belong to. Their corresponding snapshots are all rewrite snapshots.

When streaming, we skip over rewrite snapshots. Thus we will never encounter a manifest with `existing_data_files_count > 0`, so calling this in the code does nothing: `+ existingFilesCount();`
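To make the reasoning above concrete, here is a minimal sketch that replays the same check against toy data. It uses Python's built-in `sqlite3` as a stand-in for the query engine. The table contents, the manifest paths, and the helper names `build_toy_metadata`, `operations_with_existing_files`, and `streamed_file_counts` are all hypothetical, and it assumes rewrite snapshots are recorded with the operation `replace`. It only illustrates the argument and is not a claim about how the PR computes the count.

```python
import sqlite3


def build_toy_metadata() -> sqlite3.Connection:
    """Create in-memory stand-ins for the `snapshots` and `manifests` metadata tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE snapshots (snapshot_id INTEGER PRIMARY KEY, operation TEXT);
        CREATE TABLE manifests (
            path TEXT,
            added_snapshot_id INTEGER,
            added_data_files_count INTEGER,
            existing_data_files_count INTEGER
        );
        """
    )
    # Hypothetical rows: two appends and one rewrite (assumed operation 'replace').
    conn.executemany(
        "INSERT INTO snapshots VALUES (?, ?)",
        [(1, "append"), (2, "append"), (3, "replace")],
    )
    conn.executemany(
        "INSERT INTO manifests VALUES (?, ?, ?, ?)",
        [
            ("m1.avro", 1, 4, 0),  # append: only added files
            ("m2.avro", 2, 2, 0),  # append: only added files
            ("m3.avro", 3, 0, 6),  # rewrite: only existing files
        ],
    )
    conn.commit()
    return conn


def operations_with_existing_files(conn: sqlite3.Connection) -> set:
    """Operations of the snapshots that own a manifest with existing files."""
    rows = conn.execute(
        """
        SELECT DISTINCT s.operation
        FROM snapshots s
        JOIN manifests m ON m.added_snapshot_id = s.snapshot_id
        WHERE m.existing_data_files_count > 0
        """
    ).fetchall()
    return {operation for (operation,) in rows}


def streamed_file_counts(conn: sqlite3.Connection) -> tuple:
    """Return (added only, added + existing) totals over non-rewrite snapshots."""
    added, existing = conn.execute(
        """
        SELECT COALESCE(SUM(m.added_data_files_count), 0),
               COALESCE(SUM(m.existing_data_files_count), 0)
        FROM manifests m
        JOIN snapshots s ON s.snapshot_id = m.added_snapshot_id
        WHERE s.operation <> 'replace'
        """
    ).fetchone()
    return added, added + existing


conn = build_toy_metadata()
print(operations_with_existing_files(conn))
print(streamed_file_counts(conn))
```

In this toy data the two totals returned by `streamed_file_counts` are equal, which is the point of the comment: if only rewrite snapshots own manifests with existing files, and rewrites are skipped while streaming, adding the existing-files count changes nothing.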