kevinjqliu opened a new issue, #12823: URL: https://github.com/apache/iceberg/issues/12823
### Apache Iceberg version

1.8.1 (latest release)

### Query engine

Spark

### Please describe the bug 🐞

A MoR delete that writes a positional delete file does not properly update `total-records` in the Snapshot summary. This can be seen in the pyiceberg example [here](https://github.com/apache/iceberg-python/pull/1926/files#diff-d875bb1b02ed1d4043a6355a53cbc35ef9eb4d862e2c8bed8007642876b3fb7bR496), where a single row is deleted but `total-records` remains the same.

A CoW delete, where the data file is rewritten, does not have this problem: `total-records` is properly decremented, as shown [here](https://github.com/apache/iceberg-python/pull/1926/files#diff-d875bb1b02ed1d4043a6355a53cbc35ef9eb4d862e2c8bed8007642876b3fb7bR525) (although it is decremented from the previously miscalculated `total-records`).

I think this issue has persisted for quite a while. I found both #7463 and #6709. #7463 shows that the delete (`DELETE FROM default.t1 WHERE foo = 'b'`) produces an OVERWRITE snapshot with the following summary:

```
{'spark.app.id': 'local-1682689536619',
 'changed-partition-count': '1',
 'added-position-deletes': '1',
 'total-equality-deletes': '0',
 'total-position-deletes': '1',
 'added-position-delete-files': '1',
 'added-files-size': '1490',
 'total-delete-files': '1',
 'added-delete-files': '1',
 'total-files-size': '2387',
 'total-records': '3',
 'total-data-files': '1'}
```

where `'total-records': '3'` is the same as in the previous Snapshot even though a row has been deleted.

### Willingness to contribute

- [ ] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
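To make the discrepancy concrete, here is a minimal sketch that compares the reported `total-records` against the count one would expect if the field tracked live rows. The helper name `check_summary` is hypothetical, and the arithmetic assumes each added position delete removes exactly one previously live row (no duplicate deletes for the same position). The summary values are copied from the #7463 snapshot quoted above, and the previous total of 3 is taken from the report.

```python
def check_summary(prev_total_records: int, summary: dict[str, str]) -> dict[str, int]:
    """Compare reported total-records with the value expected after a MoR delete.

    Assumption (not from the Iceberg spec): total-records should drop by
    one for every added position delete.
    """
    reported = int(summary["total-records"])
    added_position_deletes = int(summary.get("added-position-deletes", "0"))
    expected = prev_total_records - added_position_deletes
    return {
        "reported": reported,
        "expected": expected,
        "difference": reported - expected,
    }


# Subset of the snapshot summary quoted from #7463.
summary = {
    "added-position-deletes": "1",
    "total-position-deletes": "1",
    "total-equality-deletes": "0",
    "total-records": "3",
    "total-data-files": "1",
}

# The previous snapshot also reported 3 records, per the report.
result = check_summary(3, summary)
print(result)
```

A nonzero `difference` is the symptom described in this issue: the delete was recorded in `added-position-deletes` but not reflected in `total-records`.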
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org