szehon-ho commented on code in PR #6771: URL: https://github.com/apache/iceberg/pull/6771#discussion_r1102121949
##########
docs/spark-queries.md:
##########
@@ -346,6 +346,9 @@ SELECT * FROM prod.db.table.partitions;
 Note: For unpartitioned tables, the partitions table will contain only the record_count and file_count columns.
+Note2:
+The output of the above query might differ between having copy-on-write or merge-on-read strategies. E.g. delete files with MOR strategy aren't applyied when producing the content of the partitions metadata table. As a result if you have renamed a partition (by updating the value of a partition column) then you would see both the 'old' and the 'new' one until you do a rewrite of delete/data files.

Review Comment:
Actually, I checked with @aokolnychyi, and this seems to be the expected behavior. `DELETE FROM` may choose a metadata-only delete, but an update does not, as it is kind of an edge case.

What do you think about something simpler, rather than mentioning a specific use case?

```
The partitions metadata table shows partitions with data files or delete files in the current snapshot. However, delete files are not applied, and so in some cases partitions may be shown even though all their data rows are marked deleted by delete files.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@iceberg.apache.org
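
For readers of this thread, the behavior described in the suggested wording can be illustrated with a toy model. This is only a sketch in plain Python, not Iceberg code: `DataFile`, `DeleteFile`, `partitions_metadata`, and `live_partitions` are hypothetical names, and identifying deleted rows by a row id is a simplifying assumption. It models a merge-on-read update that moves one row from an old partition to a new one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    partition: str
    row_ids: frozenset


@dataclass(frozen=True)
class DeleteFile:
    partition: str
    deleted_row_ids: frozenset


def partitions_metadata(data_files, delete_files):
    """List partitions from file metadata only; delete files are NOT applied."""
    stats = {}
    for f in data_files:
        entry = stats.setdefault(f.partition, {"record_count": 0, "file_count": 0})
        entry["record_count"] += len(f.row_ids)
        entry["file_count"] += 1
    # A partition that only has delete files is still listed.
    for d in delete_files:
        stats.setdefault(d.partition, {"record_count": 0, "file_count": 0})
    return stats


def live_partitions(data_files, delete_files):
    """Partitions that still hold at least one row after applying deletes."""
    deleted = {}
    for d in delete_files:
        deleted.setdefault(d.partition, set()).update(d.deleted_row_ids)
    live = set()
    for f in data_files:
        if f.row_ids - deleted.get(f.partition, set()):
            live.add(f.partition)
    return live


# Hypothetical merge-on-read update: row 1 moves from p=old to p=new.
# The old data file stays in place, a delete file marks row 1 as deleted,
# and a new data file carries the updated row (modeled here as row 2).
data_files = [
    DataFile("p=old", frozenset({1})),
    DataFile("p=new", frozenset({2})),
]
delete_files = [DeleteFile("p=old", frozenset({1}))]

listed = partitions_metadata(data_files, delete_files)
live = live_partitions(data_files, delete_files)
print(sorted(listed))  # partitions the metadata listing would show
print(sorted(live))    # partitions that still contain live rows
```

In this model the metadata listing still contains `p=old` even though its only row is marked deleted, which is the case the proposed note describes. Rewriting the data and delete files would correspond to dropping the old `DataFile` and the `DeleteFile` from the inputs.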