szehon-ho commented on code in PR #6771:
URL: https://github.com/apache/iceberg/pull/6771#discussion_r1102121949


##########
docs/spark-queries.md:
##########
@@ -346,6 +346,9 @@ SELECT * FROM prod.db.table.partitions;
 Note:
 For unpartitioned tables, the partitions table will contain only the 
record_count and file_count columns.
 
+Note2:
+The output of the above query might differ between having copy-on-write or 
merge-on-read strategies. E.g. delete files with MOR strategy aren't applyied 
when producing the content of the partitions metadata table. As a result if you 
have renamed a partition (by updating the value of a partition column) then you 
would see both the 'old' and the 'new' one until you do a rewrite of 
delete/data files.

Review Comment:
   Actually, I checked with @aokolnychyi, and this seems to be the expected
   behavior. DELETE FROM may choose a metadata-only delete, but UPDATE does not,
   as that is something of an edge case.
   
   
   What do you think about something simpler, rather than mentioning a
   specific use case?
   ```
   The partitions metadata table shows partitions with data files or delete 
files in the current snapshot.  However, delete files are not applied, and so 
in some cases partitions may be shown even though all their data rows are 
marked deleted by delete files.
   ```
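
To make the suggested wording concrete, here is a minimal toy model in Python. It does not use any Iceberg API: `DataFile`, `DeleteFile`, `partitions_view`, and `live_rows` are hypothetical names, and delete files are simplified to sets of row ids rather than real position or equality deletes. It only sketches why a partition can still be listed when all of its rows are marked deleted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    # Hypothetical stand-in for a data file entry in the current snapshot.
    partition: str
    row_ids: frozenset


@dataclass(frozen=True)
class DeleteFile:
    # Hypothetical stand-in for a delete file; real delete files reference
    # positions or equality values, not plain row ids.
    partition: str
    deleted_row_ids: frozenset


def partitions_view(data_files, delete_files):
    """Toy partitions listing: counts come from data files only,
    and delete files are NOT applied."""
    stats = {}
    for f in data_files:
        entry = stats.setdefault(f.partition, {"record_count": 0, "file_count": 0})
        entry["record_count"] += len(f.row_ids)
        entry["file_count"] += 1
    for d in delete_files:
        # A partition that has only delete files is still listed.
        stats.setdefault(d.partition, {"record_count": 0, "file_count": 0})
    return stats


def live_rows(data_files, delete_files):
    """Rows a reader would actually see once delete files are applied."""
    deleted = {}
    for d in delete_files:
        deleted.setdefault(d.partition, set()).update(d.deleted_row_ids)
    live = {}
    for f in data_files:
        remaining = f.row_ids - deleted.get(f.partition, set())
        live[f.partition] = live.get(f.partition, 0) + len(remaining)
    return live


# Assumed scenario: a merge-on-read UPDATE moves two rows from day=1 to day=2.
data_files = [
    DataFile("day=1", frozenset({1, 2})),  # original rows
    DataFile("day=2", frozenset({3, 4})),  # rewritten rows in the new partition
]
delete_files = [DeleteFile("day=1", frozenset({1, 2}))]  # old rows marked deleted

view = partitions_view(data_files, delete_files)
live = live_rows(data_files, delete_files)
```

In this sketch `view` still lists `day=1` with its original record count, while `live` reports no remaining rows for it, which is the situation the proposed note describes.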



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

