[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6771: Docs: Document that partitions metadata table might show 'old' partitions

via GitHub Wed, 08 Feb 2023 09:50:35 -0800


szehon-ho commented on code in PR #6771:
URL: https://github.com/apache/iceberg/pull/6771#discussion_r1100488895



##########
docs/spark-queries.md:
##########
@@ -346,6 +346,9 @@ SELECT * FROM prod.db.table.partitions;
 Note:
 For unpartitioned tables, the partitions table will contain only the 
record_count and file_count columns.
 
+Note2:
+The output of the above query might differ between having copy-on-write or 
merge-on-read strategies. E.g. delete files with MOR strategy aren't applied 
when producing the content of the partitions metadata table. As a result if you 
have renamed a partition (by updating the value of a partition column) then you 
would see both the 'old' and the 'new' one until you do a rewrite of 
delete/data files.

Review Comment:
   I left a comment on the issue. Now I am wondering why we don't use a 
metadata-only delete in this case, which would take the code path that removes 
the manifest entries instead of writing delete files. I wonder if it's because 
this is an update.
   
   That being said, there may be use cases where we delete enough of a 
partition using delete files in separate commits to end up in this state. But I 
am not sure that this specific use case should result in this by design, and 
that we should document it specifically.
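
   To make the behavior under discussion concrete, here is a toy sketch in 
Python. It is not Iceberg code: `DataFile`, `partitions_table`, and the file 
paths are hypothetical names made up for illustration. It only assumes what the 
proposed note describes, namely that the partitions view is aggregated from 
live data-file entries without applying delete files.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a live data-file entry in a manifest."""
    path: str
    partition: str
    record_count: int


def partitions_table(data_files):
    """Aggregate per-partition stats from data-file entries only.

    Delete files are deliberately not consulted, mirroring the behavior
    described in the proposed note (an assumption of this sketch).
    """
    records = Counter()
    files = Counter()
    for f in data_files:
        records[f.partition] += f.record_count
        files[f.partition] += 1
    return {
        p: {"record_count": records[p], "file_count": files[p]}
        for p in records
    }


# Starting point: one data file with 3 rows in partition 'a'.
old_file = DataFile("data/a/f1.parquet", "a", 3)
new_file = DataFile("data/b/f2.parquet", "b", 3)

# Copy-on-write update moving every row from 'a' to 'b':
# the old data file is replaced, so only the new one is live.
cow_files = [new_file]

# Merge-on-read update: the old data file stays live, a delete file marks
# its rows as deleted, and the updated rows are written to 'b'.
mor_files = [old_file, new_file]
mor_position_deletes = {"data/a/f1.parquet": {0, 1, 2}}  # never applied here

cow_view = partitions_table(cow_files)
mor_view = partitions_table(mor_files)

print(sorted(cow_view))  # ['b']
print(sorted(mor_view))  # ['a', 'b']
```

   In this toy model the 'old' partition only disappears from the MOR view 
once the old data file itself is dropped, which is what a metadata-only delete 
or a rewrite would do.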



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
