rdblue commented on code in PR #11240:
URL: https://github.com/apache/iceberg/pull/11240#discussion_r1794325342


##########
format/spec.md:
##########
@@ -841,14 +842,38 @@ Notes:
 
 ## Delete Formats
 
-This section details how to encode row-level deletes in Iceberg delete files. Row-level deletes are not supported in v1.
+This section details how to encode row-level deletes in Iceberg delete files. Row-level deletes are added by v2 and are not supported in v1. Deletion vectors are added in v3 and are not supported in v2 or earlier. Position delete files must not be added to v3 tables, but existing position delete files are valid.
 
-Row-level delete files are valid Iceberg data files: files must use valid Iceberg formats, schemas, and column projection. It is recommended that delete files are written using the table's default file format.
+There are three types of row-level deletes:
+* Deletion vectors (DVs) identify deleted rows within a single referenced data file by position in a bitmap
+* Position delete files identify deleted rows by file location and row position
+* Equality delete files identify deleted rows by the value of one or more columns
 
-Row-level delete files are tracked by manifests, like data files. A separate set of manifests is used for delete files, but the manifest schemas are identical.
+Deletion vectors are a binary representation of deletes for a single data file that is more efficient at execution time than position delete files. Unlike equality or position delete files, there can be at most one deletion vector for a given data file in a table. Writers must detect concurrently added deletion vectors and merge multiple vectors for a given data file before committing.

Review Comment:
   Yeah, I think you're right that stating the requirement is the correct approach. We can say that writers must ensure that there is only one deletion vector for a data file.
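
To make the requirement concrete, here is a minimal sketch of what "only one deletion vector per data file" could look like from a writer's side. It is an illustration under stated assumptions, not the Iceberg implementation: a deletion vector is modeled as a plain Python set of deleted row positions rather than a real bitmap, and the names `merge_deletion_vectors`, `concurrent`, and `pending` are hypothetical.

```python
from typing import Dict, Set

# Hypothetical model: a deletion vector is a set of deleted row positions,
# keyed by the location of the data file it references.
DeletionVectors = Dict[str, Set[int]]


def merge_deletion_vectors(
    concurrent: DeletionVectors, pending: DeletionVectors
) -> DeletionVectors:
    """Return the vectors a writer would commit, one per data file.

    `concurrent` holds vectors added by other commits since the writer
    started; `pending` holds the vectors the writer wants to add.
    """
    # Copy so the caller's sets are not mutated.
    merged = {path: set(positions) for path, positions in pending.items()}
    for path, positions in concurrent.items():
        if path in merged:
            # Both sides deleted rows from the same data file: union the
            # positions so a single vector replaces the two.
            merged[path] |= positions
    return merged


concurrent = {"data/a.parquet": {0, 5}}
pending = {"data/a.parquet": {5, 9}, "data/b.parquet": {2}}
result = merge_deletion_vectors(concurrent, pending)
```

In this sketch the writer only rewrites vectors for the data files it touches. A concurrently added vector for any other file is left in place, so the one-vector-per-file condition still holds after the commit.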



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

