wgtmac commented on code in PR #11240:
URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815226468


##########
format/spec.md:
##########
@@ -841,19 +855,45 @@ Notes:
 
 ## Delete Formats
 
-This section details how to encode row-level deletes in Iceberg delete files. 
Row-level deletes are not supported in v1.
+This section details how to encode row-level deletes in Iceberg delete files. 
Row-level deletes are added by v2 and are not supported in v1. Deletion vectors 
are added in v3 and are not supported in v2 or earlier. Position delete files 
must not be added to v3 tables, but existing position delete files are valid.
+
+There are three types of row-level deletes:
+* Deletion vectors (DVs) identify deleted rows within a single referenced data 
file by position in a bitmap
+* Position delete files identify deleted rows by file location and row 
position (**deprecated**)
+* Equality delete files identify deleted rows by the value of one or more 
columns
+
+Deletion vectors are a binary representation of deletes for a single data file 
that is more efficient at execution time than position delete files. Unlike 
equality or position delete files, there can be at most one deletion vector for 
a given data file in a table. Writers must ensure that there is at most one 
deletion vector per data file and must merge new deletes with existing vectors 
or position delete files.
+
+Row-level delete files (both equality and position delete files) are valid 
Iceberg data files: files must use valid Iceberg formats, schemas, and column 
projection. It is recommended that these delete files are written using the 
table's default file format.
+
+Row-level delete files and deletion vectors are tracked by manifests. A 
separate set of manifests is used for delete files and DVs, but the same 
manifest schema is used for both data and delete manifests. Deletion vectors 
are tracked individually by file location, offset, and length within the 
containing file. Deletion vector metadata must include the referenced data file.
+
+Both position and equality delete files allow encoding deleted row values with 
a delete. This can be used to reconstruct a stream of changes to a table.
+
 
-Row-level delete files are valid Iceberg data files: files must use valid 
Iceberg formats, schemas, and column projection. It is recommended that delete 
files are written using the table's default file format.
+### Deletion Vectors
 
-Row-level delete files are tracked by manifests, like data files. A separate 
set of manifests is used for delete files, but the manifest schemas are 
identical.
+Deletion vectors identify deleted rows of a file by encoding deleted positions 
in a bitmap. A set bit at position P indicates that the row at position P is 
deleted.
 
-Both position and equality deletes allow encoding deleted row values with a 
delete. This can be used to reconstruct a stream of changes to a table.
+These vectors are stored using the `delete-vector-v1` blob definition from the 
[Puffin spec][puffin-spec].
 
+Deletion vectors support positive 64-bit positions, but are optimized for 
cases where most positions fit in 32 bits by using a collection of 32-bit 
Roaring bitmaps. 64-bit positions are divided into a 32-bit "key" using the 
most significant 4 bytes and a 32-bit sub-position using the least significant 
4 bytes. For each key in the set of positions, a 32-bit Roaring bitmap is 
maintained to store a set of 32-bit sub-positions for that key.
+
+To test whether a certain position is set, its most significant 4 bytes (the 
key) are used to find a 32-bit bitmap and the least significant 4 bytes (the 
sub-position) are tested for inclusion in the bitmap. If a bitmap is not found 
for the key, then it is not set.
+
+Delete manifests track deletion vectors individually by the containing file 
location (`file_path`), starting offset of the DV magic bytes (`blob_offset`), 
and total length of the deletion vector blob (`blob_size_in_bytes`). Multiple 
deletion vectors can be stored in the same file. There are no restrictions on 
the data files that can be referenced by deletion vectors in the same Puffin 
file.
+
+At most one deletion vector is allowed per data file in a table. If a DV is 
written for a data file, it must replace all previously written position delete 
files so that when a DV is present, readers can safely ignore matching position 
delete files.

Review Comment:
   Should we explicitly state that the at-most-one-DV rule applies within a 
single snapshot? A data file may have different DVs in different snapshots.
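
For readers following the hunk above, here is a minimal sketch of the key/sub-position lookup it describes. It is an illustration under stated assumptions, not the Iceberg or Puffin implementation: plain Python sets stand in for the 32-bit Roaring bitmaps, the names `split_position` and `DeletionVectorSketch` are hypothetical, and "positive 64-bit positions" is read as the non-negative range of a signed 64-bit value.

```python
MASK_32 = 0xFFFFFFFF  # selects the least significant 4 bytes


def split_position(pos):
    """Split a 64-bit position into (key, sub_position).

    The key is the most significant 4 bytes, the sub-position the least
    significant 4 bytes. The accepted range is an assumption of this sketch.
    """
    if not 0 <= pos < 2**63:
        raise ValueError("position must fit in a non-negative signed 64-bit value")
    return pos >> 32, pos & MASK_32


class DeletionVectorSketch:
    """Hypothetical in-memory model; sets stand in for 32-bit Roaring bitmaps."""

    def __init__(self):
        self._bitmaps = {}  # key -> set of 32-bit sub-positions

    def delete(self, pos):
        key, sub = split_position(pos)
        self._bitmaps.setdefault(key, set()).add(sub)

    def is_deleted(self, pos):
        key, sub = split_position(pos)
        bitmap = self._bitmaps.get(key)
        # If no bitmap exists for the key, the position is not set.
        return bitmap is not None and sub in bitmap

    def cardinality(self):
        # Number of set positions across all per-key bitmaps.
        return sum(len(bitmap) for bitmap in self._bitmaps.values())
```

In this model, deleting positions `5` and `(1 << 32) + 7` creates two per-key sets (keys `0` and `1`), and a lookup for a key with no set reports the row as not deleted.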



##########
format/spec.md:
##########
@@ -568,8 +572,10 @@ The schema of a manifest file is a struct called 
`manifest_entry` with the follo
 |            | _required_ | _required_ | **`134  content`**                | 
`int` with meaning: `0: DATA`, `1: POSITION DELETES`, `2: EQUALITY DELETES` | 
Type of content stored by the data file: data, equality deletes, or position 
deletes (all v1 files are data files)                                           
                                                      |
 | _required_ | _required_ | _required_ | **`100  file_path`**              | 
`string`                                                                    | 
Full URI for the file with FS scheme                                            
                                                                                
                                                   |
 | _required_ | _required_ | _required_ | **`101  file_format`**            | 
`string`                                                                    | 
String file format name, avro, orc or parquet                                   
                                                                                
                                                   |
+| _required_ | _required_ | _required_ | **`101  file_format`**            | 
`string`                                                                    | 
String file format name, `avro`, `orc`, `parquet`, or `puffin`                  
                                                                                
                                                   |
 | _required_ | _required_ | _required_ | **`102  partition`**              | 
`struct<...>`                                                               | 
Partition data tuple, schema based on the partition spec output using partition 
field ids for the struct field ids                                              
                                                   |
 | _required_ | _required_ | _required_ | **`103  record_count`**           | 
`long`                                                                      | 
Number of records in this file                                                  
                                                                                
                                                   |
+| _required_ | _required_ | _required_ | **`103  record_count`**           | 
`long`                                                                      | 
Number of records in this file, or the cardinality of a deletion vector         
                                                                                
                                                   |

Review Comment:
   ```suggestion
   | _required_ | _required_ | _required_ | **`103  record_count`**           | 
`long`                                                                      | 
Number of records in this file, or the cardinality of a deletion vector         
                                                                                
                                                   |
   ```



##########
format/spec.md:
##########
@@ -568,8 +572,10 @@ The schema of a manifest file is a struct called 
`manifest_entry` with the follo
 |            | _required_ | _required_ | **`134  content`**                | 
`int` with meaning: `0: DATA`, `1: POSITION DELETES`, `2: EQUALITY DELETES` | 
Type of content stored by the data file: data, equality deletes, or position 
deletes (all v1 files are data files)                                           
                                                      |
 | _required_ | _required_ | _required_ | **`100  file_path`**              | 
`string`                                                                    | 
Full URI for the file with FS scheme                                            
                                                                                
                                                   |
 | _required_ | _required_ | _required_ | **`101  file_format`**            | 
`string`                                                                    | 
String file format name, avro, orc or parquet                                   
                                                                                
                                                   |
+| _required_ | _required_ | _required_ | **`101  file_format`**            | 
`string`                                                                    | 
String file format name, `avro`, `orc`, `parquet`, or `puffin`                  
                                                                                
                                                   |

Review Comment:
   ```suggestion
   | _required_ | _required_ | _required_ | **`101  file_format`**            | 
`string`                                                                    | 
String file format name, `avro`, `orc`, `parquet`, or `puffin`                  
                                                                                
                                                   |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

