rdblue commented on code in PR #11238:
URL: https://github.com/apache/iceberg/pull/11238#discussion_r1800243797


##########
format/puffin-spec.md:
##########
@@ -123,6 +123,49 @@ The blob metadata for this blob may include following properties:
 
 - `ndv`: estimate of number of distinct values, derived from the sketch.
 
+#### `delete-vector-v1` blob type
+
+A serialized delete vector (bitmap) that represents the positions of rows in a
+file that are deleted.  A set bit at position P indicates that the row at
+position P is deleted.
+
+The vector supports positive 64-bit positions, but is optimized for cases where
+most positions fit in 32 bits by using a collection of 32-bit Roaring bitmaps.
+64-bit positions are divided into a 32-bit "key" using the most significant 4
+bytes and a 32-bit sub-position using the least significant 4 bytes. For each
+key in the set of positions, a 32-bit Roaring bitmap is maintained to store a
+set of 32-bit sub-positions for that key.
+
+To test whether a certain position is set, its most significant 4 bytes (the
+key) are used to find a 32-bit bitmap and the least significant 4 bytes (the
+sub-position) are tested for inclusion in that bitmap. If no bitmap is found
+for the key, then the position is not set.
+
+The serialized blob contains:
+* The length of the vector and magic bytes stored as 4 bytes, little-endian
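
A minimal Java sketch of the key/sub-position scheme described in the proposed spec text above (class and method names are illustrative; `org.roaringbitmap` is assumed as the bitmap library):

```java
import java.util.HashMap;
import java.util.Map;

import org.roaringbitmap.RoaringBitmap;

// Illustrative sketch, not spec code: one 32-bit Roaring bitmap is kept per
// 32-bit key, where the key is the most significant 4 bytes of a position.
class DeleteVectorSketch {
  private final Map<Integer, RoaringBitmap> bitmapsByKey = new HashMap<>();

  void setDeleted(long position) {
    int key = (int) (position >>> 32);  // most significant 4 bytes
    int subPosition = (int) position;   // least significant 4 bytes
    bitmapsByKey.computeIfAbsent(key, k -> new RoaringBitmap()).add(subPosition);
  }

  boolean isDeleted(long position) {
    RoaringBitmap bitmap = bitmapsByKey.get((int) (position >>> 32));
    // If there is no bitmap for the key, the position is not set.
    return bitmap != null && bitmap.contains((int) position);
  }
}
```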

Review Comment:
   > So we will be specifically storing a "delta length" which is the real 
length - 4 bytes? But in manifests we will keep the full blob length including 
the crc bytes?
   
   I think it would be the blob length - 8 bytes (4 for the CRC and 4 for the 
length).
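
   To make the arithmetic concrete, here is a sketch of the framing being discussed (a hypothetical layout of length, then magic + bitmap, then CRC; not the final spec):

   ```java
   import java.nio.ByteBuffer;
   import java.util.zip.CRC32;

   // Illustrative framing: [4-byte length][magic + serialized bitmap][4-byte CRC].
   // The stored length covers only the magic and bitmap bytes, so it equals the
   // total blob length minus 8 (4 for the length field itself, 4 for the CRC).
   class BlobFramingSketch {
     static byte[] frame(byte[] magicAndBitmap) {
       CRC32 crc = new CRC32();
       crc.update(magicAndBitmap);
       return ByteBuffer.allocate(4 + magicAndBitmap.length + 4) // big-endian by default
           .putInt(magicAndBitmap.length)                        // == blob length - 8
           .put(magicAndBitmap)
           .putInt((int) crc.getValue())
           .array();
     }
   }
   ```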
   
   > I wouldn't be worried about older Delta readers at all.
   > 
   > * compatibility with Delta readers isn't a project goal, is it?
   > * compatibility with some old software that we have no control over, 
shouldn't be a project goal either?
   
   These are good questions to consider, and we will need to come to consensus as a community.
   
   I think there is value in supporting compatibility with older Delta readers, but I acknowledge that this may be my perspective because my employer has a lot of Delta customers that we are going to support now and in the future.
   
   The main use case for maintaining compatibility with the Delta format is that it's hard to move old jobs to new code. We see a similar issue in Hive-to-Iceberg migrations, where unknown older readers block migration because they are hard to track down and often read files directly from the backing object store. I'd like to avoid the same problem here, where all readers would need to be identified and migrated at the same time. Avoiding that requires temporarily producing Delta metadata, which this change makes possible.
   
   The second reason for maintaining compatibility is that we want the formats to become more similar. My hope is that we can work across both communities and come up with a common metadata format in a future version. Maintaining compatibility in cases like this keeps our options open: if we have compatible data layers, then it's easier to build a compatible metadata layer.
   
   Other people may not share those goals, but I think the decision comes down to how much is being compromised for compatibility. I don't think it is too much:
   * A 4-byte length field that Iceberg doesn't need
   * A 4-byte CRC that is useful to add
   * Using big-endian for this blob (mixed endianness, if you consider that the RoaringBitmap serialization is little-endian; see the sketch below)
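
   For concreteness, a minimal sketch of that mixed endianness, assuming the portable RoaringBitmap serialization (little-endian) wrapped in big-endian framing; names are illustrative:

   ```java
   import java.io.ByteArrayOutputStream;
   import java.io.DataOutputStream;
   import java.io.IOException;

   import org.roaringbitmap.RoaringBitmap;

   class MixedEndianSketch {
     // DataOutputStream always writes multi-byte values big-endian, while the
     // portable RoaringBitmap format is little-endian, so the blob mixes both.
     static byte[] serialize(RoaringBitmap bitmap) throws IOException {
       ByteArrayOutputStream bytes = new ByteArrayOutputStream();
       DataOutputStream out = new DataOutputStream(bytes);
       out.writeInt(bitmap.serializedSizeInBytes()); // big-endian length field
       bitmap.serialize(out);                        // little-endian payload
       return bytes.toByteArray();
     }
   }
   ```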


