dentiny opened a new issue, #1272:
URL: https://github.com/apache/iceberg-rust/issues/1272

   ### Is your feature request related to a problem or challenge?
   
   Hi team, this is half a question about Puffin / deletion vector progress, 
and half a feature request for manifest support.
   
   As stated in the [spec](https://iceberg.apache.org/spec/#deletion-vectors):
   > Delete manifests track deletion vectors individually by the containing 
file location (file_path), starting offset of the DV blob (content_offset), and 
total length of the blob (content_size_in_bytes). Multiple deletion vectors can 
be stored in the same file. There are no restrictions on the data files that 
can be referenced by deletion vectors in the same Puffin file.
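
To make the quoted passage concrete, here is a minimal sketch of how a reader could use those three fields to pull a single DV blob out of a Puffin file. `DeletionVectorRef` and `read_dv_blob` are hypothetical names for illustration only, not iceberg-rust APIs, and the sketch only slices the raw byte range; it does not parse the Puffin footer or decode the blob.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical, simplified view of the three fields the spec names for
/// locating a deletion vector. This is NOT an iceberg-rust type.
#[derive(Debug, Clone, PartialEq)]
pub struct DeletionVectorRef {
    /// Location of the Puffin file that contains the blob.
    pub file_path: String,
    /// Starting offset of the DV blob inside that file.
    pub content_offset: u64,
    /// Total length of the DV blob in bytes.
    pub content_size_in_bytes: u64,
}

/// Reads exactly the byte range of one DV blob from an already opened
/// Puffin file (any `Read + Seek` source). Opening `file_path` is left
/// to the caller.
pub fn read_dv_blob<R: Read + Seek>(
    reader: &mut R,
    dv: &DeletionVectorRef,
) -> io::Result<Vec<u8>> {
    let len = usize::try_from(dv.content_size_in_bytes)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    reader.seek(SeekFrom::Start(dv.content_offset))?;
    let mut buf = vec![0u8; len];
    // Fails with `UnexpectedEof` if the range runs past the end of the file.
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

Because each reference carries its own offset and length, several such references can point into the same Puffin file, which matches the "multiple deletion vectors can be stored in the same file" part of the quote.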
   
   My understanding is that the manifest file, apart from tracking data files, 
also contains records for Puffin files, for example:
   ```json
   {
     "snapshot_id": 4439194908709239593,
     "sequence_number": null,
     "file_sequence_number": null,
     "data_file": {
       "content": 0,
       "file_path": 
"file:///tmp/iceberg-test/default/test_table/data/iceberg-data-00000.parquet",
       "file_format": "PARQUET",
       ...,
     },
     "puffin_file": {
       "file_path": "file:///tmp/dir/puffin.bin",
       "file_format": "PUFFIN",
       "content": DELETION_VECTOR_TYPE,
       "content_offset": ...,
       "content_size_in_bytes": ...,
     }
   }
   ```
   
   I'm aware there's an 
[epic](https://github.com/apache/iceberg-rust/issues/744) tracking Puffin 
progress, but I don't see any manifest-side changes in its PRs.
   
   I'm curious: am I misunderstanding the spec, is this already implemented 
without my being aware of it, or are there plans to implement it in the future?
   
   Thank you!
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Willingness to contribute
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


