Re: [PR] Expose Avro reader to PyIceberg [iceberg-rust]

via GitHub Wed, 21 May 2025 23:59:51 -0700


liurenjie1024 commented on PR #1328:
URL: https://github.com/apache/iceberg-rust/pull/1328#issuecomment-2900132681


   > Thanks everyone for chiming in here. Let me summarize the discussion. I think there is consensus that the callback is not ideal.
   > 
   > 1. Supply the information required to construct the summaries
   >    
   >    1. Instead of having the `Fn(i32) -> Result<Option<StructType>>` provider, we could pass in a `HashMap<i32, StructType>`. We would bind all the `PartitionSpec`s in PyIceberg. This is relatively straightforward, but comes at a cost when there are many `PartitionSpec`s (which should be okay for the majority of tables).
   >    2. What @kevinjqliu suggested in [Expose Avro reader to PyIceberg #1328 (comment)](https://github.com/apache/iceberg-rust/pull/1328#discussion_r2094174778): pass in the current `Schema` and `PartitionSpec`s to Iceberg-Rust, where we can do the lazy binding on the Iceberg-Rust side.
   >    3. Go all the way and convert the `TableMetadata` to Iceberg-Rust. This is probably where we end up at some point, but it requires a lot of scaffolding.
   > 2. Deserialize into `Vec<u8>` instead of a `Datum`, and convert them later into the actual type. This removes the dependency on the `Schema` and the `PartitionSpec`s.
   > 
   > I'm leaning towards 2 since that aligns best with PyIceberg, where we can deserialize the manifest-list without having to know about the schema. I would like to make sure that we have consensus before moving in a certain direction, and I'm happy to follow up on that.
   
   +1
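
   For illustration, option 2 from the quoted summary could look roughly like the sketch below. This is not code from the PR: `RawFieldSummary`, `PrimitiveKind`, `Value`, and `decode_bound` are hypothetical names made up for the example. The only detail taken from the Iceberg spec is that `int` and `long` bounds are serialized as 4-byte and 8-byte little-endian values. The point is that reading the manifest list needs no `Schema` or `PartitionSpec`; the type is only required when a bound is actually decoded.

```rust
use std::convert::TryFrom;

/// Hypothetical summary type: bounds stay as the raw bytes read from Avro.
#[derive(Debug, Clone, PartialEq)]
struct RawFieldSummary {
    contains_null: bool,
    lower_bound: Option<Vec<u8>>,
    upper_bound: Option<Vec<u8>>,
}

/// Hypothetical stand-in for the partition field type, limited to two
/// primitives to keep the sketch short.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PrimitiveKind {
    Int,
    Long,
}

/// Hypothetical stand-in for a typed value such as a `Datum`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Long(i64),
}

/// Decode a raw bound once the caller knows the field type.
/// Assumes little-endian fixed-width encoding for `int` and `long`.
fn decode_bound(bytes: &[u8], kind: PrimitiveKind) -> Result<Value, String> {
    match kind {
        PrimitiveKind::Int => {
            let arr = <[u8; 4]>::try_from(bytes)
                .map_err(|_| format!("expected 4 bytes for int, got {}", bytes.len()))?;
            Ok(Value::Int(i32::from_le_bytes(arr)))
        }
        PrimitiveKind::Long => {
            let arr = <[u8; 8]>::try_from(bytes)
                .map_err(|_| format!("expected 8 bytes for long, got {}", bytes.len()))?;
            Ok(Value::Long(i64::from_le_bytes(arr)))
        }
    }
}
```

   A caller such as PyIceberg would keep the `RawFieldSummary` values as-is and call something like `decode_bound(bytes, PrimitiveKind::Int)` only after resolving the partition field type from its own metadata.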


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

