Re: [PR] feat: suport read/write Manifest [iceberg-rust]

via GitHub Mon, 11 Dec 2023 23:20:54 -0800


Fokko commented on code in PR #79:
URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1423535271



##########
crates/iceberg/src/spec/transform.rs:
##########
@@ -126,17 +126,20 @@ pub enum Transform {
 impl Transform {
     /// Get the return type of transform given the input type.
     /// Returns `None` if it can't be transformed.
-    pub fn result_type(&self, input_type: &Type) -> Option<Type> {
+    pub fn result_type(&self, input_type: &Type) -> Result<Type> {

Review Comment:
   I think there is a difference, let me explain! In Iceberg it is possible to 
have your custom transform, for example if you want to do something very 
specific, or you want to influence the distribution when it comes to 
partitioning. It can be that the transform is not available in the client, and 
then we cast it to an `UnknownTransform`: 
https://github.com/apache/iceberg-python/blob/8c8abb5c4c258e32941110a9ce0938e1328290b3/pyiceberg/transforms.py#L688
   
   What this essentially means is that we don't know how it is transformed, and 
we cannot apply speedups. For example, `bucket[16](34) = 3` then we know in 
which bucket the data is that we're looking for. But if there is some custom 
partitioning then we just have to look through all the buckets.
   
   To this specific case of the type, in PyIceberg we return a `StringType()`, 
that's borrowed from Java where you just cast it to a string. I don't think 
that's used anywhere since we don't call this on the `UnknownTransform`. The 
Type cannot be None, so I think having `Result` is the right thing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat: suport read/write Manifest [iceberg-rust]

Reply via email to