sdd commented on code in PR #1308:
URL: https://github.com/apache/iceberg-rust/pull/1308#discussion_r2083937157


##########
crates/iceberg/src/arrow/reader.rs:
##########
@@ -1103,6 +1104,7 @@ impl BoundPredicateVisitor for PredicateConverter<'_> {
 
             Ok(Box::new(move |batch| {
                 let left = project_column(&batch, idx)?;
+                let literal = cast_literal_if_required(Arc::clone(&literal), 
left.data_type())?;

Review Comment:
   We are calling `Arc::clone()` on the `&literal` every time we invoke 
`cast_literal_if_required`. Can we move this into `cast_literal_if_required` to 
avoid the repetition?



##########
crates/iceberg/src/arrow/reader.rs:
##########
@@ -1370,14 +1381,35 @@ impl<R: FileRead> AsyncFileReader for 
ArrowFileReader<R> {
     }
 }
 
+/// The Arrow type of an array that the Parquet reader reads may not match the 
exact Arrow type
+/// that Iceberg uses for literals - but they are effectively the same logical 
type,
+/// i.e. LargeUtf8 and Utf8 or Utf8View and Utf8 or Utf8View and LargeUtf8.
+///
+/// The Arrow compute kernels that we use must match the type exactly, so 
first cast the literal
+/// into the type of the batch we read from Parquet before sending it to the 
compute kernel.
+fn cast_literal_if_required(

Review Comment:
   Maybe `try_cast_literal` since it returns a result



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to