viirya commented on code in PR #295: URL: https://github.com/apache/iceberg-rust/pull/295#discussion_r1548809949
##########
crates/iceberg/src/arrow.rs:
##########
@@ -20,24 +20,38 @@
 use async_stream::try_stream;
 use futures::stream::StreamExt;
 use parquet::arrow::{ParquetRecordBatchStreamBuilder, ProjectionMask};
+use std::collections::HashMap;
 use crate::io::FileIO;
 use crate::scan::{ArrowRecordBatchStream, FileScanTask, FileScanTaskStream};
-use crate::spec::SchemaRef;
+use crate::spec::{Datum, PrimitiveLiteral, SchemaRef};
 use crate::error::Result;
+use crate::expr::{
+    BinaryExpression, BoundPredicate, BoundReference, PredicateOperator, SetExpression,
+    UnaryExpression,
+};
 use crate::spec::{
     ListType, MapType, NestedField, NestedFieldRef, PrimitiveType, Schema, StructType, Type,
 };
 use crate::{Error, ErrorKind};
+use arrow_arith::boolean::{and, is_not_null, is_null, not, or};
+use arrow_array::{
+    BooleanArray, Datum as ArrowDatum, Float32Array, Float64Array, Int32Array, Int64Array,
+};
+use arrow_ord::cmp::{eq, gt, gt_eq, lt, lt_eq, neq};
 use arrow_schema::{DataType, Field, Fields, Schema as ArrowSchema, TimeUnit};
+use bitvec::macros::internal::funty::Fundamental;
+use parquet::arrow::arrow_reader::{ArrowPredicate, ArrowPredicateFn, RowFilter};
+use parquet::schema::types::{SchemaDescriptor, Type as ParquetType};
 use std::sync::Arc;
 /// Builder to create ArrowReader
 pub struct ArrowReaderBuilder {
     batch_size: Option<usize>,
     file_io: FileIO,
     schema: SchemaRef,
+    predicates: Option<Vec<BoundPredicate>>,

Review Comment:
This is because the Parquet API `RowFilter` takes `predicates: Vec<Box<dyn ArrowPredicate>>`. This is `RowFilter`'s doc:

```
/// A [`RowFilter`] allows pushing down a filter predicate to skip IO and decode
///
/// This consists of a list of [`ArrowPredicate`] where only the rows that satisfy all
/// of the predicates will be returned.
```

So I think the predicates in this list are combined as a conjunction.

> Since we already have And/Or as part of BoundPredicate variant, how about just BoundPredicate?

Yea, I can use just one `BoundPredicate`.
So users can define conjunctions in one single predicate using `And`.
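
To make the two designs discussed above concrete, here is a minimal sketch. It uses a hypothetical, simplified `Predicate` enum over a single `i64` column. This is not the real `BoundPredicate` from iceberg-rust or `ArrowPredicate` from parquet, and the function names are invented for illustration. It only shows that a list of predicates evaluated with "all must match" semantics gives the same result as one predicate built with `And`.

```rust
// Hypothetical, simplified stand-ins for illustration only.
// These are NOT the real iceberg-rust or parquet types.

/// A toy predicate over a single i64 column.
#[derive(Debug, Clone)]
pub enum Predicate {
    GreaterThan(i64),
    LessThan(i64),
    And(Box<Predicate>, Box<Predicate>),
    Or(Box<Predicate>, Box<Predicate>),
}

impl Predicate {
    /// Evaluate the predicate against one value.
    pub fn eval(&self, v: i64) -> bool {
        match self {
            Predicate::GreaterThan(x) => v > *x,
            Predicate::LessThan(x) => v < *x,
            Predicate::And(l, r) => l.eval(v) && r.eval(v),
            Predicate::Or(l, r) => l.eval(v) || r.eval(v),
        }
    }
}

/// List design: a row is kept only if every predicate in the list accepts it,
/// mirroring the conjunction semantics described in the `RowFilter` doc.
pub fn filter_with_list(preds: &[Predicate], rows: &[i64]) -> Vec<i64> {
    rows.iter()
        .copied()
        .filter(|v| preds.iter().all(|p| p.eval(*v)))
        .collect()
}

/// Single-predicate design: the caller expresses the conjunction with `And`.
pub fn filter_with_one(pred: &Predicate, rows: &[i64]) -> Vec<i64> {
    rows.iter().copied().filter(|v| pred.eval(*v)).collect()
}
```

In this sketch, passing `[GreaterThan(4), LessThan(16)]` to `filter_with_list` keeps the same rows as passing `And(GreaterThan(4), LessThan(16))` to `filter_with_one`. The single-predicate form can also carry an `Or` at the top level, which a conjunction-only list cannot express directly.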