a-agmon commented on code in PR #588: URL: https://github.com/apache/iceberg-rust/pull/588#discussion_r1739863150
##########
crates/integrations/datafusion/src/physical_plan/scan.rs:
##########
@@ -138,3 +150,231 @@ async fn get_batch_stream(
     Ok(Box::pin(stream))
 }
+
+/// convert DataFusion filters ([`Expr`]) to an iceberg [`Predicate`]
+/// if none of the filters could be converted, return `None`
+/// if the conversion was successful, return the converted predicates combined with an AND operator
+fn convert_filters_to_predicate(filters: &[Expr]) -> Option<Predicate> {
+    filters
+        .iter()
+        .filter_map(expr_to_predicate)
+        .reduce(Predicate::and)
+}
+
+/// Recursively converting DataFusion filters (in an [`Expr`]) to an Iceberg [`Predicate`].
+///
+/// This function currently handles the conversion of DataFusion expression of the following types:
+///
+/// 1. Simple binary expressions (e.g., "column < value")
+/// 2. Compound AND expressions (e.g., "x < 1 AND y > 10")
+/// 3. Compound OR expressions (e.g., "x < 1 OR y > 10")
+///
+/// For AND expressions, if one part of the expression can't be converted,
+/// the function will still return a predicate for the part that can be converted.
+/// For OR expressions, if any part can't be converted, the entire expression
+/// will fail to convert.
+///
+/// # Arguments
+///
+/// * `expr` - A reference to a DataFusion [`Expr`] to be converted.
+///
+/// # Returns
+///
+/// * `Some(Predicate)` if the expression could be successfully converted.
+/// * `None` if the expression couldn't be converted to an Iceberg predicate.
+fn expr_to_predicate(expr: &Expr) -> Option<Predicate> {

Review Comment:
Thanks @sdd, @FANNG1, I appreciate it.
I will try to refactor this, though I am wondering whether a Visitor is really the most suitable approach. Let me know what you think, but I think a visitor shines when we have to run different logic on different kinds of elements (chained in some way) and want to keep that logic in one place, i.e., the Visitor.
Here, by contrast, we have one kind of element, `Expr`, which is an enum that can be destructured in different ways, for example:
```rust
Expr::Column(col), op, Expr::Literal(lit)
// or
Expr::BinaryExpr(left_expr), Operator::Or, Expr::BinaryExpr(right_expr)
```
So what I'm trying to say is that using a visitor would simply move the matching complexity to another place: the visitor.
Does this make sense?
I will keep trying to refactor this, but please let me know what you think.
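To make the point about matching on a single enum concrete, here is a minimal sketch of the match-based approach. It assumes simplified, hypothetical stand-ins for `Expr`, `Operator`, and `Predicate` (these are not the real DataFusion or Iceberg types, and the helpers `col` and `binary` exist only for this illustration). It follows the AND/OR behaviour described in the doc comment of the hunk above: an AND keeps whichever side converts, an OR fails if either side fails.

```rust
// Hypothetical, simplified stand-ins; NOT the DataFusion / Iceberg types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operator {
    Lt,
    Gt,
    Eq,
    And,
    Or,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    BinaryExpr {
        left: Box<Expr>,
        op: Operator,
        right: Box<Expr>,
    },
    // Stands for any expression kind the conversion does not handle.
    Unsupported,
}

#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    Compare {
        column: String,
        op: Operator,
        value: i64,
    },
    And(Box<Predicate>, Box<Predicate>),
    Or(Box<Predicate>, Box<Predicate>),
}

// Illustration-only helpers for building expressions.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn binary(left: Expr, op: Operator, right: Expr) -> Expr {
    Expr::BinaryExpr {
        left: Box::new(left),
        op,
        right: Box::new(right),
    }
}

fn expr_to_predicate(expr: &Expr) -> Option<Predicate> {
    match expr {
        Expr::BinaryExpr { left, op, right } => match (left.as_ref(), op, right.as_ref()) {
            // Simple comparison: column <op> literal.
            (Expr::Column(c), Operator::Lt | Operator::Gt | Operator::Eq, Expr::Literal(v)) => {
                Some(Predicate::Compare {
                    column: c.clone(),
                    op: *op,
                    value: *v,
                })
            }
            // AND: keep whichever side(s) could be converted.
            (l, Operator::And, r) => match (expr_to_predicate(l), expr_to_predicate(r)) {
                (Some(a), Some(b)) => Some(Predicate::And(Box::new(a), Box::new(b))),
                (Some(p), None) | (None, Some(p)) => Some(p),
                (None, None) => None,
            },
            // OR: both sides must convert, otherwise the whole expression fails.
            (l, Operator::Or, r) => {
                let a = expr_to_predicate(l)?;
                let b = expr_to_predicate(r)?;
                Some(Predicate::Or(Box::new(a), Box::new(b)))
            }
            _ => None,
        },
        // Anything that is not a binary expression is not converted here.
        _ => None,
    }
}
```

With this sketch, `x < 1 AND <unsupported>` converts to just `x < 1`, while `x < 1 OR <unsupported>` yields `None`. A visitor would still need the same three cases; they would simply live in the visitor's methods rather than in one `match`.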