Erigara commented on code in PR #2029:
URL: https://github.com/apache/iceberg-python/pull/2029#discussion_r2098978290

##########
pyiceberg/expressions/visitors.py:
##########
@@ -894,12 +895,17 @@ def visit_unbound_predicate(self, predicate: UnboundPredicate[L]) -> BooleanExpr
     def visit_bound_predicate(self, predicate: BoundPredicate[L]) -> BooleanExpression:
         file_column_name = self.file_schema.find_column_name(predicate.term.ref().field.field_id)
+        field_name = predicate.term.ref().field.name
         if file_column_name is None:
             # In the case of schema evolution, the column might not be present
             # in the file schema when reading older data
             if isinstance(predicate, BoundIsNull):
                 return AlwaysTrue()
+            # Projected fields are only available for identity partition fields
+            # Which mean that partition pruning excluded partition field which evaluates to false
+            elif field_name in self.projected_missing_fields:
+                return AlwaysTrue()

Review Comment:
   On second look, this approach could lead to incorrect results for some complex predicates. For example, given `(P = x AND F = a) OR (P = y AND F = b)`, replacing each `P = ...` term with `AlwaysTrue()` yields the incorrect predicate `F = a OR F = b`: a row in partition `P = x` with `F = b` would pass the filter even though the original predicate rejects it. The correct approach here would be to substitute `P` with the concrete value extracted from the partition. I'm not sure how to implement this, though.
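
To make the counterexample concrete, here is a minimal, self-contained sketch. It does not use PyIceberg: the `Eq`, `And`, `Or`, and `Const` classes and both rewrite functions are hypothetical stand-ins invented for illustration, and `Const(True)` plays the role of `AlwaysTrue()`. It only shows why the two rewrites differ, not how the fix would look inside `visit_bound_predicate`.

```python
from dataclasses import dataclass
from typing import Dict, Set, Union


# Toy expression tree (hypothetical, not PyIceberg's classes).
@dataclass(frozen=True)
class Eq:
    column: str
    value: str


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Const:
    value: bool


Expr = Union[Eq, And, Or, Const]


def evaluate(expr: Expr, row: Dict[str, str]) -> bool:
    """Evaluate an expression against a single row."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Eq):
        return row.get(expr.column) == expr.value
    if isinstance(expr, And):
        return evaluate(expr.left, row) and evaluate(expr.right, row)
    if isinstance(expr, Or):
        return evaluate(expr.left, row) or evaluate(expr.right, row)
    raise TypeError(f"unsupported expression: {expr!r}")


def rewrite_always_true(expr: Expr, missing: Set[str]) -> Expr:
    """Approach under review: any predicate on a missing column becomes True."""
    if isinstance(expr, Eq):
        return Const(True) if expr.column in missing else expr
    if isinstance(expr, (And, Or)):
        return type(expr)(
            rewrite_always_true(expr.left, missing),
            rewrite_always_true(expr.right, missing),
        )
    return expr


def rewrite_with_partition_value(expr: Expr, partition: Dict[str, str]) -> Expr:
    """Suggested approach: evaluate the predicate against the file's partition value."""
    if isinstance(expr, Eq):
        if expr.column in partition:
            return Const(partition[expr.column] == expr.value)
        return expr
    if isinstance(expr, (And, Or)):
        return type(expr)(
            rewrite_with_partition_value(expr.left, partition),
            rewrite_with_partition_value(expr.right, partition),
        )
    return expr


# (P = x AND F = a) OR (P = y AND F = b)
predicate = Or(And(Eq("P", "x"), Eq("F", "a")), And(Eq("P", "y"), Eq("F", "b")))

partition = {"P": "x"}  # the file belongs to partition P = x; P is not stored in the file
row = {"F": "b"}        # a row as read from the file, without the partition column

expected = evaluate(predicate, {**partition, **row})
naive = evaluate(rewrite_always_true(predicate, {"P"}), row)
fixed = evaluate(rewrite_with_partition_value(predicate, partition), row)

print(f"expected={expected} naive={naive} fixed={fixed}")
```

In this toy model the row `(P = x, F = b)` is rejected by the original predicate and by the partition-value rewrite, but accepted by the always-true rewrite, which is the incorrect result described above.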