rdblue opened a new pull request, #12311: URL: https://github.com/apache/iceberg/pull/12311
This updates `InclusiveMetricsEvaluator`, which uses column stats to skip data files during scan planning. The evaluator was implementing the older `BoundExpressionVisitor` interface that only supported `BoundReference` and not other `BoundTerm` instances like `BoundTransform`. After #12304, `BoundExtract` also needs to be supported.

Filtering works for transformed values when the transform is order preserving. If it is not order preserving (like `bucket`), the bounds cannot be used.

Most of the changes to `BoundExpressionVisitor` are to produce the lower and upper bound values that are tested:

* For `BoundReference`, deserialize the bound from the correct `lowerBounds` or `upperBounds` map (moved to `parseLowerBound` and `parseUpperBound`)
* For `BoundTransform`, deserialize the bound and transform it if the transform is order-preserving (in `transformLowerBound` and `transformUpperBound`)
* For `BoundExtract`, deserialize the bound as a map from field name to `VariantValue`, then convert the value to the internal representation (in `extractLowerBound` and `extractUpperBound`)

Note that this implementation **temporarily** uses Java serialization to convert the Variant field bounds to bytes. Once we have defined a serialization for these values, this should use it instead. Serialization and parsing are handled by `VariantDataUtil`.

This adds new test suites for the new `BoundTerm` cases that are supported:

* `TestInclusiveMetricsEvaluatorWithExtract` tests variant cases
* `TestInclusiveMetricsEvaluatorWithTransforms` tests transform cases

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
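To illustrate why bounds are only usable with order-preserving transforms, here is a minimal, self-contained sketch. It is not the code in this PR: `TransformBoundsSketch`, `SimpleTransform`, `transformBound`, and `mightContainEq` are hypothetical names made up for the example, and column bounds are simplified to plain integers rather than serialized values.

```java
import java.util.Optional;
import java.util.function.Function;

public class TransformBoundsSketch {

  /** Hypothetical stand-in for a transform; this is NOT Iceberg's Transform interface. */
  public static final class SimpleTransform {
    final Function<Integer, Integer> fn;
    final boolean preservesOrder;

    public SimpleTransform(Function<Integer, Integer> fn, boolean preservesOrder) {
      this.fn = fn;
      this.preservesOrder = preservesOrder;
    }
  }

  /** Returns the transformed bound, or empty if the bound is missing or unusable. */
  public static Optional<Integer> transformBound(SimpleTransform t, Integer bound) {
    if (bound == null || !t.preservesOrder) {
      // A transform that does not preserve order (e.g. a hash bucket) can map
      // values inside [lower, upper] to results outside [t(lower), t(upper)].
      return Optional.empty();
    }
    return Optional.of(t.fn.apply(bound));
  }

  /**
   * Inclusive check for the predicate transform(col) == literal.
   * Returns false only when the file provably contains no matching rows.
   */
  public static boolean mightContainEq(
      SimpleTransform t, Integer lower, Integer upper, int literal) {
    Optional<Integer> lo = transformBound(t, lower);
    Optional<Integer> hi = transformBound(t, upper);
    if (!lo.isPresent() || !hi.isPresent()) {
      return true; // bounds cannot be used, so the file must be read
    }
    return lo.get() <= literal && literal <= hi.get();
  }

  public static void main(String[] args) {
    // Truncate to a width of 10: monotonic, so it preserves order.
    SimpleTransform truncate10 =
        new SimpleTransform(v -> v - Math.floorMod(v, 10), true);
    // Toy bucket function: not order preserving.
    SimpleTransform bucket4 =
        new SimpleTransform(v -> Math.floorMod(Integer.hashCode(v), 4), false);

    // Column bounds [23, 58] become transformed bounds [20, 50].
    System.out.println(mightContainEq(truncate10, 23, 58, 70)); // false: file skipped
    System.out.println(mightContainEq(truncate10, 23, 58, 40)); // true: file is read
    System.out.println(mightContainEq(bucket4, 23, 58, 3));     // true: bounds unusable
  }
}
```

The conservative `true` for the bucket case reflects the "inclusive" contract: the evaluator may keep a file that has no matching rows, but it must never skip a file that does.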
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org