rdblue opened a new pull request, #12304: URL: https://github.com/apache/iceberg/pull/12304
This adds `Expressions.extract` to extract a value from a variant in Iceberg filters. The new method, `Expressions.extract(column, path, type)`, accepts a column name, a JSON path, and a type.

`UnboundExtract` is responsible for binding to `BoundExtract`. Binding the `extract` term validates that the referenced field is a variant, that the path is valid and supported, and that the type is valid. Binding is tested in `TestExpressionBinding`.

The new `extract` expression required extending `BoundTerm` with a new method, `producesNull`, to detect when `isNull` or `notNull` are determined by the expression. In addition, this PR adds support for handling `unknown` in binding.

The supported JSON path expressions are very limited. All paths must start with the root (`$`) and consist of only simple property selection using `.name`. Using JSON path allows later extension to quoted names in brackets for field access (and more selection features), while avoiding the need to add more complex cases now.

This PR also moves the public Variant interfaces to API and adds some utilities to work with those classes:

* `VariantData` is an implementation of the top-level `Variant` interface, exposed by the factory method `Variant.of(VariantMetadata, VariantValue)`
* `VariantDataUtil` has methods for working with variants; this introduces `parsePath` to validate path expressions that are passed to `extract`
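
To make the restricted path grammar concrete, here is a minimal, self-contained sketch of a validator for paths that start at the root (`$`) and use only `.name` selection. It is not the `parsePath` implementation from this PR: the class `SimplePathParser`, its `parse` method, the set of characters allowed in a name, and the choice to accept a bare `$` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Iceberg's VariantDataUtil.parsePath.
final class SimplePathParser {
  private SimplePathParser() {}

  /**
   * Returns the field names selected by a path such as "$.a.b".
   * In this sketch a bare "$" yields an empty list (an assumption).
   */
  static List<String> parse(String path) {
    if (path == null || !path.startsWith("$")) {
      throw new IllegalArgumentException("Path must start with the root ($): " + path);
    }

    List<String> names = new ArrayList<>();
    int pos = 1; // skip the root marker
    while (pos < path.length()) {
      // Only simple property selection (".name") is accepted; brackets, wildcards,
      // and array indexes fall through to this error.
      if (path.charAt(pos) != '.') {
        throw new IllegalArgumentException(
            "Unsupported selector at position " + pos + ": " + path);
      }

      int end = pos + 1;
      while (end < path.length() && isNameChar(path.charAt(end))) {
        end++;
      }

      if (end == pos + 1) {
        throw new IllegalArgumentException("Empty field name at position " + pos + ": " + path);
      }

      names.add(path.substring(pos + 1, end));
      pos = end;
    }

    return names;
  }

  // Assumption: names are limited to letters, digits, and underscores.
  private static boolean isNameChar(char c) {
    return Character.isLetterOrDigit(c) || c == '_';
  }
}
```

With this sketch, `SimplePathParser.parse("$.event.id")` returns the names `event` and `id`, while `$['event']`, `$.a[0]`, and `event.id` are rejected with an `IllegalArgumentException`. Rejecting anything beyond `.name` up front keeps room for bracketed names and other selectors to be added later without changing how existing paths are interpreted.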