hantangwangd opened a new pull request, #16697: URL: https://github.com/apache/iceberg/pull/16697
When scanning table records via `IcebergGenerics.read(table)` with a filter specified through `where(filter)`, the query can fail outright if the filter contains an `IN` predicate and the target column contains null values. The error is:

```
java.lang.NullPointerException: Invalid object: null
```

The root cause: when `FilterIterator.advance()` is called, it invokes the `shouldKeep(item)` closure of `CloseableIterable` to decide whether to keep the item just read, and that evaluation runs the `in(...)` method of `EvalVisitor`. The original logic checks that the target column value is not null and throws immediately if it is. However, in many scenarios (such as the one constructed in the newly added test case), a data file contains both valid values and null values in the target column. Records with null values are then read and passed to this method for evaluation, and the method throws.

This PR fixes the issue by properly handling null values.
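To illustrate the failure mode and one possible null-safe evaluation, here is a minimal standalone sketch. It does not use Iceberg classes and is not the code changed in this PR: `NullSafeInSketch`, `strictIn`, `nullSafeIn`, and `filter` are hypothetical names. It also assumes that a null column value should simply not match an `IN` predicate, which the description above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for row-level IN evaluation; not Iceberg's actual EvalVisitor.
final class NullSafeInSketch {

  private NullSafeInSketch() {}

  // Mirrors the failing behaviour described above: a null column value is rejected outright.
  static <T> boolean strictIn(T value, Set<T> literals) {
    if (value == null) {
      throw new NullPointerException("Invalid object: null");
    }
    return literals.contains(value);
  }

  // Assumed fix: a null column value matches no literal, so the row is dropped instead of throwing.
  static <T> boolean nullSafeIn(T value, Set<T> literals) {
    return value != null && literals.contains(value);
  }

  // Keeps only the column values for which the null-safe IN predicate holds.
  static <T> List<T> filter(List<T> columnValues, Set<T> literals) {
    List<T> kept = new ArrayList<>();
    for (T value : columnValues) {
      if (nullSafeIn(value, literals)) {
        kept.add(value);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Set<String> literals = new HashSet<>(Arrays.asList("a", "b"));
    // A column that mixes matching values, a non-matching value, and a null.
    List<String> column = Arrays.asList("a", null, "c", "b");
    System.out.println(filter(column, literals)); // prints [a, b]
  }
}
```

With `strictIn`, the same column would abort the scan at the second value. With `nullSafeIn`, the null row is skipped and the scan continues.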
