hantangwangd opened a new pull request, #16697:
URL: https://github.com/apache/iceberg/pull/16697

   When scanning table records via `IcebergGenerics.read(table)` with a filter 
set through `where(filter)`, the query may fail if the filter contains an `IN` 
predicate and the target column contains null values. The failure surfaces as 
the following error:
   
   ```
   java.lang.NullPointerException: Invalid object: null
   ```
   
   Root cause: `FilterIterator.advance()` calls the `shouldKeep(item)` callback 
defined in `CloseableIterable` to decide whether to keep each item it reads. 
That callback evaluates the filter through the `in(...)` method of 
`EvalVisitor`. The original logic asserts that the target column value is 
non-null and throws immediately when it is null.
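
The failure mode described above can be modeled without Iceberg. The following is a minimal sketch, not Iceberg's actual code: `StrictInFilterSketch`, `strictIn`, and `filter` are hypothetical names, and the use of `Objects.requireNonNull` is an assumption chosen only because it produces the same error message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for the pre-fix behavior; not Iceberg's actual code.
class StrictInFilterSketch {

  // Models an IN evaluation that rejects a null column value outright.
  static boolean strictIn(Object value, Set<?> literals) {
    Objects.requireNonNull(value, "Invalid object: null");
    return literals.contains(value);
  }

  // Models the filtering loop: every row that was read is passed to the predicate.
  static List<String> filter(List<String> rows, Set<String> literals) {
    List<String> kept = new ArrayList<>();
    for (String row : rows) {
      if (strictIn(row, literals)) {
        kept.add(row);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Set<String> literals = new HashSet<>(Arrays.asList("a", "b"));
    try {
      // The second row is null, so evaluation aborts before "c" is even reached.
      filter(Arrays.asList("a", null, "c"), literals);
    } catch (NullPointerException e) {
      System.out.println(e); // java.lang.NullPointerException: Invalid object: null
    }
  }
}
```

In this model, a single null row aborts the whole scan, even though other rows match the `IN` list.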
   
   However, in many scenarios (such as the one constructed in the newly added 
test case), a data file contains both potentially matching values and null 
values in the target column. Records with a null in that column are still read 
and passed to this method for evaluation, at which point the error is thrown.
   
   This PR fixes the issue by properly handling null values.
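
The description does not spell out the exact change. One plausible reading, sketched here under that assumption, is that a null column value is treated as not matching the `IN` list, so the record is filtered out instead of raising an error. `NullSafeInFilterSketch`, `nullSafeIn`, and `filter` are hypothetical names, not Iceberg APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical null-tolerant variant; the actual change in the PR may differ.
class NullSafeInFilterSketch {

  // Assumption: a null value never matches an IN list, so it yields false.
  static boolean nullSafeIn(Object value, Set<?> literals) {
    if (value == null) {
      return false;
    }
    return literals.contains(value);
  }

  // Same filtering loop as a scan would run, now tolerant of null rows.
  static List<String> filter(List<String> rows, Set<String> literals) {
    List<String> kept = new ArrayList<>();
    for (String row : rows) {
      if (nullSafeIn(row, literals)) {
        kept.add(row);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Set<String> literals = new HashSet<>(Arrays.asList("a", "b"));
    List<String> rows = Arrays.asList("a", null, "c", "b");
    System.out.println(filter(rows, literals)); // prints [a, b]
  }
}
```

Returning `false` for null mirrors the common SQL convention that `NULL IN (...)` does not select the row. Whether the PR adopts exactly this semantics is not stated in the description.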


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

