aokolnychyi commented on code in PR #8446:
URL: https://github.com/apache/iceberg/pull/8446#discussion_r1330656200
##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkFilters.java:
##########
@@ -161,10 +162,13 @@ public static Expression convert(Filter filter) {
case IN:
In inFilter = (In) filter;
+ if (Stream.of(inFilter.values()).anyMatch(Objects::isNull)) {
Review Comment:
I thought we already handled such cases in a special way. For instance,
`hasNoInFilter` is used in the negation to recursively check for NOT IN nested
inside NOT. We do handle IN and NOT IN differently: there are separate branches
for them, each with its own null handling.
My worry is that filters like `IN (1, 2, NULL)`, which are perfectly fine to
push down, will no longer be pushed down, causing silent performance
regressions. It is unlikely that someone explicitly passes NULL inside IN, but
such predicates can be generated programmatically.
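
To illustrate why I think `IN (1, 2, NULL)` is safe to push down while NOT IN
needs care, here is a minimal standalone sketch of standard SQL three-valued
logic. It is not the Iceberg or Spark implementation; the class and method
names (`InNullSemantics`, `sqlIn`, `sqlNotIn`, `keeps`) are hypothetical, and a
Java `null` stands in for SQL UNKNOWN.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of SQL IN / NOT IN semantics; not Iceberg or Spark code.
class InNullSemantics {

  // Returns TRUE, FALSE, or null (UNKNOWN) for `value IN (candidates)`.
  static Boolean sqlIn(Integer value, List<Integer> candidates) {
    if (candidates.isEmpty()) {
      return Boolean.FALSE; // nothing can match an empty list
    }
    if (value == null) {
      return null; // NULL compared with anything is UNKNOWN
    }
    boolean sawNull = false;
    for (Integer candidate : candidates) {
      if (candidate == null) {
        sawNull = true; // value = NULL is UNKNOWN, keep scanning
      } else if (candidate.equals(value)) {
        return Boolean.TRUE; // one definite match makes the whole IN true
      }
    }
    return sawNull ? null : Boolean.FALSE;
  }

  // `value NOT IN (candidates)` is the three-valued negation of IN.
  static Boolean sqlNotIn(Integer value, List<Integer> candidates) {
    Boolean in = sqlIn(value, candidates);
    return in == null ? null : Boolean.valueOf(!in);
  }

  // A WHERE clause keeps a row only when the predicate is definitely TRUE.
  static boolean keeps(Boolean predicateResult) {
    return Boolean.TRUE.equals(predicateResult);
  }

  public static void main(String[] args) {
    List<Integer> withNull = Arrays.asList(1, 2, null);
    List<Integer> withoutNull = Arrays.asList(1, 2);
    for (Integer value : Arrays.asList(1, 2, 3, null)) {
      System.out.println(
          value
              + ": IN with NULL keeps=" + keeps(sqlIn(value, withNull))
              + ", IN without NULL keeps=" + keeps(sqlIn(value, withoutNull))
              + ", NOT IN with NULL keeps=" + keeps(sqlNotIn(value, withNull))
              + ", NOT IN without NULL keeps=" + keeps(sqlNotIn(value, withoutNull)));
    }
  }
}
```

Under this model, the IN filter keeps the same rows whether or not the NULL is
in the list, so ignoring the NULL literal should not change results. NOT IN with
a NULL in the list never keeps a row, which is why the two cases need separate
handling.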
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]