mihaibudiu commented on code in PR #4678:
URL: https://github.com/apache/calcite/pull/4678#discussion_r2609496880


##########
core/src/test/resources/sql/planner.iq:
##########
@@ -215,14 +215,13 @@ select a from (values (1.0), (4.0), (null)) as t3 (a);
 !ok
 
 EnumerableAggregate(group=[{0}])
-  EnumerableNestedLoopJoin(condition=[IS NOT DISTINCT FROM($0, $1)], 
joinType=[semi])
+  EnumerableHashJoin(condition=[IS NOT DISTINCT FROM($0, $1)], joinType=[semi])
     EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], 
A=[$t1])
-      EnumerableAggregate(group=[{0}])
-        EnumerableNestedLoopJoin(condition=[IS NOT DISTINCT FROM($0, $1)], 
joinType=[semi])
-          EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) 
NOT NULL], A=[$t1])
-            EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 
5.0 }]])
-          EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) 
NOT NULL], A=[$t1])
-            EnumerableValues(tuples=[[{ 1 }, { 2 }]])
+      EnumerableHashJoin(condition=[IS NOT DISTINCT FROM($0, $1)], 
joinType=[semi])

Review Comment:
   do you know why the distinct is no longer necessary?



##########
core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableHashJoin.java:
##########
@@ -218,8 +218,10 @@ private Result 
implementHashSemiJoin(EnumerableRelImplementor implementor, Prefe
                 Expressions.list(
                     leftExpression,
                     rightExpression,
-                    
leftResult.physType.generateAccessorWithoutNulls(joinInfo.leftKeys),
-                    
rightResult.physType.generateAccessorWithoutNulls(joinInfo.rightKeys),
+                    leftResult.physType.generateNullAwareAccessor(

Review Comment:
   There are some tests in TpcdsTest which are marked
   ```
   @Disabled("throws 'RuntimeException: Cannot convert null to long'")
   ```
   which I think boils down to a problem with the HashJoin. Can you check to 
see whether these tests can be enabled?
   They are marked `@slow`, so you may need to comment-out temporarily that 
annotation to be able to run them.
   (I have no idea when the slow tests are run, if ever)



##########
core/src/main/java/org/apache/calcite/rel/core/JoinInfo.java:
##########
@@ -29,33 +29,37 @@
 import com.google.common.collect.ImmutableList;
 
 import java.util.ArrayList;
+import java.util.Collections;
 import java.util.List;
 
 import static java.util.Objects.requireNonNull;
 
 /** An analyzed join condition.
  *
  * <p>It is useful for the many algorithms that care whether a join has an
- * equi-join condition.
+ * equi-join condition (contains EQUALS and IS NOT DISTINCT FROM).
  *
- * <p>You can create one using {@link #createWithStrictEquality}, or call
+ * <p>You can create one using {@link #of(RelNode, RelNode, RexNode)},
+ * {@link #createWithStrictEquality}, or call
  * {@link Join#analyzeCondition()}; many kinds of join cache their
  * join info, especially those that are equi-joins.
  *
  * @see Join#analyzeCondition() */
 public class JoinInfo {
   public final ImmutableIntList leftKeys;
   public final ImmutableIntList rightKeys;
+  public final ImmutableList<Boolean> filterNulls;

Review Comment:
   Can you please add JavaDoc for this field and perhaps the next one too?



##########
core/src/main/java/org/apache/calcite/adapter/enumerable/PhysType.java:
##########
@@ -134,6 +134,13 @@ Expression fieldReference(Expression expression, int field,
    */
   Expression generateAccessorWithoutNulls(List<Integer> fields);
 
+  /**
+   * Similar to {@link #generateAccessor(List)} and {@link 
#generateAccessorWithoutNulls(List)},
+   * but it's null-aware. If one of the fields is <code>null</code> and this 
field is marked as
+   * TRUE in filterNulls, it will return <code>null</code>.

Review Comment:
   You mean, it will return an expression that evaluates to null?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to