yashmayya commented on code in PR #13981:
URL: https://github.com/apache/pinot/pull/13981#discussion_r1756163553


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/HashJoinOperator.java:
##########
@@ -389,18 +369,29 @@ private List<Object[]> buildJoinedDataBlockDefault(TransferableBlock leftBlock)
     return rows;
   }
 
-  private List<Object[]> buildJoinedDataBlockAnti(TransferableBlock leftBlock)
-      throws ProcessingException {
+  private List<Object[]> buildJoinedDataBlockSemi(TransferableBlock leftBlock) {
+    List<Object[]> container = leftBlock.getContainer();
+    List<Object[]> rows = new ArrayList<>(container.size());
+
+    for (Object[] leftRow : container) {
+      Object key = _leftKeySelector.getKey(leftRow);
+      // SEMI-JOIN only checks existence of the key
+      if (_broadcastRightTable.containsKey(key)) {
+        rows.add(joinRow(leftRow, null));
+      }
+    }
+
+    return rows;
+  }
+
+  private List<Object[]> buildJoinedDataBlockAnti(TransferableBlock leftBlock) {
     List<Object[]> container = leftBlock.getContainer();
     List<Object[]> rows = new ArrayList<>(container.size());
 
     for (Object[] leftRow : container) {
       Object key = _leftKeySelector.getKey(leftRow);
       // ANTI-JOIN only checks non-existence of the key
       if (!_broadcastRightTable.containsKey(key)) {
-        if (incrementJoinedRowsAndCheckLimit()) {
-          break;
-        }

Review Comment:
   Is the limit check dropped here because we want this protection mainly for
cross joins and similar join conditions, where the number of joined rows can
far exceed the sum of the rows in the left and right blocks?
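
To make the question concrete, here is a minimal standalone sketch of the row-count argument. It is not Pinot code: `JoinRowBounds` and its methods are hypothetical names, and the right side is modelled as a plain `HashMap` from key to matching rows rather than the operator's `_broadcastRightTable`. Under those assumptions, an inner join emits one row per (left row, matching right row) pair, while a semi or anti join emits at most one row per left row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names do not come from the Pinot codebase.
public class JoinRowBounds {

  // Build side: group right rows by join key.
  static Map<Object, List<Object[]>> buildRightTable(List<Object[]> rightRows, int keyIndex) {
    Map<Object, List<Object[]>> table = new HashMap<>();
    for (Object[] row : rightRows) {
      table.computeIfAbsent(row[keyIndex], k -> new ArrayList<>()).add(row);
    }
    return table;
  }

  // Inner join: every left row is paired with every matching right row,
  // so the output can grow up to |left| * |right|.
  static int innerJoinRowCount(List<Object[]> leftRows, Map<Object, List<Object[]>> rightTable,
      int keyIndex) {
    int count = 0;
    for (Object[] leftRow : leftRows) {
      List<Object[]> matches = rightTable.get(leftRow[keyIndex]);
      if (matches != null) {
        count += matches.size();
      }
    }
    return count;
  }

  // Semi join: a left row is emitted at most once, if its key exists on the right.
  static int semiJoinRowCount(List<Object[]> leftRows, Map<Object, List<Object[]>> rightTable,
      int keyIndex) {
    int count = 0;
    for (Object[] leftRow : leftRows) {
      if (rightTable.containsKey(leftRow[keyIndex])) {
        count++;
      }
    }
    return count;
  }

  // Anti join: a left row is emitted at most once, if its key is absent on the right.
  static int antiJoinRowCount(List<Object[]> leftRows, Map<Object, List<Object[]>> rightTable,
      int keyIndex) {
    return leftRows.size() - semiJoinRowCount(leftRows, rightTable, keyIndex);
  }

  public static void main(String[] args) {
    List<Object[]> left = new ArrayList<>();
    left.add(new Object[]{"a", 1});
    left.add(new Object[]{"a", 2});
    left.add(new Object[]{"a", 3});
    left.add(new Object[]{"b", 4});

    List<Object[]> right = new ArrayList<>();
    for (int i = 0; i < 4; i++) {
      right.add(new Object[]{"a", 100 + i});
    }

    Map<Object, List<Object[]>> rightTable = buildRightTable(right, 0);
    System.out.println("inner: " + innerJoinRowCount(left, rightTable, 0)); // 3 * 4 = 12
    System.out.println("semi:  " + semiJoinRowCount(left, rightTable, 0));  // 3
    System.out.println("anti:  " + antiJoinRowCount(left, rightTable, 0));  // 1
  }
}
```

With 4 left rows and 4 right rows, the inner join produces 12 rows (more than 4 + 4), whereas the semi and anti outputs can never exceed the 4 rows of the left block. If that is the reasoning behind removing the check, a short comment in the code saying so would help.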



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/HashJoinOperator.java:
##########
@@ -389,18 +369,29 @@ private List<Object[]> buildJoinedDataBlockDefault(TransferableBlock leftBlock)
     return rows;
   }
 
-  private List<Object[]> buildJoinedDataBlockAnti(TransferableBlock leftBlock)
-      throws ProcessingException {
+  private List<Object[]> buildJoinedDataBlockSemi(TransferableBlock leftBlock) {
+    List<Object[]> container = leftBlock.getContainer();
+    List<Object[]> rows = new ArrayList<>(container.size());
+
+    for (Object[] leftRow : container) {
+      Object key = _leftKeySelector.getKey(leftRow);
+      // SEMI-JOIN only checks existence of the key
+      if (_broadcastRightTable.containsKey(key)) {
+        rows.add(joinRow(leftRow, null));
+      }
+    }
+
+    return rows;
+  }
+
+  private List<Object[]> buildJoinedDataBlockAnti(TransferableBlock leftBlock) {
     List<Object[]> container = leftBlock.getContainer();
     List<Object[]> rows = new ArrayList<>(container.size());
 
     for (Object[] leftRow : container) {
       Object key = _leftKeySelector.getKey(leftRow);
       // ANTI-JOIN only checks non-existence of the key
       if (!_broadcastRightTable.containsKey(key)) {
-        if (incrementJoinedRowsAndCheckLimit()) {
-          break;
-        }

Review Comment:
   Why don't we need the rows limit check for semi and anti joins?



