yashmayya commented on code in PR #14972:
URL: https://github.com/apache/pinot/pull/14972#discussion_r1971010755


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/LookupTable.java:
##########
@@ -0,0 +1,100 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+public abstract class LookupTable {
+  // TODO: Make it configurable
+  protected static final int INITIAL_CAPACITY = 10000;
+
+  protected boolean _keysUnique = true;
+
+  /**
+   * Adds a row to the lookup table.
+   */
+  public abstract void addRow(Object key, Object[] row);
+
+  @SuppressWarnings("unchecked")
+  protected Object calculateValue(Object[] row, @Nullable Object currentValue) {

Review Comment:
   nit: `computeNewValue` might be a more descriptive name for this method?



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/HashJoinOperator.java:
##########
@@ -123,69 +145,99 @@ protected List<Object[]> buildJoinedRows(TransferableBlock leftBlock)
       case ANTI:
         return buildJoinedDataBlockAnti(leftBlock);
       default: { // INNER, LEFT, RIGHT, FULL
-        return buildJoinedDataBlockDefault(leftBlock);
+        if (_rightTable.isKeysUnique()) {

Review Comment:
   I've manually validated the changes here and the logic looks good, but I think it would be nice to have more granular unit tests covering this.
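As a rough illustration of what a more granular test could target, here is a minimal sketch of the value-accumulation logic from the quoted `LookupTable` diff. `LookupTableSketch` is a hypothetical stand-in, not a class in the PR: it uses a plain `java.util.HashMap` instead of the fastutil maps, and its `lookup` returns `Object` so that both value shapes can be inspected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for LookupTable, for illustration only.
public class LookupTableSketch {
  // Assumption: a plain HashMap replaces the type-specialized fastutil map.
  private final Map<Object, Object> _lookupTable = new HashMap<>();
  private boolean _keysUnique = true;

  // Same branching as calculateValue in the quoted diff.
  @SuppressWarnings("unchecked")
  private Object calculateValue(Object[] row, Object currentValue) {
    if (currentValue == null) {
      return row;
    }
    _keysUnique = false;
    if (currentValue instanceof List) {
      ((List<Object[]>) currentValue).add(row);
      return currentValue;
    }
    List<Object[]> rows = new ArrayList<>();
    rows.add((Object[]) currentValue);
    rows.add(row);
    return rows;
  }

  public void addRow(Object key, Object[] row) {
    _lookupTable.compute(key, (k, v) -> calculateValue(row, v));
  }

  // When any key repeated, wrap the remaining single rows in singleton lists.
  public void finish() {
    if (!_keysUnique) {
      for (Map.Entry<Object, Object> entry : _lookupTable.entrySet()) {
        if (entry.getValue() instanceof Object[]) {
          entry.setValue(Collections.singletonList(entry.getValue()));
        }
      }
    }
  }

  public boolean isKeysUnique() {
    return _keysUnique;
  }

  // Returns Object[] when all keys are unique, List<Object[]> otherwise.
  public Object lookup(Object key) {
    return _lookupTable.get(key);
  }
}
```

A test along these lines would add rows with and without repeated keys, call `finish()`, and then assert on `isKeysUnique()` and on the shape of the value returned for each key.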



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/FloatLookupTable.java:
##########
@@ -0,0 +1,65 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import it.unimi.dsi.fastutil.floats.Float2ObjectMap;
+import it.unimi.dsi.fastutil.floats.Float2ObjectOpenHashMap;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+/**
+ * The {@code FloatLookupTable} is a lookup table for float keys.
+ */
+@SuppressWarnings("unchecked")
+public class FloatLookupTable extends LookupTable {
+  private final Float2ObjectOpenHashMap<Object> _lookupTable = new Float2ObjectOpenHashMap<>(INITIAL_CAPACITY);
+
+  @Override
+  public void addRow(Object key, Object[] row) {
+    _lookupTable.compute((float) key, (k, v) -> calculateValue(row, v));
+  }
+
+  @Override
+  public void finish() {
+    if (!_keysUnique) {
+      for (Float2ObjectMap.Entry<Object> entry : _lookupTable.float2ObjectEntrySet()) {
+        convertValueToList(entry);
+      }
+    }
+  }
+
+  @Override
+  public boolean containsKey(Object key) {
+    return _lookupTable.containsKey((float) key);
+  }
+
+  @Nullable
+  @Override
+  public Object[] lookup(Object key) {
+    return (Object[]) _lookupTable.get((float) key);

Review Comment:
   Same question as on `DoubleLookupTable.lookup`. It looks like the `INT` / `LONG` / `OBJECT` versions of the lookup table don't have this issue. Am I missing something here? Even if the key is a `FLOAT` / `DOUBLE` type, there could be more than one row with the same key, right?
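To make the concern concrete, here is a minimal sketch of what happens when two rows share a float key. `FloatKeyCastSketch` is hypothetical: it swaps in `java.util.HashMap<Float, Object>` for `Float2ObjectOpenHashMap` so it runs without fastutil, and it copies the accumulation branches from the quoted `calculateValue`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reproduction of the cast in FloatLookupTable.lookup.
public class FloatKeyCastSketch {
  // Assumption: HashMap stands in for Float2ObjectOpenHashMap.
  private final Map<Float, Object> _lookupTable = new HashMap<>();

  // Same accumulation branches as calculateValue in the quoted diff.
  @SuppressWarnings("unchecked")
  private static Object calculateValue(Object[] row, Object currentValue) {
    if (currentValue == null) {
      return row;
    }
    if (currentValue instanceof List) {
      ((List<Object[]>) currentValue).add(row);
      return currentValue;
    }
    List<Object[]> rows = new ArrayList<>();
    rows.add((Object[]) currentValue);
    rows.add(row);
    return rows;
  }

  public void addRow(Object key, Object[] row) {
    _lookupTable.compute((Float) key, (k, v) -> calculateValue(row, v));
  }

  // Mirrors the cast under review: the stored value is assumed to be Object[].
  public Object[] lookup(Object key) {
    return (Object[]) _lookupTable.get((Float) key);
  }

  // Returns true when looking up a repeated key throws ClassCastException.
  public static boolean castFailsForDuplicateKey() {
    FloatKeyCastSketch table = new FloatKeyCastSketch();
    table.addRow(1.5f, new Object[]{1.5f, "a"});
    table.addRow(1.5f, new Object[]{1.5f, "b"});
    try {
      table.lookup(1.5f);
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }
}
```

The sketch leaves out `finish()`. In the quoted diff, `finish()` wraps every remaining single row in a singleton list once any key has repeated, so after that call the cast would fail for every key, not only the repeated one.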



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/LookupTable.java:
##########
@@ -0,0 +1,100 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+public abstract class LookupTable {
+  // TODO: Make it configurable
+  protected static final int INITIAL_CAPACITY = 10000;
+
+  protected boolean _keysUnique = true;
+
+  /**
+   * Adds a row to the lookup table.
+   */
+  public abstract void addRow(Object key, Object[] row);
+
+  @SuppressWarnings("unchecked")
+  protected Object calculateValue(Object[] row, @Nullable Object currentValue) {
+    if (currentValue == null) {
+      return row;
+    } else {
+      _keysUnique = false;
+      if (currentValue instanceof List) {
+        ((List<Object[]>) currentValue).add(row);
+        return currentValue;
+      } else {
+        List<Object[]> rows = new ArrayList<>();
+        rows.add((Object[]) currentValue);
+        rows.add(row);
+        return rows;
+      }
+    }
+  }
+
+  /**
+   * Finishes adding rows to the lookup table. This method should be called after all rows are added to the lookup
+   * table, and before looking up rows.
+   */
+  public abstract void finish();
+
+  protected static void convertValueToList(Map.Entry<?, Object> entry) {
+    Object value = entry.getValue();
+    if (value instanceof Object[]) {
+      entry.setValue(Collections.singletonList(value));
+    }
+  }
+
+  /**
+   * Returns {@code true} when all the keys added to the lookup table are unique.
+   * When all keys are unique, the value of the lookup table is a single row ({@code Object[]}). When keys are not
+   * unique, the value of the lookup table is a list of rows ({@code List<Object[]>}).
+   */
+  public boolean isKeysUnique() {

Review Comment:
   ```suggestion
     public boolean areKeysUnique() {
   ```
   nit



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/HashJoinOperator.java:
##########
@@ -123,69 +145,99 @@ protected List<Object[]> buildJoinedRows(TransferableBlock leftBlock)
       case ANTI:
         return buildJoinedDataBlockAnti(leftBlock);
       default: { // INNER, LEFT, RIGHT, FULL
-        return buildJoinedDataBlockDefault(leftBlock);
+        if (_rightTable.isKeysUnique()) {

Review Comment:
   Actually, on second thought, I think the tests in `HashJoinOperatorTest` already cover many of these scenarios.



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/DoubleLookupTable.java:
##########
@@ -0,0 +1,65 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import it.unimi.dsi.fastutil.doubles.Double2ObjectMap;
+import it.unimi.dsi.fastutil.doubles.Double2ObjectOpenHashMap;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+/**
+ * The {@code DoubleLookupTable} is a lookup table for double keys.
+ */
+@SuppressWarnings("unchecked")
+public class DoubleLookupTable extends LookupTable {
+  private final Double2ObjectOpenHashMap<Object> _lookupTable = new Double2ObjectOpenHashMap<>(INITIAL_CAPACITY);
+
+  @Override
+  public void addRow(Object key, Object[] row) {
+    _lookupTable.compute((double) key, (k, v) -> calculateValue(row, v));
+  }
+
+  @Override
+  public void finish() {
+    if (!_keysUnique) {
+      for (Double2ObjectMap.Entry<Object> entry : _lookupTable.double2ObjectEntrySet()) {
+        convertValueToList(entry);
+      }
+    }
+  }
+
+  @Override
+  public boolean containsKey(Object key) {
+    return _lookupTable.containsKey((double) key);
+  }
+
+  @Nullable
+  @Override
+  public Object[] lookup(Object key) {
+    return (Object[]) _lookupTable.get((double) key);

Review Comment:
   Why are we casting to `Object[]` and returning `Object[]` here? The value could be a `List<Object[]>` if `_keysUnique` is `false`, right?
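One possible direction, offered only as a sketch, is to have `lookup` return `Object` and let the caller interpret the stored value according to its shape. The `LookupValueSketch` class and its `asRows` helper below are hypothetical and not part of the PR. The quoted diff does not show how the `INT` / `LONG` / `OBJECT` variants or `HashJoinOperator` handle the value, so this should not be read as their implementation.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper for consuming a lookup value of either shape.
public class LookupValueSketch {
  /**
   * Views a stored lookup value as a list of rows:
   * null means no match, Object[] a single row, List a list of rows.
   */
  @SuppressWarnings("unchecked")
  public static List<Object[]> asRows(Object value) {
    if (value == null) {
      return Collections.emptyList();
    }
    if (value instanceof List) {
      return (List<Object[]>) value;
    }
    return Collections.singletonList((Object[]) value);
  }
}
```

A caller that already knows from `isKeysUnique()` which shape to expect could cast directly and skip the `instanceof` check; the helper only shows that both shapes can be handled without an unconditional cast to `Object[]`.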



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

