yashmayya commented on code in PR #14972:
URL: https://github.com/apache/pinot/pull/14972#discussion_r1971010755

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/LookupTable.java:
##########
@@ -0,0 +1,100 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+public abstract class LookupTable {
+  // TODO: Make it configurable
+  protected static final int INITIAL_CAPACITY = 10000;
+
+  protected boolean _keysUnique = true;
+
+  /**
+   * Adds a row to the lookup table.
+   */
+  public abstract void addRow(Object key, Object[] row);
+
+  @SuppressWarnings("unchecked")
+  protected Object calculateValue(Object[] row, @Nullable Object currentValue) {

Review Comment:
   nit: `computeNewValue` might be a more descriptive name for this method?

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/HashJoinOperator.java:
##########
@@ -123,69 +145,99 @@ protected List<Object[]> buildJoinedRows(TransferableBlock leftBlock)
       case ANTI:
         return buildJoinedDataBlockAnti(leftBlock);
       default: { // INNER, LEFT, RIGHT, FULL
-        return buildJoinedDataBlockDefault(leftBlock);
+        if (_rightTable.isKeysUnique()) {

Review Comment:
   I've manually validated the changes here and the logic looks good but I think it would be nice to have more unit tests covering this at a more granular level.

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/FloatLookupTable.java:
##########
@@ -0,0 +1,65 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import it.unimi.dsi.fastutil.floats.Float2ObjectMap;
+import it.unimi.dsi.fastutil.floats.Float2ObjectOpenHashMap;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+/**
+ * The {@code FloatLookupTable} is a lookup table for float keys.
+ */
+@SuppressWarnings("unchecked")
+public class FloatLookupTable extends LookupTable {
+  private final Float2ObjectOpenHashMap<Object> _lookupTable = new Float2ObjectOpenHashMap<>(INITIAL_CAPACITY);
+
+  @Override
+  public void addRow(Object key, Object[] row) {
+    _lookupTable.compute((float) key, (k, v) -> calculateValue(row, v));
+  }
+
+  @Override
+  public void finish() {
+    if (!_keysUnique) {
+      for (Float2ObjectMap.Entry<Object> entry : _lookupTable.float2ObjectEntrySet()) {
+        convertValueToList(entry);
+      }
+    }
+  }
+
+  @Override
+  public boolean containsKey(Object key) {
+    return _lookupTable.containsKey((float) key);
+  }
+
+  @Nullable
+  @Override
+  public Object[] lookup(Object key) {
+    return (Object[]) _lookupTable.get((float) key);

Review Comment:
   Same question as above. Looks like the `INT` / `LONG` / `OBJECT` versions of the lookup table don't have this issue. Am I missing something here? Even if the key is a `FLOAT` / `DOUBLE` type there could be more than one row having the same key right?

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/LookupTable.java:
##########
@@ -0,0 +1,100 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+public abstract class LookupTable {
+  // TODO: Make it configurable
+  protected static final int INITIAL_CAPACITY = 10000;
+
+  protected boolean _keysUnique = true;
+
+  /**
+   * Adds a row to the lookup table.
+   */
+  public abstract void addRow(Object key, Object[] row);
+
+  @SuppressWarnings("unchecked")
+  protected Object calculateValue(Object[] row, @Nullable Object currentValue) {
+    if (currentValue == null) {
+      return row;
+    } else {
+      _keysUnique = false;
+      if (currentValue instanceof List) {
+        ((List<Object[]>) currentValue).add(row);
+        return currentValue;
+      } else {
+        List<Object[]> rows = new ArrayList<>();
+        rows.add((Object[]) currentValue);
+        rows.add(row);
+        return rows;
+      }
+    }
+  }
+
+  /**
+   * Finishes adding rows to the lookup table. This method should be called after all rows are added to the lookup
+   * table, and before looking up rows.
+   */
+  public abstract void finish();
+
+  protected static void convertValueToList(Map.Entry<?, Object> entry) {
+    Object value = entry.getValue();
+    if (value instanceof Object[]) {
+      entry.setValue(Collections.singletonList(value));
+    }
+  }
+
+  /**
+   * Returns {@code true} when all the keys added to the lookup table are unique.
+   * When all keys are unique, the value of the lookup table is a single row ({@code Object[]}). When keys are not
+   * unique, the value of the lookup table is a list of rows ({@code List<Object[]>}).
+   */
+  public boolean isKeysUnique() {

Review Comment:
   ```suggestion
     public boolean areKeysUnique() {
   ```
   nit

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/HashJoinOperator.java:
##########
@@ -123,69 +145,99 @@ protected List<Object[]> buildJoinedRows(TransferableBlock leftBlock)
       case ANTI:
         return buildJoinedDataBlockAnti(leftBlock);
       default: { // INNER, LEFT, RIGHT, FULL
-        return buildJoinedDataBlockDefault(leftBlock);
+        if (_rightTable.isKeysUnique()) {

Review Comment:
   Actually on second thought, I think the tests in `HashJoinOperatorTest` are already covering many of these scenarios.

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/join/DoubleLookupTable.java:
##########
@@ -0,0 +1,65 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator.join;
+
+import it.unimi.dsi.fastutil.doubles.Double2ObjectMap;
+import it.unimi.dsi.fastutil.doubles.Double2ObjectOpenHashMap;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+
+/**
+ * The {@code DoubleLookupTable} is a lookup table for double keys.
+ */
+@SuppressWarnings("unchecked")
+public class DoubleLookupTable extends LookupTable {
+  private final Double2ObjectOpenHashMap<Object> _lookupTable = new Double2ObjectOpenHashMap<>(INITIAL_CAPACITY);
+
+  @Override
+  public void addRow(Object key, Object[] row) {
+    _lookupTable.compute((double) key, (k, v) -> calculateValue(row, v));
+  }
+
+  @Override
+  public void finish() {
+    if (!_keysUnique) {
+      for (Double2ObjectMap.Entry<Object> entry : _lookupTable.double2ObjectEntrySet()) {
+        convertValueToList(entry);
+      }
+    }
+  }
+
+  @Override
+  public boolean containsKey(Object key) {
+    return _lookupTable.containsKey((double) key);
+  }
+
+  @Nullable
+  @Override
+  public Object[] lookup(Object key) {
+    return (Object[]) _lookupTable.get((double) key);

Review Comment:
   Why are we casting to `Object[]` and returning `Object[]` here? The value could be a `List<Object[]>` if `_keysUnique` is `false` right?
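
To make the concern about the `(Object[])` cast concrete, here is a minimal, self-contained sketch. It is not the PR's code: `LookupValueSketch` is a hypothetical stand-in that uses a plain `java.util.HashMap` instead of fastutil's `Double2ObjectOpenHashMap`, and `lookupRaw` / `lookupAsSingleRow` are illustrative names. It mirrors the `calculateValue` / `finish` logic quoted above and suggests that an unconditional cast to `Object[]` would fail once any key repeats.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the lookup tables in the diff; not Pinot's API.
final class LookupValueSketch {
  private final Map<Double, Object> _table = new HashMap<>();
  private boolean _keysUnique = true;

  // Same shape as calculateValue in the diff: a bare row for the first hit,
  // a List<Object[]> once the key has been seen more than once.
  @SuppressWarnings("unchecked")
  void addRow(double key, Object[] row) {
    _table.compute(key, (k, current) -> {
      if (current == null) {
        return row;
      }
      _keysUnique = false;
      if (current instanceof List) {
        ((List<Object[]>) current).add(row);
        return current;
      }
      List<Object[]> rows = new ArrayList<>();
      rows.add((Object[]) current);
      rows.add(row);
      return rows;
    });
  }

  // Same idea as finish() in the diff: once any key is duplicated,
  // every remaining bare row is wrapped in a singleton list.
  void finish() {
    if (!_keysUnique) {
      _table.replaceAll((k, v) -> v instanceof Object[] ? Collections.singletonList(v) : v);
    }
  }

  boolean areKeysUnique() {
    return _keysUnique;
  }

  // Returns the stored value as-is; the caller branches on areKeysUnique().
  Object lookupRaw(double key) {
    return _table.get(key);
  }

  // Unconditional cast, as in the quoted lookup(); fails for list-valued entries.
  Object[] lookupAsSingleRow(double key) {
    return (Object[]) _table.get(key);
  }

  public static void main(String[] args) {
    LookupValueSketch table = new LookupValueSketch();
    table.addRow(1.5, new Object[]{"a"});
    table.addRow(1.5, new Object[]{"b"});
    table.addRow(2.5, new Object[]{"c"});
    table.finish();
    try {
      table.lookupAsSingleRow(1.5);
    } catch (ClassCastException e) {
      System.out.println("value is a List<Object[]>, not an Object[]");
    }
  }
}
```

One possible way out, under the same assumptions, is for `lookup` to return `Object` and leave the cast to the caller, which already knows from `isKeysUnique()` whether to expect a single row or a list.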