mayankshriv commented on a change in pull request #5625:
URL: https://github.com/apache/incubator-pinot/pull/5625#discussion_r446677559



##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/ColumnValueReader.java
##########
@@ -0,0 +1,248 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.common;
+
+import org.apache.pinot.core.io.reader.ForwardIndexReader;
+import org.apache.pinot.core.io.reader.ReaderContext;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+
+
+/**
+ * ColumnValueReader is a wrapper class on top of the forward index reader to 
read values from it.
+ * <p>Batch value read APIs support the following data type conversions for 
numeric types:
+ * <ul>
+ *   <li>INT -> LONG</li>
+ *   <li>INT, LONG -> FLOAT</li>
+ *   <li>INT, LONG, FLOAT -> DOUBLE</li>
+ * </ul>
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public final class ColumnValueReader {
+  private final ForwardIndexReader _forwardIndexReader;

Review comment:
       Should this extend forwardIndexReader, as it seems to be pass-through 
for several api's, and providing additional functionality?

##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/ColumnValueReader.java
##########
@@ -0,0 +1,248 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.common;
+
+import org.apache.pinot.core.io.reader.ForwardIndexReader;
+import org.apache.pinot.core.io.reader.ReaderContext;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+
+
+/**
+ * ColumnValueReader is a wrapper class on top of the forward index reader to 
read values from it.
+ * <p>Batch value read APIs support the following data type conversions for 
numeric types:
+ * <ul>
+ *   <li>INT -> LONG</li>
+ *   <li>INT, LONG -> FLOAT</li>
+ *   <li>INT, LONG, FLOAT -> DOUBLE</li>
+ * </ul>
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public final class ColumnValueReader {
+  private final ForwardIndexReader _forwardIndexReader;
+  private final DataType _valueType;
+  private final ReaderContext _readerContext;
+
+  public ColumnValueReader(ForwardIndexReader forwardIndexReader) {
+    _forwardIndexReader = forwardIndexReader;
+    _valueType = forwardIndexReader.getValueType();
+    // TODO: Figure out a way to close the reader context. Currently it can 
cause direct memory leak.

Review comment:
       This should not be a TODO, but fixed in this PR itself? Or are you 
saying it happens in some corner case? If so, please describe when that might 
happen.

##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/io/readerwriter/ForwardIndexReaderWriter.java
##########
@@ -16,11 +16,29 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-package org.apache.pinot.core.io.reader.impl;
+package org.apache.pinot.core.io.readerwriter;
 
+import org.apache.pinot.core.io.reader.ForwardIndexReader;
 import org.apache.pinot.core.io.reader.ReaderContext;
+import org.apache.pinot.core.io.writer.ForwardIndexWriter;
 
 
-public class UnSortedValueReaderContext implements ReaderContext {
+/**
+ * Interface for forward index reader-writer.
+ */
+public interface ForwardIndexReaderWriter extends 
ForwardIndexReader<ReaderContext>, ForwardIndexWriter {

Review comment:
       I am confused by this change:
   * Why is this diff showing as move of an unrelated class?
   * The name of the interface is `ForwardIndexReaderWriter`, but it has no 
read or write api's?
   

##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/ColumnValueReader.java
##########
@@ -0,0 +1,248 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.common;
+
+import org.apache.pinot.core.io.reader.ForwardIndexReader;
+import org.apache.pinot.core.io.reader.ReaderContext;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+
+
+/**
+ * ColumnValueReader is a wrapper class on top of the forward index reader to 
read values from it.
+ * <p>Batch value read APIs support the following data type conversions for 
numeric types:
+ * <ul>
+ *   <li>INT -> LONG</li>
+ *   <li>INT, LONG -> FLOAT</li>
+ *   <li>INT, LONG, FLOAT -> DOUBLE</li>
+ * </ul>
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public final class ColumnValueReader {
+  private final ForwardIndexReader _forwardIndexReader;
+  private final DataType _valueType;
+  private final ReaderContext _readerContext;
+
+  public ColumnValueReader(ForwardIndexReader forwardIndexReader) {
+    _forwardIndexReader = forwardIndexReader;
+    _valueType = forwardIndexReader.getValueType();
+    // TODO: Figure out a way to close the reader context. Currently it can 
cause direct memory leak.
+    _readerContext = forwardIndexReader.createContext();
+  }
+
+  /**
+   * Returns the data type of the values in the value reader.
+   * <p>NOTE: Dictionary id is handled as INT type.
+   */
+  public DataType getValueType() {
+    return _valueType;
+  }
+
+  /**
+   * Returns {@code true} if the value reader is for a single-value column, 
{@code false} otherwise.
+   */
+  public boolean isSingleValue() {
+    return _forwardIndexReader.isSingleValue();
+  }
+
+  /**
+   * NOTE: The following single value read APIs do not handle the data type 
conversion for performance concern. Caller
+   *       should always call the API that matches the value type.
+   */
+
+  /**
+   * Returns the INT type single-value at the given document id.
+   * <p>NOTE: Dictionary id is handled as INT type.
+   */
+  public int getIntValue(int docId) {
+    return _forwardIndexReader.getInt(docId, _readerContext);
+  }
+
+  /**
+   * Returns the LONG type single-value at the given document id.
+   */
+  public long getLongValue(int docId) {
+    return _forwardIndexReader.getLong(docId, _readerContext);
+  }
+
+  /**
+   * Returns the FLOAT type single-value at the given document id.
+   */
+  public float getFloatValue(int docId) {
+    return _forwardIndexReader.getFloat(docId, _readerContext);
+  }
+
+  /**
+   * Returns the DOUBLE type single-value at the given document id.
+   */
+  public double getDoubleValue(int docId) {
+    return _forwardIndexReader.getDouble(docId, _readerContext);
+  }
+
+  /**
+   * Returns the STRING type single-value at the given document id.
+   */
+  public String getStringValue(int docId) {
+    return _forwardIndexReader.getString(docId, _readerContext);
+  }
+
+  /**
+   * Returns the BYTES type single-value at the given document id.
+   */
+  public byte[] getBytesValue(int docId) {
+    return _forwardIndexReader.getBytes(docId, _readerContext);
+  }
+
+  /**
+   * Reads the INT type multi-value at the given document id into the value 
buffer and returns the number of values in
+   * the multi-value entry.
+   * <p>The passed in value buffer should be large enough to hold all the 
values of a multi-value entry.
+   * <p>NOTE: Dictionary id is handled as INT type.
+   */
+  public int getIntValues(int docId, int[] valueBuffer) {
+    return _forwardIndexReader.getIntArray(docId, valueBuffer, _readerContext);
+  }
+
+  // TODO: Support raw index for multi-value columns
+
+  /**
+   * NOTE: The following batch value read APIs support data type conversions 
for numeric types. Caller can call any
+   *       API regardless of the value type.
+   * TODO: Consider letting the caller handle the data type conversion because 
for different use cases, we might need to
+   *       convert data type differently.
+   */
+
+  /**
+   * Batch reads the INT type single-values at the given document ids of the 
given length into the value buffer.
+   * <p>The passed in value buffer size should be larger than or equal to the 
length.
+   * <p>NOTE: Dictionary id is handled as INT type.
+   */
+  public void getIntValues(int[] docIds, int length, int[] valueBuffer) {
+    if (_valueType == DataType.INT) {
+      _forwardIndexReader.readValues(docIds, length, valueBuffer, 
_readerContext);
+    } else {
+      throw new IllegalStateException(String.format("Cannot read %s as INT", 
_valueType));
+    }
+  }
+
+  /**
+   * Batch reads the LONG type single-values at the given document ids of the 
given length into the value buffer.
+   * <p>The passed in value buffer size should be larger than or equal to the 
length.
+   */
+  public void getLongValues(int[] docIds, int length, long[] valueBuffer) {
+    switch (_valueType) {
+      case INT:

Review comment:
       Does it make sense to read strings as numbers (of course, assuming that 
these are numbers stored as strings)? Or that use case does not make sense?

##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/io/readerwriter/impl/FixedByteMVForwardIndexReaderWriter.java
##########
@@ -221,102 +222,63 @@ private int updateHeader(int row, int numValues) {
   }
 
   @Override
-  public void setCharArray(int row, char[] charArray) {
-    int newStartIndex = updateHeader(row, charArray.length);
-    for (int i = 0; i < charArray.length; i++) {
-      _currentDataWriter.setChar(newStartIndex + i, 0, charArray[i]);
-    }
+  public DataType getValueType() {
+    // NOTE: Dictionary id is handled as INT type.
+    // TODO: Currently we only support dictionary-encoded forward index on 
multi-value columns.
+    return DataType.INT;
   }
 
   @Override
-  public void setShortArray(int row, short[] shortsArray) {
+  public boolean isSingleValue() {
+    return false;
+  }
 
-    int newStartIndex = updateHeader(row, shortsArray.length);
-    for (int i = 0; i < shortsArray.length; i++) {
-      _currentDataWriter.setShort(newStartIndex + i, 0, shortsArray[i]);
-    }
+  @Override
+  public int getLengthOfShortestElement() {
+    return _columnSizeInBytes;
+  }
+
+  @Override
+  public int getLengthOfLongestElement() {
+    return _columnSizeInBytes;
   }
 
   @Override
-  public void setIntArray(int row, int[] intArray) {
-    int newStartIndex = updateHeader(row, intArray.length);
+  public void setIntArray(int docId, int[] intArray) {

Review comment:
       Is the caller still expected to call these api's with incremental docId? 
If so, perhaps get rid of the docId arg, and write at current cursor position?

##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/io/writer/ForwardIndexWriter.java
##########
@@ -0,0 +1,148 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.io.writer;
+
+import java.io.Closeable;
+
+
+/**
+ * Interface for forward index writer.
+ */
+public interface ForwardIndexWriter extends Closeable {
+
+  /**
+   * SINGLE-VALUE COLUMN APIs
+   */
+
+  /**
+   * Writes the INT type single-value into the given document id.
+   * <p>NOTE: Dictionary id is handled as INT type.
+   *
+   * @param docId Document id
+   * @param value Value to write
+   */
+  default void setInt(int docId, int value) {

Review comment:
       If this is a new interface, why provide `default` here (as opposed to an 
abstract base class)?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to