Re: [PR] [DRAFT]Adding support for RawValueBasedInvertedIndex and custom metadata upl… [pinot]

via GitHub Tue, 21 Oct 2025 17:26:59 -0700


Copilot commented on code in PR #17032:
URL: https://github.com/apache/pinot/pull/17032#discussion_r2450036106



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/memory/EmptyIndexBuffer.java:
##########
@@ -0,0 +1,304 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.memory;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.Properties;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+
+
+/**
+ * A specialized PinotDataBuffer implementation for zero-size index entries 
that contains S3 segment path information.
+ * This buffer is useful for debugging and tracking purposes when dealing with 
empty index entries.
+ */
+public class EmptyIndexBuffer extends PinotDataBuffer {
+  private final Properties _properties;
+  private final String _segmentName;
+  private final String _tableNameWithType;
+  private final String _segmentPath;
+
+  /**
+   * Creates a new S3EmptyIndexBuffer for a zero-size index entry

Review Comment:
   Corrected class name reference in Javadoc from 'S3EmptyIndexBuffer' to 
'EmptyIndexBuffer'.
   ```suggestion
      * Creates a new EmptyIndexBuffer for a zero-size index entry
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SingleFileIndexDirectory.java:
##########
@@ -251,11 +263,29 @@ private void loadMap()
 
   private void mapBufferEntries()
       throws IOException {
+    // Phase 1: Prepare data structures - sort entries by start offset
+    // Use list to handle multiple entries with same start offset
+    List<IndexEntry> sortedEntries = _columnEntries.values().stream()
+        .sorted((e1, e2) -> Long.compare(e1._startOffset, e2._startOffset))
+        .collect(java.util.stream.Collectors.toList());

Review Comment:
   The comment mentions handling "multiple entries with same start offset", but 
the sorting comparator doesn't handle this case - entries with the same offset 
will have arbitrary ordering. If this is intentional, the comment should be 
clarified. Additionally, use the statically imported `Collectors.toList()` 
instead of the fully qualified name for consistency with project imports.



##########
pinot-tools/src/main/resources/examples/minions/batch/baseballStats/baseballStats_offline_table_config_raw_inverted_index.json:
##########
@@ -0,0 +1,49 @@
+{
+  "tableName": "baseballStats",
+  "tableType": "OFFLINE",
+  "segmentsConfig": {
+    "segmentPushType": "APPEND",
+    "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
+    "replication": "1"
+  },
+  "tenants": {
+  },
+  "tableIndexConfig": {
+    "loadMode": "HEAP",
+    "invertedIndexColumns": [
+      "teamID"

Review Comment:
   Missing comma between array elements "teamID" and "playerID" will cause JSON 
parsing to fail.
   ```suggestion
         "teamID",
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SingleFileIndexDirectory.java:
##########
@@ -265,15 +295,35 @@ private void mapBufferEntries()
       runningSize += entry._size;
 
       if (runningSize >= MAX_ALLOCATION_SIZE && !offsetAccum.isEmpty()) {
-        mapAndSliceFile(indexStartMap, offsetAccum, offsetEntry.getKey());
+        // Calculate the correct end offset for the previous entries
+        long lastOffset = offsetAccum.get(offsetAccum.size() - 1);
+        IndexEntry lastEntry = indexStartMap.get(lastOffset);
+        long endOffset = lastOffset + lastEntry._size;
+        mapAndSliceFile(indexStartMap, offsetAccum, endOffset);
         runningSize = entry._size;
         offsetAccum.clear();
       }
       offsetAccum.add(offsetEntry.getKey());
     }
 
     if (!offsetAccum.isEmpty()) {
-      mapAndSliceFile(indexStartMap, offsetAccum, offsetAccum.get(0) + 
runningSize);
+      // Calculate the correct end offset: start of last entry + size of last 
entry
+      long lastOffset = offsetAccum.get(offsetAccum.size() - 1);
+      IndexEntry lastEntry = indexStartMap.get(lastOffset);
+      long endOffset = lastOffset + lastEntry._size;
+      mapAndSliceFile(indexStartMap, offsetAccum, endOffset);
+    }
+    // Handle zero-size entries with empty buffer
+    for (IndexEntry entry : sortedEntries) {
+      if (entry._size == 0) {
+        Properties properties = new Properties();
+        if (_segmentDirectoryLoaderContext != null
+            && _segmentDirectoryLoaderContext.getSegmentCustomConfigs() != 
null) {
+          
properties.putAll(_segmentDirectoryLoaderContext.getSegmentCustomConfigs());
+        }
+        entry._buffer = new EmptyIndexBuffer(properties, 
_segmentMetadata.getName(),
+            _segmentMetadata.getTableName());

Review Comment:
   EmptyIndexBuffer constructor expects (properties, segmentName, 
tableNameWithType) but receives (properties, name, tableName). The third 
parameter should be tableNameWithType (which includes the type suffix like 
"_OFFLINE" or "_REALTIME"), not just tableName. Use 
`_segmentMetadata.getTableName()` only if it already includes the type, 
otherwise this should be constructed or retrieved from table config.
   ```suggestion
               _segmentMetadata.getTableNameWithType());
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/invertedindex/InvertedIndexHandler.java:
##########
@@ -109,8 +110,8 @@ public void 
postUpdateIndicesCleanup(SegmentDirectory.Writer segmentWriter)
   }
 
   private boolean shouldCreateInvertedIndex(ColumnMetadata columnMetadata) {
-    // Only create inverted index on dictionary-encoded unsorted columns.
-    return columnMetadata != null && !columnMetadata.isSorted() && 
columnMetadata.hasDictionary();
+    // Only create inverted index on unsorted columns

Review Comment:
   This comment is incomplete and potentially misleading. The original comment 
clarified that inverted indexes are only created on "dictionary-encoded 
unsorted columns". The new comment removes the dictionary requirement but 
doesn't explain the change or the new behavior. It should explain that inverted 
indexes can now be created on both dictionary-encoded and raw-encoded unsorted 
columns.



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/memory/EmptyIndexBuffer.java:
##########
@@ -0,0 +1,304 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.memory;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.Properties;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+
+
+/**
+ * A specialized PinotDataBuffer implementation for zero-size index entries 
that contains S3 segment path information.
+ * This buffer is useful for debugging and tracking purposes when dealing with 
empty index entries.
+ */
+public class EmptyIndexBuffer extends PinotDataBuffer {
+  private final Properties _properties;
+  private final String _segmentName;
+  private final String _tableNameWithType;
+  private final String _segmentPath;
+
+  /**
+   * Creates a new S3EmptyIndexBuffer for a zero-size index entry
+   *
+   * @param properties Properties containing S3 configuration (bucket, key, 
etc.)
+   * @param segmentName The name of the segment
+   * @param tableNameWithType The table name with type
+   */
+  public EmptyIndexBuffer(Properties properties, String segmentName, String 
tableNameWithType) {
+    super(false); // Not closeable since it's just metadata

Review Comment:
   The comment states "Not closeable since it's just metadata", but the class 
extends PinotDataBuffer which has release() and close operations. This comment 
is misleading as the buffer can be released/closed (even if they're no-ops), 
and the 'closeable' parameter refers to whether cleanup is needed, not whether 
methods can be called.
   ```suggestion
       super(false); // 'closeable' is false because no cleanup is needed; 
buffer can still be closed/released (no-op)
   ```



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/RawValueBasedInvertedIndexCreator.java:
##########
@@ -59,4 +59,13 @@ public interface RawValueBasedInvertedIndexCreator extends 
InvertedIndexCreator
    * For multi-value column, adds the double values for the next document.
    */
   void add(double[] values, int length);
+
+  default void add(String value) {
+  }
+  default void add(String[] value, int length) {
+  }
+  default void add(byte[] value) {
+  }

Review Comment:
   These default methods should include Javadoc comments explaining their 
purpose, parameters, and when they should be overridden. The empty 
implementations suggest they're optional, but documentation would clarify the 
expected behavior.
   ```suggestion
   
     /**
      * For single-value column, adds the String value for the next document.
      * <p>
      * Default implementation is a no-op. Override this method if the index 
supports String values.
      *
      * @param value The String value to add for the next document.
      */
     default void add(String value) {
     }
   
     /**
      * For multi-value column, adds the String values for the next document.
      * <p>
      * Default implementation is a no-op. Override this method if the index 
supports String values.
      *
      * @param value The array of String values to add for the next document.
      * @param length The number of values in the array.
      */
     default void add(String[] value, int length) {
     }
   
     /**
      * For single-value column, adds the byte[] value for the next document.
      * <p>
      * Default implementation is a no-op. Override this method if the index 
supports byte[] values.
      *
      * @param value The byte[] value to add for the next document.
      */
     default void add(byte[] value) {
     }
   
     /**
      * For multi-value column, adds the byte[][] values for the next document.
      * <p>
      * Default implementation is a no-op. Override this method if the index 
supports byte[][] values.
      *
      * @param value The array of byte[] values to add for the next document.
      * @param length The number of values in the array.
      */
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/invertedindex/InvertedIndexHandler.java:
##########
@@ -135,35 +136,86 @@ private void 
createInvertedIndexForColumn(SegmentDirectory.Writer segmentWriter,
     // Create new inverted index for the column.
     LOGGER.info("Creating new inverted index for segment: {}, column: {}", 
segmentName, columnName);
     int numDocs = columnMetadata.getTotalDocs();
-
     IndexCreationContext.Common context = IndexCreationContext.builder()
         .withIndexDir(indexDir)
         .withColumnMetadata(columnMetadata)
         .withTableNameWithType(_tableConfig.getTableName())
         .withContinueOnError(_tableConfig.getIngestionConfig() != null
             && _tableConfig.getIngestionConfig().isContinueOnError())
         .build();
-
-    try (DictionaryBasedInvertedIndexCreator creator = 
StandardIndexes.inverted()
-        .createIndexCreator(context, IndexConfig.ENABLED)) {
-      IndexReaderFactory<ForwardIndexReader> readerFactory = 
StandardIndexes.forward().getReaderFactory();
-      try (ForwardIndexReader forwardIndexReader = 
readerFactory.createIndexReader(segmentWriter,
-          _fieldIndexConfigs.get(columnMetadata.getColumnName()), 
columnMetadata);
+    if (columnMetadata.hasDictionary()) {
+      // Dictionary-based inverted index
+
+
+      try (DictionaryBasedInvertedIndexCreator creator = 
StandardIndexes.inverted()
+          .createIndexCreator(context, IndexConfig.ENABLED)) {
+
+        try (
+            ForwardIndexReader forwardIndexReader = StandardIndexes.forward()
+            .getReaderFactory()
+            .createIndexReader(segmentWriter, 
_fieldIndexConfigs.get(columnName), columnMetadata);
+            ForwardIndexReaderContext readerContext = 
forwardIndexReader.createContext()) {
+          if (columnMetadata.isSingleValue()) {
+            // Single-value column.
+            for (int i = 0; i < numDocs; i++) {
+              creator.add(forwardIndexReader.getDictId(i, readerContext));
+            }
+          } else {
+            // Multi-value column.
+            int[] dictIds = new 
int[columnMetadata.getMaxNumberOfMultiValues()];
+            for (int i = 0; i < numDocs; i++) {
+              int length = forwardIndexReader.getDictIdMV(i, dictIds, 
readerContext);
+              creator.add(dictIds, length);
+            }
+          }
+          creator.seal();
+        }
+      }
+    } else {
+      // Raw value based inverted index
+      IndexReaderFactory<ForwardIndexReader> readerFactory = 
StandardIndexes.forward()
+          .getReaderFactory();
+      try (RawValueBitmapInvertedIndexCreator creator = new 
RawValueBitmapInvertedIndexCreator(context);
+          ForwardIndexReader forwardIndexReader = readerFactory
+              .createIndexReader(segmentWriter, 
_fieldIndexConfigs.get(columnName), columnMetadata);
           ForwardIndexReaderContext readerContext = 
forwardIndexReader.createContext()) {
         if (columnMetadata.isSingleValue()) {
-          // Single-value column.
-          for (int i = 0; i < numDocs; i++) {
-            creator.add(forwardIndexReader.getDictId(i, readerContext));
+          // Single-value column
+          switch (columnMetadata.getDataType()) {
+            case INT:
+              for (int i = 0; i < numDocs; i++) {
+                creator.add(forwardIndexReader.getInt(i, readerContext));
+              }
+              break;
+            case LONG:
+              for (int i = 0; i < numDocs; i++) {
+                creator.add(forwardIndexReader.getLong(i, readerContext));
+              }
+              break;
+            case FLOAT:
+              for (int i = 0; i < numDocs; i++) {
+                creator.add(forwardIndexReader.getFloat(i, readerContext));
+              }
+              break;
+            case DOUBLE:
+              for (int i = 0; i < numDocs; i++) {
+                creator.add(forwardIndexReader.getDouble(i, readerContext));
+              }
+              break;
+            case STRING:
+              for (int i = 0; i < numDocs; i++) {
+                creator.add(forwardIndexReader.getString(i, readerContext));
+              }
+              break;
+            default:
+              throw new IllegalStateException(
+                  "Unsupported data type for raw inverted index: " + 
columnMetadata.getDataType());
           }
         } else {
-          // Multi-value column.
-          int[] dictIds = new int[columnMetadata.getMaxNumberOfMultiValues()];
-          for (int i = 0; i < numDocs; i++) {
-            int length = forwardIndexReader.getDictIdMV(i, dictIds, 
readerContext);
-            creator.add(dictIds, length);
-          }
+          // Multi-value columns not supported for raw inverted index
+          throw new IllegalStateException("Raw inverted index not supported 
for multi-value columns: " + columnName);

Review Comment:
   The error message should be more specific about why multi-value columns 
aren't supported for raw inverted indexes and potentially suggest using 
dictionary encoding as an alternative.
   ```suggestion
             throw new IllegalStateException("Raw inverted index is not 
supported for multi-value columns: " + columnName +
                 ". This is because raw inverted indexes require a single value 
per document and do not use a dictionary. " +
                 "Please enable dictionary encoding for this column to use an 
inverted index.");
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/inv/RawValueBitmapInvertedIndexCreator.java:
##########
@@ -0,0 +1,334 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.local.segment.creator.impl.inv;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.RandomAccessFile;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.nio.channels.FileChannel;
+import java.util.TreeMap;
+import javax.annotation.Nullable;
+import org.apache.commons.io.FileUtils;
+import 
org.apache.pinot.segment.local.segment.creator.impl.SegmentDictionaryCreator;
+import org.apache.pinot.segment.spi.V1Constants;
+import org.apache.pinot.segment.spi.creator.IndexCreationContext;
+import 
org.apache.pinot.segment.spi.index.creator.RawValueBasedInvertedIndexCreator;
+import org.apache.pinot.segment.spi.memory.PinotDataBuffer;
+import org.apache.pinot.spi.data.DimensionFieldSpec;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.FieldSpec.DataType;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+/**
+ * Creator for raw value bitmap inverted index.
+ * File format:
+ * <ul>
+ *   <li>Header (44 bytes):</li>
+ *   <ul>
+ *     <li>version (4 bytes)</li>
+ *     <li>cardinality (4 bytes)</li>
+ *     <li>maxLength (4 bytes) - for string/bytes data types</li>
+ *     <li>dictionaryOffset (8 bytes)</li>
+ *     <li>dictionaryLength (8 bytes)</li>
+ *     <li>invertedIndexOffset (8 bytes)</li>
+ *     <li>invertedIndexLength (8 bytes)</li>
+ *   </ul>
+ *   <li>Dictionary data</li>
+ *   <li>Inverted index data</li>
+ * </ul>
+ */
+public class RawValueBitmapInvertedIndexCreator implements 
RawValueBasedInvertedIndexCreator {
+  private static final int VERSION = 1;
+  private static final int HEADER_LENGTH = 44;

Review Comment:
   HEADER_LENGTH is documented as 36 bytes in RawValueBitmapInvertedIndexReader 
(line 49) but defined as 44 bytes here. The actual header structure shown in 
both files consists of: version(4) + cardinality(4) + maxLength(4) + 
dictionaryOffset(8) + dictionaryLength(8) + invertedIndexOffset(8) + 
invertedIndexLength(8) = 44 bytes. The documentation in 
RawValueBitmapInvertedIndexReader should be corrected to 44 bytes.



##########
pinot-common/src/main/java/org/apache/pinot/common/tier/TimeBasedTierSegmentSelector.java:
##########
@@ -47,6 +47,9 @@ public boolean selectSegment(String tableNameWithType, 
SegmentZKMetadata segment
 
     // get segment end time to decide if segment gets selected
     long endTimeMs = segmentZKMetadata.getEndTimeMs();
+    if (endTimeMs < 0) {
+      return System.currentTimeMillis() - segmentZKMetadata.getCreationTime() 
> _segmentAgeMillis;
+    }
     Preconditions.checkState(endTimeMs > 0, "Invalid endTimeMs: %s for 
segment: %s of table: %s", endTimeMs,
         segmentZKMetadata.getSegmentName(), tableNameWithType);

Review Comment:
   The logic at line 50 checks if `endTimeMs < 0` and handles it, but then line 
53 still requires `endTimeMs > 0` with a Preconditions check. This means 
`endTimeMs == 0` would bypass the fallback logic and fail the precondition 
check. The precondition should be removed since the fallback already handles 
invalid endTimeMs values.
   ```suggestion
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [DRAFT]Adding support for RawValueBasedInvertedIndex and custom metadata upl… [pinot]

Reply via email to