gortiz commented on code in PR #10687: URL: https://github.com/apache/pinot/pull/10687#discussion_r1176676880
########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/invertedindex/NativeMutableFSTIndex.java: ########## @@ -23,29 +23,31 @@ import org.apache.pinot.segment.local.utils.nativefst.mutablefst.MutableFST; import org.apache.pinot.segment.local.utils.nativefst.mutablefst.MutableFSTImpl; import org.apache.pinot.segment.local.utils.nativefst.utils.RealTimeRegexpMatcher; -import org.apache.pinot.segment.spi.index.mutable.MutableTextIndex; +import org.apache.pinot.segment.spi.index.reader.TextIndexReader; import org.roaringbitmap.RoaringBitmapWriter; import org.roaringbitmap.buffer.ImmutableRoaringBitmap; import org.roaringbitmap.buffer.MutableRoaringBitmap; - -public class NativeMutableFSTIndex implements MutableTextIndex { - private final String _column; +@Deprecated +public class NativeMutableFSTIndex implements TextIndexReader { Review Comment: I consciously didn't implement `MutableIndex` here. We should first try to understand whether it works or not and in case it doesn't, just remove the class ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/invertedindex/NativeMutableFSTIndex.java: ########## @@ -23,29 +23,31 @@ import org.apache.pinot.segment.local.utils.nativefst.mutablefst.MutableFST; import org.apache.pinot.segment.local.utils.nativefst.mutablefst.MutableFSTImpl; import org.apache.pinot.segment.local.utils.nativefst.utils.RealTimeRegexpMatcher; -import org.apache.pinot.segment.spi.index.mutable.MutableTextIndex; +import org.apache.pinot.segment.spi.index.reader.TextIndexReader; import org.roaringbitmap.RoaringBitmapWriter; import org.roaringbitmap.buffer.ImmutableRoaringBitmap; import org.roaringbitmap.buffer.MutableRoaringBitmap; - -public class NativeMutableFSTIndex implements MutableTextIndex { - private final String _column; Review Comment: This attribute wasn't used ########## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/AbstractIndexType.java: ########## @@ -123,4 +123,16 @@ public boolean equals(Object o) { public int hashCode() { return Objects.hash(_id); } + + /** + * Helper method that builds allocation context that includes segment name, column name, and index type. + * + * @param segmentName Name of segment. + * @param columnName Name of column. + * @param indexType Index type. + * @return Allocation context built from segment name, column name and index type. + */ + public static String buildAllocationContext(String segmentName, String columnName, String indexType) { + return segmentName + ":" + columnName + indexType; Review Comment: This is actually only used by forward and dictionary indexes. Maybe we can move to another place. Ideas? ########## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/mutable/MutableIndex.java: ########## @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + * + */ + +package org.apache.pinot.segment.spi.index.mutable; + +import javax.annotation.Nonnull; +import javax.annotation.Nullable; +import org.apache.pinot.segment.spi.index.IndexReader; + + +public interface MutableIndex extends IndexReader { + + /** + * Adds the given single value cell to the index. + * + * Rows will be added in no particular order. + * + * @param value The nonnull value of the cell. In case the cell was actually null, a default value is received instead + * @param dictId An optional dictionary value of the cell. If there is no dictionary, -1 is received + */ + void add(@Nonnull Object value, int dictId, int docId); Review Comment: This an the other method are very similar to the ones in IndexCreator methods. I've used the same design pattern (adaptor) ########## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/mutable/provider/MutableIndexContext.java: ########## @@ -18,36 +18,97 @@ */ package org.apache.pinot.segment.spi.index.mutable.provider; +import java.io.File; import java.util.Objects; +import javax.annotation.Nullable; import org.apache.pinot.segment.spi.memory.PinotDataBufferMemoryManager; -import org.apache.pinot.spi.config.table.JsonIndexConfig; import org.apache.pinot.spi.data.FieldSpec; -public interface MutableIndexContext { - PinotDataBufferMemoryManager getMemoryManager(); +public class MutableIndexContext { Review Comment: Like we did in IndexCreationContext during `index-spi`, the idea here is: - Make the interface a class - Move the logic from Common to the class - Remove Wrapper and all extra flavors. Instead of using the extra flavors, each index type receives this context and their own configuration. I also added 3 extra attributes to the class: - _estimatedColSize - _estimatedCardinality - _avgNumMultiValues As they were used by some indexes and may be useful for future indexes. ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java: ########## @@ -768,115 +679,28 @@ private void addNewRow(int docId, GenericRow row) { } } } - - // Update text index - MutableTextIndex textIndex = indexContainer._textIndex; - if (textIndex != null) { - try { - textIndex.add((String) value); - } catch (Exception e) { - recordIndexingError(FieldConfig.IndexType.TEXT, e); - } - } - - // Update json index - MutableJsonIndex jsonIndex = indexContainer._jsonIndex; - if (jsonIndex != null) { - try { - jsonIndex.add((String) value); - } catch (Exception e) { - recordIndexingError(FieldConfig.IndexType.JSON, e); - } - } - - // Update H3 index - MutableH3Index h3Index = indexContainer._h3Index; - if (h3Index != null) { - try { - h3Index.add(GeometrySerializer.deserialize((byte[]) value)); - } catch (Exception e) { - recordIndexingError(FieldConfig.IndexType.H3, e); - } - } } else { // Multi-value column int[] dictIds = indexContainer._dictIds; - indexContainer._valuesInfo.updateVarByteMVMaxRowLengthInBytes(value, dataType.getStoredType()); - - if (dictIds != null) { - // Dictionary encoded - // Update numValues info - indexContainer._valuesInfo.updateMVNumValues(dictIds.length); - - // Update forward index - indexContainer._forwardIndex.setDictIdMV(docId, dictIds); - - // Update inverted index - MutableInvertedIndex invertedIndex = indexContainer._invertedIndex; - if (invertedIndex != null) { - for (int dictId : dictIds) { - try { - invertedIndex.add(dictId, docId); - } catch (Exception e) { - recordIndexingError(FieldConfig.IndexType.INVERTED, e); - } - } - } - } else { - // Raw MV columns Review Comment: Here the only index that was supported was `forward` ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java: ########## @@ -768,115 +679,28 @@ private void addNewRow(int docId, GenericRow row) { } } } - - // Update text index - MutableTextIndex textIndex = indexContainer._textIndex; - if (textIndex != null) { - try { - textIndex.add((String) value); - } catch (Exception e) { - recordIndexingError(FieldConfig.IndexType.TEXT, e); - } - } - - // Update json index - MutableJsonIndex jsonIndex = indexContainer._jsonIndex; - if (jsonIndex != null) { - try { - jsonIndex.add((String) value); - } catch (Exception e) { - recordIndexingError(FieldConfig.IndexType.JSON, e); - } - } - - // Update H3 index - MutableH3Index h3Index = indexContainer._h3Index; - if (h3Index != null) { - try { - h3Index.add(GeometrySerializer.deserialize((byte[]) value)); - } catch (Exception e) { - recordIndexingError(FieldConfig.IndexType.H3, e); - } - } } else { // Multi-value column int[] dictIds = indexContainer._dictIds; - indexContainer._valuesInfo.updateVarByteMVMaxRowLengthInBytes(value, dataType.getStoredType()); - - if (dictIds != null) { - // Dictionary encoded Review Comment: Here only `forward` and `inverted` where supported ########## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/column/ColumnIndexContainer.java: ########## @@ -58,14 +58,14 @@ public void close() } class FromMap implements ColumnIndexContainer { - private final Map<IndexType, IndexReader> _readersByIndex; + private final Map<IndexType, ? extends IndexReader> _readersByIndex; /** * @param readersByIndex it is assumed that each index is associated with a compatible reader, but there is no check * to verify that. It is recommended to construct instances of this class by using * {@link FromMap.Builder} */ - public FromMap(Map<IndexType, IndexReader> readersByIndex) { + public FromMap(Map<IndexType, ? extends IndexReader> readersByIndex) { Review Comment: This is needed in order to also support a `Map` whose values are `MutableIndex`es ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/datasource/MutableDataSource.java: ########## @@ -44,24 +40,14 @@ public class MutableDataSource extends BaseDataSource { public MutableDataSource(FieldSpec fieldSpec, int numDocs, int numValues, int maxNumValuesPerMVEntry, int cardinality, @Nullable PartitionFunction partitionFunction, @Nullable Set<Integer> partitions, @Nullable Comparable minValue, - @Nullable Comparable maxValue, ForwardIndexReader forwardIndex, @Nullable Dictionary dictionary, - @Nullable InvertedIndexReader invertedIndex, @Nullable RangeIndexReader rangeIndex, - @Nullable TextIndexReader textIndex, @Nullable TextIndexReader fstIndex, @Nullable JsonIndexReader jsonIndex, - @Nullable H3IndexReader h3Index, @Nullable BloomFilterReader bloomFilter, - @Nullable NullValueVectorReader nullValueVector, int maxRowLengthInBytes) { + @Nullable Comparable maxValue, Map<IndexType, MutableIndex> mutableIndexes, + @Nullable MutableDictionary dictionary, @Nullable MutableNullValueVector nullValueVector, Review Comment: Remeber that MutableDictionary and MutableNullValueVector are not MutableIndexes, therefore cannot be included in `mutableIndexes` ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/forward/ForwardIndexType.java: ########## @@ -248,4 +260,61 @@ public static ForwardIndexReader<?> read(SegmentDirectory.Reader segmentReader, throws IndexReaderConstraintException, IOException { return StandardIndexes.forward().getReaderFactory().createIndexReader(segmentReader, fieldIndexConfigs, metadata); } + + @Nullable + @Override + public MutableIndex createMutableIndex(MutableIndexContext context, ForwardIndexConfig config) { + if (config.isDisabled()) { Review Comment: All this code was copied from the older `MutableForwardIndexProvider` ########## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/mutable/MutableForwardIndex.java: ########## @@ -27,7 +29,113 @@ * Interface for mutable forward index (for CONSUMING segment). * NOTE: Mutable forward index does not use reader context to accelerate the reads. */ -public interface MutableForwardIndex extends ForwardIndexReader<ForwardIndexReaderContext> { +public interface MutableForwardIndex extends ForwardIndexReader<ForwardIndexReaderContext>, MutableIndex { + + @Override + default void add(@Nonnull Object value, int dictId, int docId) { + if (dictId >= 0) { + setDictId(docId, dictId); + } else { + switch (getStoredType()) { + case INT: + setInt(docId, (int) value); + break; + case LONG: + setLong(docId, (long) value); + break; + case FLOAT: + setFloat(docId, (float) value); + break; + case DOUBLE: + setDouble(docId, (double) value); + break; + case BIG_DECIMAL: + setBigDecimal(docId, (BigDecimal) value); + break; + case STRING: + setString(docId, (String) value); + break; + case BYTES: + setBytes(docId, (byte[]) value); + break; + case JSON: + if (value instanceof String) { + setString(docId, (String) value); + } else if (value instanceof byte[]) { + setBytes(docId, (byte[]) value); + } + break; + default: + throw new IllegalStateException(); + } + } + } + + @Override + default void add(@Nonnull Object[] value, @Nullable int[] dictIds, int docId) { + if (dictIds != null) { + setDictIdMV(docId, dictIds); + } else { + int length = value.length; + // TODO: The new int[], long[], etc objects are not actually used as arrays iterated again, so we could skip the Review Comment: This comment is copied from `ForwardIndexCreator`, but also applies here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org