gortiz commented on code in PR #10687:
URL: https://github.com/apache/pinot/pull/10687#discussion_r1188308217


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java:
##########
@@ -287,111 +282,67 @@ public boolean isMutableSegment() {
         partitions.add(config.getPartitionId());
       }
 
+      // TODO (mutable-index-spi): The comment above was here, but no check 
was done.
+      //  It seems the code that apply that check was removed around 2020. 
Should we remove the comment?
       // Check whether to generate raw index for the column while consuming
       // Only support generating raw index on single-value columns that do not 
have inverted index while
       // consuming. After consumption completes and the segment is built, all 
single-value columns can have raw index
-      DataType storedType = fieldSpec.getDataType().getStoredType();
-      boolean isFixedWidthColumn = storedType.isFixedWidth();
-      MutableIndexProvider indexProvider = 
IndexingOverrides.getMutableIndexProvider();
-      MutableForwardIndex forwardIndex = 
indexProvider.newForwardIndex(context.forForwardIndex(avgNumMultiValues));
 
       // Dictionary-encoded column
-      MutableDictionary dictionary = null;
+      MutableDictionary dictionary;
       if (isDictionary) {
-        int dictionaryColumnSize =
-            isFixedWidthColumn ? storedType.size() : 
_statsHistory.getEstimatedAvgColSize(column);
-        // NOTE: preserve 10% buffer for cardinality to reduce the chance of 
re-sizing the dictionary
-        int estimatedCardinality = (int) 
(_statsHistory.getEstimatedCardinality(column) * 1.1);
-        dictionary = 
indexProvider.newDictionary(context.forDictionary(dictionaryColumnSize, 
estimatedCardinality));
-        // Even though the column is defined as 'no-dictionary' in the config, 
we did create dictionary for consuming
-        // segment.
-        noDictionaryColumns.remove(column);
-      }
-
-      // Inverted index
-      MutableInvertedIndex invertedIndexReader =
-          invertedIndexColumns.contains(column) ? 
indexProvider.newInvertedIndex(context.forInvertedIndex()) : null;
-
-      MutableTextIndex fstIndex = null;
-      // FST Index
-      if (_fieldConfigList != null && fstIndexColumns.contains(column)) {
-        for (FieldConfig fieldConfig : _fieldConfigList) {
-          if (fieldConfig.getName().equals(column)) {
-            Map<String, String> properties = fieldConfig.getProperties();
-            if (TextIndexUtils.isFstTypeNative(properties)) {
-              fstIndex = new NativeMutableFSTIndex(column);
-            }
-          }
+        DictionaryIndexConfig dictionaryIndexConfig = 
indexConfigs.getConfig(StandardIndexes.dictionary());
+        if (dictionaryIndexConfig.isDisabled()) {
+          dictionaryIndexConfig = DictionaryIndexConfig.DEFAULT;

Review Comment:
   Yeah, I know it is strange, but that is how Pinot works. Users can 
explicitly say _I don't want to use a dictionary_ but the dictionary is created 
anyway as a side effect. In this case, isDictionary is defined as:
   
   ```java
         boolean isDictionary = !isNoDictionaryColumn(indexConfigs, fieldSpec, 
column);
   ``` 
   
   Which basically means that it will be true if:
   - Either the dictionary is enabled in the config
   - Or the dictionary is disabled in the config but any of the following is 
true:
      - `isAggregateMetricsEnabled` is true and dataType is string or bytes
      - The column is multi valuated
      - The column is variable length
      - The column has an inverted index
   
   
   It would be better to fail in that case if dictionary is disabled, but that 
is not how Pinot behave and we should not break backward compatibility



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to