Jackie-Jiang commented on code in PR #8601:
URL: https://github.com/apache/pinot/pull/8601#discussion_r875258083


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##########
@@ -134,6 +139,20 @@ public void init(SegmentGeneratorConfig segmentCreationSpec, SegmentIndexCreatio
       invertedIndexColumns.add(columnName);
     }
 
+    Set<String> bloomFilterColumns = new HashSet<>();
+    for (String columnName : _config.getBloomFilterCreationColumns()) {
+      Preconditions.checkState(schema.hasColumn(columnName),
+          "Cannot create bloom filter for column: %s because it is not in schema", columnName);
+      bloomFilterColumns.add(columnName);
+    }
+
+    Set<String> rangeIndexColumns = new HashSet<>();
+    for (String columnName : _config.getRangeIndexCreationColumns()) {
+      Preconditions.checkState(schema.hasColumn(columnName),
+          "Cannot create bloom filter for column: %s because it is not in schema", columnName);

Review Comment:
   Update the error message: this loop validates range index columns, but the 
message still says "bloom filter".
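
One way to avoid this kind of copy-paste slip is to pass the index type into a shared check. The following is only a sketch: `IndexColumnValidation`, `validatedColumns`, and the plain `Set<String>` standing in for the schema are hypothetical names for illustration, not part of the PR or of Pinot.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IndexColumnValidation {
  // Hypothetical helper: collects the configured columns for one index type and
  // fails with a message that names that index type.
  static Set<String> validatedColumns(Set<String> schemaColumns, List<String> configuredColumns,
      String indexType) {
    Set<String> result = new HashSet<>();
    for (String columnName : configuredColumns) {
      if (!schemaColumns.contains(columnName)) {
        throw new IllegalStateException(String.format(
            "Cannot create %s for column: %s because it is not in schema", indexType, columnName));
      }
      result.add(columnName);
    }
    return result;
  }

  public static void main(String[] args) {
    // Assumed sample data; "missingCol" is deliberately absent from the schema.
    Set<String> schemaColumns = new HashSet<>(Arrays.asList("colA", "colB"));
    System.out.println(validatedColumns(schemaColumns, Arrays.asList("colA"), "bloom filter"));
    try {
      validatedColumns(schemaColumns, Arrays.asList("missingCol"), "range index");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With the index type as a parameter, the bloom filter and range index loops share one message template, so the two messages cannot drift apart.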



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##########
@@ -89,6 +92,8 @@ public class SegmentColumnarIndexCreator implements SegmentCreator {
   private SegmentGeneratorConfig _config;
   private Map<String, ColumnIndexCreationInfo> _indexCreationInfoMap;
   private final IndexCreatorProvider _indexCreatorProvider = IndexingOverrides.getIndexCreatorProvider();
+  private final Map<String, BloomFilterCreator> _bloomFilterCreatorMap = new HashMap<>();
+  private final Map<String, CombinedInvertedIndexCreator> _rangeIndexFilterCreatorMap = new HashMap<>();

Review Comment:
   (minor) Let's reorder the variables a bit for readability. Suggest putting 
them between `invertedIndex` and `textIndex`, and keeping the same order in 
the handling logic.



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/SegmentGeneratorConfig.java:
##########
@@ -120,6 +120,18 @@ public enum TimeColumnType {
 
   private SegmentZKPropsConfig _segmentZKPropsConfig;
 
+  private final List<String> _bloomFilterCreationColumns = new ArrayList<>();

Review Comment:
   (minor) Suggest moving these 2 lists between `_invertedIndexCreationColumns` 
and `_textIndexCreationColumns`, and moving their getters next to the other 
getters.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##########
@@ -346,6 +384,37 @@ public void indexRow(GenericRow row)
       //get dictionaryCreator, will be null if column is not dictionaryEncoded
       SegmentDictionaryCreator dictionaryCreator = _dictionaryCreatorMap.get(columnName);
 
+      // bloom filter
+      BloomFilterCreator bloomFilterCreator = _bloomFilterCreatorMap.get(columnName);
+      if (bloomFilterCreator != null) {
+        bloomFilterCreator.add((String) columnValueToIndex);

Review Comment:
   (MAJOR)
   ```suggestion
           bloomFilterCreator.add(columnValueToIndex.toString());
   ```
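
To show why the cast is a problem, here is a minimal standalone sketch. `BloomFilterValueDemo`, `viaCast`, and `viaToString` are hypothetical names that only mimic the conversion done before `bloomFilterCreator.add(...)`. They assume the value arrives as an `Object` that may be a non-`String` type such as `Integer`.

```java
import java.util.Arrays;
import java.util.List;

public class BloomFilterValueDemo {
  // Mirrors the suggested change: works for any non-null value type.
  static String viaToString(Object columnValueToIndex) {
    return columnValueToIndex.toString();
  }

  // Mirrors the original line: throws ClassCastException for non-String values.
  static String viaCast(Object columnValueToIndex) {
    return (String) columnValueToIndex;
  }

  public static void main(String[] args) {
    // Assumed sample values: one String column value and two numeric ones.
    List<Object> values = Arrays.asList("abc", 42, 3.5d);
    for (Object value : values) {
      System.out.println("toString: " + viaToString(value));
      try {
        viaCast(value);
      } catch (ClassCastException e) {
        System.out.println("cast failed for " + value.getClass().getSimpleName());
      }
    }
  }
}
```

The cast only succeeds when the runtime type is already `String`, so a bloom filter on a numeric column would fail during `indexRow`.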



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##########
@@ -162,6 +181,9 @@ public void init(SegmentGeneratorConfig segmentCreationSpec, SegmentIndexCreatio
     }
 
     // Initialize creators for dictionary, forward index and inverted index
+    IndexingConfig indexingConfig = _config.getTableConfig().getIndexingConfig();
+    int rangeIndexVersion =

Review Comment:
   (minor) IndexingConfig can never be null
   ```suggestion
       int rangeIndexVersion = _config.getTableConfig().getIndexingConfig().getRangeIndexVersion();
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##########
@@ -215,6 +237,22 @@ public void init(SegmentGeneratorConfig segmentCreationSpec, SegmentIndexCreatio
               dictionaryCreator.getNumBytesPerEntry());
           throw e;
         }
+
+        if (bloomFilterColumns.contains(columnName)) {

Review Comment:
   (MAJOR) A bloom filter can be applied to both dictionary-encoded and raw 
index columns, so it should not be created only in the dictionary-encoded path.
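
A rough sketch of the intended control flow, with the bloom filter check placed after the dictionary/raw branch rather than inside it. `BloomFilterPlacementSketch`, `planCreators`, and the string labels are hypothetical and only model which creators a column would get. The real creator setup in `SegmentColumnarIndexCreator` is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BloomFilterPlacementSketch {
  // Returns, per column, the (hypothetical) list of creators that would be set up.
  static Map<String, List<String>> planCreators(List<String> columns, Set<String> dictionaryColumns,
      Set<String> bloomFilterColumns) {
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String columnName : columns) {
      List<String> creators = new ArrayList<>();
      if (dictionaryColumns.contains(columnName)) {
        creators.add("dictionary");
        creators.add("dictionary-encoded forward index");
      } else {
        creators.add("raw forward index");
      }
      // Outside the branch above, so it applies to both kinds of column.
      if (bloomFilterColumns.contains(columnName)) {
        creators.add("bloom filter");
      }
      plan.put(columnName, creators);
    }
    return plan;
  }

  public static void main(String[] args) {
    // Assumed sample config: "dictCol" is dictionary-encoded, "rawCol" is not.
    Map<String, List<String>> plan = planCreators(
        Arrays.asList("dictCol", "rawCol"),
        new HashSet<>(Arrays.asList("dictCol")),
        new HashSet<>(Arrays.asList("dictCol", "rawCol")));
    System.out.println(plan);
  }
}
```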



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

