[GitHub] [pinot] richardstartin commented on a change in pull request #8411: accelerate counts over filters

GitBox Tue, 29 Mar 2022 12:46:09 -0700


richardstartin commented on a change in pull request #8411:
URL: https://github.com/apache/pinot/pull/8411#discussion_r837851321




##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/SortedIndexBasedFilterOperator.java
##########
@@ -132,6 +133,96 @@ protected FilterBlock getNextBlock() {
     }
   }
 
+  @Override
+  public boolean canOptimizeCount() {
+    return true;
+  }
+
+  @Override
+  public int getNumMatchingDocs() {
+    int count = 0;
+    boolean exclusive = _predicateEvaluator.isExclusive();
+    if (_predicateEvaluator instanceof SortedDictionaryBasedRangePredicateEvaluator) {
+      // For RANGE predicate, use start/end document id to construct a new document id range
+      SortedDictionaryBasedRangePredicateEvaluator rangePredicateEvaluator =
+          (SortedDictionaryBasedRangePredicateEvaluator) _predicateEvaluator;
+      int startDocId = _sortedIndexReader.getDocIds(rangePredicateEvaluator.getStartDictId()).getLeft();
+      // NOTE: End dictionary id is exclusive in OfflineDictionaryBasedRangePredicateEvaluator.
+      int endDocId = _sortedIndexReader.getDocIds(rangePredicateEvaluator.getEndDictId() - 1).getRight();
+      count = endDocId - startDocId + 1;
+    } else {
+      int[] dictIds =
+          exclusive ? _predicateEvaluator.getNonMatchingDictIds() : _predicateEvaluator.getMatchingDictIds();
+      int numDictIds = dictIds.length;
+      // NOTE: PredicateEvaluator without matching/non-matching dictionary ids should not reach here.
+      Preconditions.checkState(numDictIds > 0);
+      if (numDictIds == 1) {
+        IntPair docIdRange = _sortedIndexReader.getDocIds(dictIds[0]);
+        count = docIdRange.getRight() - docIdRange.getLeft() + 1;
+      } else {
+        // Sort the dictIds in ascending order so that their respective docIdRanges are adjacent if they are adjacent
+        Arrays.sort(dictIds);
+        IntPair lastDocIdRange = _sortedIndexReader.getDocIds(dictIds[0]);
+        for (int i = 1; i < numDictIds; i++) {
+          IntPair docIdRange = _sortedIndexReader.getDocIds(dictIds[i]);
+          if (docIdRange.getLeft() == lastDocIdRange.getRight() + 1) {
+            lastDocIdRange.setRight(docIdRange.getRight());
+          } else {
+            count += lastDocIdRange.getRight() - lastDocIdRange.getLeft() + 1;
+            lastDocIdRange = docIdRange;
+          }
+        }
+        count += lastDocIdRange.getRight() - lastDocIdRange.getLeft() + 1;
+      }
+    }
+    return exclusive ? _numDocs - count : count;
+  }
+
+  @Override
+  public boolean canProduceBitmaps() {
+    return true;
+  }
+
+  @Override
+  public BitmapCollection getBitmaps() {
+    MutableRoaringBitmap bitmap = new MutableRoaringBitmap();
+    boolean exclusive = _predicateEvaluator.isExclusive();
+    if (_predicateEvaluator instanceof SortedDictionaryBasedRangePredicateEvaluator) {
+      // For RANGE predicate, use start/end document id to construct a new document id range
+      SortedDictionaryBasedRangePredicateEvaluator rangePredicateEvaluator =
+          (SortedDictionaryBasedRangePredicateEvaluator) _predicateEvaluator;
+      int startDocId = _sortedIndexReader.getDocIds(rangePredicateEvaluator.getStartDictId()).getLeft();
+      // NOTE: End dictionary id is exclusive in OfflineDictionaryBasedRangePredicateEvaluator.
+      int endDocId = _sortedIndexReader.getDocIds(rangePredicateEvaluator.getEndDictId() - 1).getRight();
+      bitmap.add(startDocId, endDocId + 1L);
+    } else {
+      int[] dictIds =
+          exclusive ? _predicateEvaluator.getNonMatchingDictIds() : _predicateEvaluator.getMatchingDictIds();
+      int numDictIds = dictIds.length;
+      // NOTE: PredicateEvaluator without matching/non-matching dictionary ids should not reach here.
+      Preconditions.checkState(numDictIds > 0);
+      if (numDictIds == 1) {
+        IntPair docIdRange = _sortedIndexReader.getDocIds(dictIds[0]);
+        bitmap.add(docIdRange.getLeft(), docIdRange.getRight() + 1L);
+      } else {
+        // Sort the dictIds in ascending order so that their respective docIdRanges are adjacent if they are adjacent

Review comment:
       No idea, the code was adapted from above. The sort can probably be removed.
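
A minimal sketch of why the sort is likely unnecessary, assuming that each dictionary id maps to its own inclusive doc id range and that ranges of distinct dictionary ids never overlap (which is what a sorted index implies). The class name, the `docIdRanges` array, and both helper methods are hypothetical, and `java.util.BitSet` stands in for `MutableRoaringBitmap` so the example has no external dependencies. It is not the code from the pull request.

```java
import java.util.BitSet;

public class SortedRangeSketch {

  // Hypothetical stand-in for the sorted index reader:
  // docIdRanges[dictId] = {firstDocId, lastDocId}, both inclusive.
  static int countMatching(int[][] docIdRanges, int[] dictIds) {
    int count = 0;
    for (int dictId : dictIds) {
      int[] range = docIdRanges[dictId];
      // Ranges are assumed disjoint, so their lengths can simply be summed.
      count += range[1] - range[0] + 1;
    }
    return count;
  }

  // BitSet is used here in place of MutableRoaringBitmap (an assumption of this sketch).
  static BitSet toBitmap(int[][] docIdRanges, int[] dictIds) {
    BitSet bitmap = new BitSet();
    for (int dictId : dictIds) {
      int[] range = docIdRanges[dictId];
      // BitSet.set(from, to) treats 'to' as exclusive, hence the + 1.
      bitmap.set(range[0], range[1] + 1);
    }
    return bitmap;
  }

  public static void main(String[] args) {
    int[][] docIdRanges = {{0, 2}, {3, 4}, {5, 7}, {8, 8}};
    int[] unsorted = {3, 0, 1};
    int[] sorted = {0, 1, 3};

    // The same doc ids are selected regardless of the order of the dictionary ids.
    System.out.println(countMatching(docIdRanges, unsorted) == countMatching(docIdRanges, sorted));
    System.out.println(toBitmap(docIdRanges, unsorted).equals(toBitmap(docIdRanges, sorted)));
  }
}
```

Under these assumptions, setting ranges in a bitmap is order-independent, so sorting the dictionary ids only matters if adjacent ranges need to be merged before being emitted, which neither helper above requires.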




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


