itschrispeck commented on code in PR #13199:
URL: https://github.com/apache/pinot/pull/13199#discussion_r1609536931


##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/AndFilterOperator.java:
##########
@@ -59,13 +60,14 @@ protected BlockDocIdSet getTrues() {
   protected BlockDocIdSet getFalses() {
     List<BlockDocIdSet> blockDocIdSets = new ArrayList<>(_filterOperators.size());
     for (BaseFilterOperator filterOperator : _filterOperators) {
-      if (filterOperator.isResultEmpty()) {
-        blockDocIdSets.add(new MatchAllDocIdSet(_numDocs));
+      if (_nullHandlingEnabled) {
+        blockDocIdSets.add(
+            new OrDocIdSet(Arrays.asList(filterOperator.getTrues(), filterOperator.getNulls()), _numDocs));

Review Comment:
   I think that makes sense as an optimization; I've added the cases.
   
   I don't think the second point is a regression. Previously, `.getFalses()` created an `OrDocIdSet` for every predicate anyway, so we'd see something like this:
   ```
   OrDocIdSet(NotDocIdSet(OrDocIdSet(...)), NotDocIdSet(OrDocIdSet(...)))
   ```
   
   With the change in this PR, it would instead be:
   ```
   NotDocIdSet(AndDocIdSet(OrDocIdSet(...), OrDocIdSet(...)))
   ```
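
The two shapes above should select the same documents by De Morgan's law. Here is a minimal sketch of that equivalence, assuming `java.util.BitSet` as a stand-in for Pinot's doc id sets; the class and method names (`AndFalsesSketch`, `falsesViaOrOfNots`, `falsesViaNotOfAnd`, `bits`) are hypothetical and are not part of Pinot:

```java
import java.util.BitSet;
import java.util.List;

// Illustrative only: BitSet stands in for BlockDocIdSet; none of this is Pinot code.
public class AndFalsesSketch {

  // Old shape: OR over children of NOT(OR(trues_i, nulls_i)).
  static BitSet falsesViaOrOfNots(List<BitSet> trues, List<BitSet> nulls, int numDocs) {
    BitSet result = new BitSet(numDocs);
    for (int i = 0; i < trues.size(); i++) {
      BitSet truesOrNulls = (BitSet) trues.get(i).clone();
      truesOrNulls.or(nulls.get(i));
      truesOrNulls.flip(0, numDocs); // NOT within the doc id range [0, numDocs)
      result.or(truesOrNulls);
    }
    return result;
  }

  // New shape: NOT(AND over children of OR(trues_i, nulls_i)).
  static BitSet falsesViaNotOfAnd(List<BitSet> trues, List<BitSet> nulls, int numDocs) {
    BitSet intersection = new BitSet(numDocs);
    intersection.set(0, numDocs); // start from "match all"
    for (int i = 0; i < trues.size(); i++) {
      BitSet truesOrNulls = (BitSet) trues.get(i).clone();
      truesOrNulls.or(nulls.get(i));
      intersection.and(truesOrNulls);
    }
    intersection.flip(0, numDocs); // a single NOT at the top
    return intersection;
  }

  // Helper: build a BitSet from a list of doc ids.
  static BitSet bits(int... docIds) {
    BitSet set = new BitSet();
    for (int docId : docIds) {
      set.set(docId);
    }
    return set;
  }

  public static void main(String[] args) {
    int numDocs = 8;
    // Two child predicates, each with its own trues and nulls.
    List<BitSet> trues = List.of(bits(0, 1, 2, 3), bits(2, 3, 4, 5));
    List<BitSet> nulls = List.of(bits(6), bits(7));

    BitSet oldShape = falsesViaOrOfNots(trues, nulls, numDocs);
    BitSet newShape = falsesViaNotOfAnd(trues, nulls, numDocs);
    System.out.println("old shape: " + oldShape);
    System.out.println("new shape: " + newShape);
    System.out.println("equal: " + oldShape.equals(newShape));
  }
}
```

The sketch only checks that both shapes produce the same result; it says nothing about cost. In the real operators the difference is that the new shape needs one negation over an intersection instead of one negation per child.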
   
   The speedup for this specific query is very apparent. Using a quickstart dataset, but with 60k rows, I did a quick comparison:
   ```
   q1: select count(*) from fineFoodReviews where NOT regexp_like("Text", 'happen to be allergic to it')
   q2: select count(*) from fineFoodReviews where NOT text_match("Text", '"happen to be allergic to it"')
   q3: select count(*) from fineFoodReviews where NOT (text_match("Text", '"happen to be allergic to it"') AND regexp_like("Text", 'happen to be allergic to it'))
   
   q1: 39ms
   q2: 5ms
   q3: 6ms
   ```
   
   Without the change, q3 latency is >= q1. I'm happy to set up a microbench if you think it'd be helpful.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

