itschrispeck commented on code in PR #13199: URL: https://github.com/apache/pinot/pull/13199#discussion_r1609536931
##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/AndFilterOperator.java:
##########
@@ -59,13 +60,14 @@ protected BlockDocIdSet getTrues() {
   protected BlockDocIdSet getFalses() {
     List<BlockDocIdSet> blockDocIdSets = new ArrayList<>(_filterOperators.size());
     for (BaseFilterOperator filterOperator : _filterOperators) {
-      if (filterOperator.isResultEmpty()) {
-        blockDocIdSets.add(new MatchAllDocIdSet(_numDocs));
+      if (_nullHandlingEnabled) {
+        blockDocIdSets.add(
+            new OrDocIdSet(Arrays.asList(filterOperator.getTrues(), filterOperator.getNulls()), _numDocs));

Review Comment:
   I think that makes sense as an optimization; I've added the cases.

   I don't think the second point is a regression. Previously, `.getFalses()` created an `OrDocIdSet` for every predicate anyway, so we'd see something like this:
   ```
   OrDocIdSet(NotDocIdSet(OrDocIdSet(...)), NotDocIdSet(OrDocIdSet(...)))
   ```
   With the change in this PR it would instead be:
   ```
   NotDocIdSet(AndDocIdSet(OrDocIdSet(...), OrDocIdSet(...)))
   ```

   The speedup for this kind of query is very apparent. Using a quickstart dataset, but with 60k rows, I did a quick comparison:
   ```
   q1: select count(*) from fineFoodReviews where NOT regexp_like("Text", 'happen to be allergic to it')
   q2: select count(*) from fineFoodReviews where NOT text_match("Text", '"happen to be allergic to it"')
   q3: select count(*) from fineFoodReviews where NOT (text_match("Text", '"happen to be allergic to it"') AND regexp_like("Text", 'happen to be allergic to it'))

   q1: 39ms
   q2: 5ms
   q3: 6ms
   ```

   Without the change, q3 latency is >= q1. I'm happy to set up a microbenchmark if you think it'd be helpful.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
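The two doc-id-set shapes quoted in the review comment above are related by De Morgan's law: `NOT (A AND B)` selects the same documents as `(NOT A) OR (NOT B)`. The following is a minimal sketch of that equivalence, assuming plain `java.util.BitSet` values as a stand-in for Pinot's `BlockDocIdSet` classes and ignoring null handling entirely. The class name `NotAndSketch` and the methods `orOfNots` and `notOfAnd` are hypothetical and are not part of Pinot.

```java
import java.util.BitSet;

// Hypothetical illustration only: BitSets stand in for Pinot doc id sets,
// and nulls (three-valued logic) are not modeled.
class NotAndSketch {

  // Older shape from the comment: OR(NOT a, NOT b).
  // Each child is negated over the full doc range before the union.
  static BitSet orOfNots(BitSet a, BitSet b, int numDocs) {
    BitSet notA = (BitSet) a.clone();
    notA.flip(0, numDocs);
    BitSet notB = (BitSet) b.clone();
    notB.flip(0, numDocs);
    notA.or(notB);
    return notA;
  }

  // Newer shape from the comment: NOT(AND(a, b)).
  // The children are intersected first and the result is negated once.
  static BitSet notOfAnd(BitSet a, BitSet b, int numDocs) {
    BitSet and = (BitSet) a.clone();
    and.and(b);
    and.flip(0, numDocs);
    return and;
  }

  public static void main(String[] args) {
    int numDocs = 8;

    // Made-up matches for a text_match-like predicate.
    BitSet textMatch = new BitSet(numDocs);
    textMatch.set(1);
    textMatch.set(5);

    // Made-up matches for a regexp_like-like predicate.
    BitSet regexp = new BitSet(numDocs);
    regexp.set(1);
    regexp.set(2);
    regexp.set(5);

    BitSet oldShape = orOfNots(textMatch, regexp, numDocs);
    BitSet newShape = notOfAnd(textMatch, regexp, numDocs);

    System.out.println("same docs: " + oldShape.equals(newShape));
    System.out.println("matching docs: " + newShape);
  }
}
```

The sketch only shows that both shapes select the same documents. It does not model Pinot's lazy, iterator-based evaluation, which is presumably where the latency difference reported in the comment comes from.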
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org