itschrispeck opened a new pull request, #12339: URL: https://github.com/apache/pinot/pull/12339
**Motivation:** Query performance against the Lucene index suffers when multiple `text_match` predicates are chained together. Our users often generate their queries programmatically, which exacerbates the issue because tens or hundreds of `text_match` predicates can end up in a single query. As a result, users must understand the details of Pinot's Lucene implementation to compose an efficient query. To remove this requirement, this PR adds a `TextMatchFilterOptimizer` that performs the optimization automatically.

**Summary:** This functionality is best understood through the unit test cases. In short:

- Merge the `text_match` operands of ANDs and ORs when possible, without affecting query accuracy
- Push NOT down into Lucene, unless every `text_match` filter is negated, in which case the NOT expression remains in Pinot

**Open question:** There is one edge case (that I can think of) where this optimization can hurt performance: with a chain such as `text_match OR text_match OR text_match`, early termination when `limit` is reached might take longer, since the entire merged `text_match` query must now complete. For this reason, it might be prudent to put this behind a query option (or add a query option to disable it). Alternatively, the `LuceneDocIdCollector` could terminate early, but it currently lacks the required context.

Testing: unit tests (query performance was verified separately by running the optimized and unoptimized queries)

tags: `feature`, `performance` (?)
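The merge and NOT push-down rules from the summary can be illustrated with a minimal sketch. This is not the PR's `TextMatchFilterOptimizer`: the class `TextMatchMergeSketch` and its methods `merge`, `join`, and `textMatch` are hypothetical names, the sketch works on filter strings rather than Pinot's expression tree, and it assumes all predicates target the same column and use Lucene's classic query syntax. How the all-negated case is rewritten (De Morgan, with NOT kept outside `text_match`) and how mixed ORs are handled are assumptions made here for illustration, not behavior confirmed by the PR.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative sketch only; every name in this class is hypothetical. */
public class TextMatchMergeSketch {

  /** Wraps each Lucene query in parentheses and joins them with the given operator. */
  public static String join(List<String> queries, String operator) {
    return queries.stream()
        .map(q -> "(" + q + ")")
        .collect(Collectors.joining(" " + operator + " "));
  }

  /** Renders a single text_match predicate. Quote escaping is deliberately ignored here. */
  public static String textMatch(String column, String luceneQuery) {
    return "text_match(" + column + ", '" + luceneQuery + "')";
  }

  /**
   * Merges text_match predicates on one column that are joined by a single operator.
   *
   * @param operator  "AND" or "OR"
   * @param positives Lucene queries of the non-negated predicates
   * @param negated   Lucene queries of the predicates wrapped in NOT
   */
  public static String merge(String column, String operator,
      List<String> positives, List<String> negated) {
    if (!operator.equals("AND") && !operator.equals("OR")) {
      throw new IllegalArgumentException("Unsupported operator: " + operator);
    }
    if (positives.isEmpty() && negated.isEmpty()) {
      throw new IllegalArgumentException("Nothing to merge");
    }
    if (positives.isEmpty()) {
      // Every predicate is negated. A Lucene boolean query made up only of negative
      // clauses matches no documents, so NOT stays in Pinot. Assumption: the operands
      // are merged via De Morgan, e.g. NOT a AND NOT b == NOT (a OR b).
      String flipped = operator.equals("AND") ? "OR" : "AND";
      return "NOT " + textMatch(column, join(negated, flipped));
    }
    if (negated.isEmpty()) {
      // Plain merge: one Lucene query instead of one per predicate.
      return textMatch(column, join(positives, operator));
    }
    if (operator.equals("AND")) {
      // Mixed AND: push NOT down, listing the positive clauses first.
      String pushedDown = negated.stream()
          .map(q -> " AND NOT (" + q + ")")
          .collect(Collectors.joining());
      return textMatch(column, join(positives, "AND") + pushedDown);
    }
    // Mixed OR: assumption of this sketch is to keep NOT in Pinot, because
    // "a OR NOT b" in Lucene's classic syntax does not mean logical a OR (NOT b).
    return textMatch(column, join(positives, "OR"))
        + " OR NOT " + textMatch(column, join(negated, "AND"));
  }

  public static void main(String[] args) {
    // text_match(col, '(foo) AND (bar)')
    System.out.println(merge("col", "AND", List.of("foo", "bar"), List.of()));
    // text_match(col, '(foo) AND NOT (bar)')
    System.out.println(merge("col", "AND", List.of("foo"), List.of("bar")));
    // NOT text_match(col, '(foo) OR (bar)')
    System.out.println(merge("col", "AND", List.of(), List.of("foo", "bar")));
  }
}
```

Wrapping every operand in parentheses keeps operator precedence inside each original Lucene query intact after merging, at the cost of slightly noisier query strings.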