yashmayya opened a new pull request, #14615: URL: https://github.com/apache/pinot/pull/14615
- 2x performance improvement from eliminating the optimization overhead in `CoreRules.FILTER_REDUCE_EXPRESSIONS` caused by the huge number of filter predicates that `PinotFilterExpandSearchRule` creates (`IN` -> `OR`, see https://github.com/apache/pinot/issues/13617).
- This doesn't eliminate all of the query planning overhead: `SqlToRelConverter` still converts `IN` to `OR` no matter how large the `IN` list is (see the discussion in https://issues.apache.org/jira/browse/CALCITE-6467). The default Calcite threshold is 20, beyond which the `IN` is converted to a join with a static table; however, that causes query execution overhead for us, since we don't currently optimize such joins and might pay an unnecessary data shuffling cost.
- The query compilation test added here took ~16s to run locally before this change and ~8s after it. The CPU flamegraphs are attached to the PR.
- This PR is a draft because the solution is not perfect and leads to a lot of messy situations. The `FILTER_REDUCE_EXPRESSIONS` rule creates `SEARCH` operators with only literal arguments in some cases. While this patch updates the `RexNode` -> `RexExpression` logic to evaluate such conditions as well, this could still lead to problematic queries, for instance where a single filter condition in a leaf stage is reduced to a boolean.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org