gortiz opened a new pull request, #13605: URL: https://github.com/apache/pinot/pull/13605
We recently upgraded our Calcite dependency from 1.31 to 1.37, which negatively impacted some queries. Specifically, multi-stage queries using `IN` with a very large set spend too much time on the broker trying to optimize the query. With close to 500 entries in the `IN` set, my personal computer spent close to 6 seconds just optimizing the query.

This was partly due to new optimizations added in Calcite 1.32, but mainly due to the way we were using Calcite: we were expanding all SEARCH expressions into ORs. That may have been needed in the past, when PIPELINE_BREAKER was not implemented in Pinot, but it doesn't seem to be needed now. Even if we do need it sometimes, it is not acceptable to spend so much time in the optimization phase.

Given that it doesn't look like Calcite optimizations can be turned off, this PR changes Pinot to:

1. Use Calcite's default `inSubQueryThreshold`, which is 20.
2. Modify `PinotFilterExpandSearchRule` so SEARCH expressions are not expanded when they are not range based and the number of elements is larger than 20.

In the future we may add a config parameter to make that threshold configurable.
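The expansion decision described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in `PinotFilterExpandSearchRule`: the class `SearchExpansionSketch`, the method `shouldExpand`, and its two parameters are hypothetical names, and the sketch assumes that range-based SEARCH expressions keep being expanded as before.

```java
public class SearchExpansionSketch {
    // Assumed threshold, mirroring the Calcite default inSubQueryThreshold of 20
    // mentioned in the PR description.
    public static final int EXPANSION_THRESHOLD = 20;

    /**
     * Hypothetical helper: decides whether a SEARCH expression should be
     * expanded into ORs.
     *
     * @param rangeBased   true if the SEARCH argument contains ranges rather
     *                     than only discrete values (assumption of this sketch)
     * @param elementCount number of elements in the SEARCH argument
     */
    public static boolean shouldExpand(boolean rangeBased, int elementCount) {
        // Not expanded only when the expression is NOT range based
        // AND has more elements than the threshold.
        boolean skipExpansion = !rangeBased && elementCount > EXPANSION_THRESHOLD;
        return !skipExpansion;
    }

    public static void main(String[] args) {
        // A small IN list is still expanded.
        System.out.println(shouldExpand(false, 5));
        // A large IN list such as the ~500-entry case is left as SEARCH.
        System.out.println(shouldExpand(false, 500));
        // Range-based expressions are expanded regardless of size in this sketch.
        System.out.println(shouldExpand(true, 500));
    }
}
```

Keeping the check as a single predicate would make it straightforward to replace the constant with a configurable value later, if such a config parameter is added.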