gortiz opened a new pull request, #13605:
URL: https://github.com/apache/pinot/pull/13605

   We recently upgraded our Calcite dependency from 1.31 to 1.37, which 
negatively impacted some queries.
   
   Specifically, multi-stage queries using `IN` with a very large set 
spend too much time on the broker trying to optimize the query. With close 
to 500 entries in the IN set, my personal computer spent close to 6 
seconds just optimizing the query.
   
   That was partly due to some new optimizations added in Calcite 1.32, but 
mainly due to the way we were using Calcite. Specifically, we were expanding 
all SEARCH expressions into ORs. That may have been needed in the past, when 
PIPELINE_BREAKER was not implemented in Pinot, but right now it doesn't seem 
to be needed. Even if we do need it sometimes, it is not acceptable to spend 
so much time in the optimization phase.
   
   Given that it doesn't look like the Calcite optimizations can be turned 
off, this PR changes Pinot to:
   1. Use Calcite's default `inSubQueryThreshold`, which is 20.
   2. Modify `PinotFilterExpandSearchRule` so that SEARCH expressions are not 
expanded when they are not range-based and the number of elements is larger 
than 20.
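
   The decision in item 2 can be sketched roughly as below. This is only an 
illustration of the rule as described above, not the actual code of 
`PinotFilterExpandSearchRule`: the class `SearchExpansionSketch`, the method 
`shouldExpand` and its parameters are hypothetical names, and the sketch 
deliberately avoids Calcite's own types, so whether a SEARCH is range-based 
is passed in as a plain boolean.

```java
public class SearchExpansionSketch {
    // Calcite's default inSubQueryThreshold, as stated in the description above.
    static final int DEFAULT_THRESHOLD = 20;

    /**
     * Hypothetical helper: decides whether a SEARCH expression should still be
     * expanded. Range-based searches are always expanded; point-based searches
     * (plain IN lists) are expanded only while the number of elements does not
     * exceed the threshold.
     */
    static boolean shouldExpand(boolean rangeBased, int numElements, int threshold) {
        return rangeBased || numElements <= threshold;
    }

    public static void main(String[] args) {
        // A large IN list, like the ~500 entry case above, is left as SEARCH.
        System.out.println(shouldExpand(false, 500, DEFAULT_THRESHOLD));
        // A small IN list is still expanded.
        System.out.println(shouldExpand(false, 5, DEFAULT_THRESHOLD));
        // Range-based searches are expanded regardless of size.
        System.out.println(shouldExpand(true, 500, DEFAULT_THRESHOLD));
    }
}
```

   Taking the threshold as a parameter rather than hard-coding it would make 
it easier to wire in a configurable value later.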
   
   In the future, we may add a way to configure that threshold through a 
config parameter.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

