shauryachats opened a new pull request, #14198: URL: https://github.com/apache/pinot/pull/14198
The current logical planner has a shortcoming for `NOT IN` subqueries where it projects unused columns before the exchange stage, resulting in increased data during shuffles and significantly increasing latency. An example would be: ``` EXPLAIN PLAN FOR SELECT count(*) FROM a WHERE a.col1 = 'foo' and a.col2 = 'bar' AND a.col4 = 12 AND a.col5 = false AND a.col3 NOT IN (SELECT b.col3 FROM b where b.col3 = 'baz') AND a.col3 NOT IN (SELECT b.col3 FROM b where b.col3 = 'qux')) ``` The execution plan is: ``` Execution Plan PinotLogicalAggregate(group=[{}], agg#0=[COUNT($0)]) PinotLogicalExchange(distribution=[hash]) PinotLogicalAggregate(group=[{}], agg#0=[COUNT()]) LogicalFilter(condition=[IS NOT TRUE($8)]) LogicalJoin(condition=[=($6, $7)], joinType=[left]) PinotLogicalExchange(distribution=[hash[6]]) LogicalProject(col1=[$0], col2=[$1], col4=[$3], col5=[$4], col30=[$5], $f1=[$7], col32=[$2]) LogicalFilter(condition=[IS NOT TRUE($7)]) LogicalJoin(condition=[=($5, $6)], joinType=[left]) PinotLogicalExchange(distribution=[hash[5]]) LogicalProject(col1=[$0], col2=[$1], col3=[$2], col4=[$3], col5=[$4], col30=[$2]) LogicalFilter(condition=[AND(=($0, _UTF-8'foo'), =($1, _UTF-8'bar'), =($3, 12), NOT($4))]) LogicalTableScan(table=[[default, a]]) PinotLogicalExchange(distribution=[hash[0]]) PinotLogicalAggregate(group=[{0}], agg#0=[MIN($1)]) PinotLogicalExchange(distribution=[hash[0]]) PinotLogicalAggregate(group=[{0}], agg#0=[MIN($1)]) LogicalProject(col3=[$2], $f1=[true]) LogicalFilter(condition=[=($2, CAST(_UTF-8'baz'):INTEGER NOT NULL)]) LogicalTableScan(table=[[default, b]]) PinotLogicalExchange(distribution=[hash[0]]) PinotLogicalAggregate(group=[{0}], agg#0=[MIN($1)]) PinotLogicalExchange(distribution=[hash[0]]) PinotLogicalAggregate(group=[{0}], agg#0=[MIN($1)]) LogicalProject(col3=[$2], $f1=[true]) LogicalFilter(condition=[=($2, CAST(_UTF-8'qux'):INTEGER NOT NULL)]) LogicalTableScan(table=[[default, b]]) ``` It can be observed from the execution plan ``` PinotLogicalExchange(distribution=[hash[5]]) LogicalProject(col1=[$0], col2=[$1], col3=[$2], col4=[$3], col5=[$4], col30=[$2]) ``` that while the join is only on `a.col3` and the final result required is a `count(*)`, other columns (`col1`, `col2`, `col4`, `col5`) are present in the `LogicalProject` step right before `PinotLogicalExchange` which should ideally have been pruned. The cause of the issue is the `LogicalFilter` which does not allow projects to be transposed, since the current rules do not have `PROJECT_FILTER_TRANSPOSE` included. Even if `PROJECT_FILTER_TRANSPOSE` is included in `BASIC_RULES`, it would not work for multiple joins since `BASIC_RULES` optimizations are run only once. This PR aims to fix this issue by running `Project` pushdown (transpose and merge operations) as a `RuleCollection` after the first iteration of `Filter` pushdown rules are run to ensure `Project` can be pushed down as much as possible. After the `Project` pushdown is run, the `Filter` pushdown is run again to push down filters after the project pushdown for two reasons: - In a generic query, filter pushdowns are generally more selective than project pushdowns and should be given more priority. - Major changes to the current logical plans should be avoided, and not running a `Filter` pushdown after `Project` pushdown results in a Project pushdown after each logical plan node which increases verbosity. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org