kosiew commented on code in PR #22903:
URL: https://github.com/apache/datafusion/pull/22903#discussion_r3425672760
##########
datafusion/optimizer/src/optimizer.rs:
##########
@@ -604,15 +617,14 @@ impl Optimizer {
while i < options.optimizer.max_passes {
log_plan(&format!("Optimizer input (pass {i})"), &new_plan);
- // Check once per pass whether the plan contains subquery
- // expressions. When there are no subqueries, we use the
- // cheaper `rewrite` traversal instead of
- // `rewrite_with_subqueries`, avoiding the per-node
- // map_subqueries call that walks all expression trees
- // via ownership-based transform_down.
- let has_subqueries = plan_has_subqueries(&new_plan);
-
for rule in &self.rules {
+ // Re-check for each rule: early rules can remove subquery
+ // expressions, allowing later rules to use the cheaper
in-place
+ // traversal in the same optimizer pass. This is also
+ // correctness-sensitive: the in-place path refreshes parent
+ // schemas after child schemas change.
+ let has_subqueries = plan_has_subqueries(&new_plan);
Review Comment:
`plan_has_subqueries` now runs once per rule, even after it has already
returned `false`. Correctness still looks fine, but for the common case where
there are no subqueries, this changes the old once-per-pass scan into roughly
one full plan scan per optimizer rule.
Could we cache the flag for the pass and refresh it only when a rule
actually changes the plan? For example: `let mut has_subqueries =
plan_has_subqueries(&new_plan);` and then `if transformed { has_subqueries =
plan_has_subqueries(&new_plan); }`. That should keep the behavior where
subqueries removed mid-pass are noticed, while still staying safe if a rule
introduces a subquery.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]