yonatan-sevenai commented on code in PR #21375:
URL: https://github.com/apache/datafusion/pull/21375#discussion_r3171329631
##########
datafusion/sql/src/unparser/plan.rs:
##########
@@ -605,7 +609,83 @@ impl Unparser<'_> {
if self.dialect.unnest_as_lateral_flatten() {
Self::collect_flatten_aliases(p.input.as_ref(), select);
}
- self.reconstruct_select_statement(plan, p, select)?;
+ let found_agg = self.reconstruct_select_statement(plan, p,
select)?;
+
+ // If the Projection claimed an Aggregate by reaching through
+ // one or more Limit / Sort nodes, fold their clauses into the
+ // current query and skip those nodes during recursion.
+ // Otherwise the Limit/Sort arm would see `already_projected`
+ // and wrap everything in a spurious derived subquery, emitting
+ // the aggregate twice.
+ //
+ // Handles arbitrary stacks of Limit/Sort (e.g. Projection →
+ // Limit → Sort → Aggregate) by peeling nodes in a loop until
+ // we reach something other than a Limit or Sort. Each clause
+ // is folded at most once; if a second Limit or Sort is
Review Comment:
Good point. Repeated modifiers no longer fall back into the
duplicate-aggregate shape. The loop now merges them instead of breaking:
- Stacked Limit nodes combine via combine_limit (the same rule
PushDownLimit applies in the optimizer) when both skip and fetch are integer
literals — accumulated in combined_skip / combined_fetch and flushed onto the
QueryBuilder once after the walk. A Sort.fetch on the way down is fed in as a
virtual child Limit(0, fetch).
- Stacked Sort nodes keep the outermost (top) one; an inner Sort is
reordered by the outer one and is dropped.
- Only a non-literal Limit following another Limit still breaks out -
there's no safe cross-dialect way to merge expression of a limit with
non-Literal following a Literal. This can include weird cases like a Limit on
the result of a subquery or Limit on arithmetic function / UDF. When that
happens, the code recurses into the sub expression which may still
double-consume the expression. I'm not sure how critical is it to fix it, but
a fix would require bringing back a subquery after already consuming the
aggregate...
New tests cover the cases this matters in tests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]