yonatan-sevenai commented on code in PR #21375:
URL: https://github.com/apache/datafusion/pull/21375#discussion_r3171329631


##########
datafusion/sql/src/unparser/plan.rs:
##########
@@ -605,7 +609,83 @@ impl Unparser<'_> {
                 if self.dialect.unnest_as_lateral_flatten() {
                     Self::collect_flatten_aliases(p.input.as_ref(), select);
                 }
-                self.reconstruct_select_statement(plan, p, select)?;
+                let found_agg = self.reconstruct_select_statement(plan, p, 
select)?;
+
+                // If the Projection claimed an Aggregate by reaching through
+                // one or more Limit / Sort nodes, fold their clauses into the
+                // current query and skip those nodes during recursion.
+                // Otherwise the Limit/Sort arm would see `already_projected`
+                // and wrap everything in a spurious derived subquery, emitting
+                // the aggregate twice.
+                //
+                // Handles arbitrary stacks of Limit/Sort (e.g. Projection →
+                // Limit → Sort → Aggregate) by peeling nodes in a loop until
+                // we reach something other than a Limit or Sort. Each clause
+                // is folded at most once; if a second Limit or Sort is

Review Comment:
   Good point. Repeated modifiers no longer fall back into the 
duplicate-aggregate shape. The loop now merges them instead of breaking:
   
     - Stacked Limit nodes combine via combine_limit (the same rule 
PushDownLimit applies in the optimizer) when both skip and fetch are integer 
literals — accumulated in combined_skip / combined_fetch and flushed onto the 
QueryBuilder once after the walk. A Sort.fetch on the way down is fed in as a 
virtual child Limit(0, fetch).
     - Stacked Sort nodes keep the outermost (top) one; an inner Sort is 
reordered by the outer one and is dropped.
     - Only a non-literal Limit following another Limit still breaks out - 
there's no safe cross-dialect way to merge expression of a limit with 
non-Literal following a Literal.  This can include weird cases like a Limit on 
the result of a subquery or Limit on arithmetic function / UDF.  When that 
happens, the code recurses into the sub expression which may still 
double-consume the expression.  I'm not sure how critical is it to fix it, but 
a fix would require bringing back a subquery after already consuming the 
aggregate...
   
   New tests cover the cases this matters in tests.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to