github-actions[bot] commented on code in PR #63379:
URL: https://github.com/apache/doris/pull/63379#discussion_r3419555208


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateOrderByKey.java:
##########
@@ -89,7 +90,21 @@ private static Plan eliminateWindow(LogicalWindow<Plan> 
window) {
             for (OrderExpression orderExpression : orderExpressions) {
                 orderKeys.add(orderExpression.getOrderKey());
             }
-            List<OrderKey> retainExpression = eliminate(dataTrait, orderKeys);
+            // an order key that repeats one of the window's own partition 
keys is constant within each
+            // partition, so ordering by it is redundant and can be pruned (in 
addition to the data-trait
+            // based elimination below).
+            Set<Expression> partitionKeyConstants = 
ImmutableSet.copyOf(windowExpression.getPartitionKeys());
+            List<OrderKey> retainExpression = eliminate(dataTrait, orderKeys, 
partitionKeyConstants);
+            // An explicit ROWS frame requires an ORDER BY, and its result 
depends on the physical row
+            // sequence that ORDER BY defines. The default frame is RANGE, so 
a ROWS frame is always
+            // user-written. Pruning away every order key would leave a ROWS 
frame without an ORDER BY,
+            // violating the analyzer invariant in 
WindowFunctionChecker.checkWindowFrameBeforeFunc and
+            // changing the row-sequence contract before translation. Keep one 
(redundant) order key in
+            // that case. Ranking functions and RANGE/default frames are 
unaffected: dropping an order key
+            // that is constant within the partition is safe there (it is what 
lets partition topn push down).

Review Comment:
   This guard is broader than the comment says because it runs after 
`CheckAndStandardizeWindowFunctionAndFrame`: `row_number()` is assigned a 
default `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` frame even when the 
user did not write a frame. For `row_number() over (partition by a order by 
a)`, `eliminate(...)` correctly returns an empty list because `a` is constant 
within each partition, but `hasRowsFrame(windowExpression)` puts `a` back. 
`CreatePartitionTopNFromWindow` then inherits `ORDER BY a`, and 
`PhysicalPlanTranslator` still sends that redundant partition key to 
`PartitionSortNode.SortInfo`, so the duplicate-only `row_number` case in the 
PR's stated goal remains unfixed. This is distinct from the earlier 
explicit-ROWS aggregate thread: those user-written frames need to preserve an 
order key, while default ROWS ranking windows should still be able to prune, or 
this rule needs to know whether the ROWS frame was user-written. Please add 
coverage for `row_number() over(par
 tition by a order by a)` and keep the PartitionTopN SortInfo empty for that 
case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to