github-actions[bot] commented on code in PR #63379:
URL: https://github.com/apache/doris/pull/63379#discussion_r3419555208
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateOrderByKey.java:
##########
@@ -89,7 +90,21 @@ private static Plan eliminateWindow(LogicalWindow<Plan>
window) {
for (OrderExpression orderExpression : orderExpressions) {
orderKeys.add(orderExpression.getOrderKey());
}
- List<OrderKey> retainExpression = eliminate(dataTrait, orderKeys);
+ // an order key that repeats one of the window's own partition
keys is constant within each
+ // partition, so ordering by it is redundant and can be pruned (in
addition to the data-trait
+ // based elimination below).
+ Set<Expression> partitionKeyConstants =
ImmutableSet.copyOf(windowExpression.getPartitionKeys());
+ List<OrderKey> retainExpression = eliminate(dataTrait, orderKeys,
partitionKeyConstants);
+ // An explicit ROWS frame requires an ORDER BY, and its result
depends on the physical row
+ // sequence that ORDER BY defines. The default frame is RANGE, so
a ROWS frame is always
+ // user-written. Pruning away every order key would leave a ROWS
frame without an ORDER BY,
+ // violating the analyzer invariant in
WindowFunctionChecker.checkWindowFrameBeforeFunc and
+ // changing the row-sequence contract before translation. Keep one
(redundant) order key in
+ // that case. Ranking functions and RANGE/default frames are
unaffected: dropping an order key
+ // that is constant within the partition is safe there (it is what
lets partition topn push down).
Review Comment:
This guard is broader than the comment says because it runs after
`CheckAndStandardizeWindowFunctionAndFrame`: `row_number()` is assigned a
default `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` frame even when the
user did not write a frame. For `row_number() over (partition by a order by
a)`, `eliminate(...)` correctly returns an empty list because `a` is constant
within each partition, but `hasRowsFrame(windowExpression)` puts `a` back.
`CreatePartitionTopNFromWindow` then inherits `ORDER BY a`, and
`PhysicalPlanTranslator` still sends that redundant partition key to
`PartitionSortNode.SortInfo`, so the duplicate-only `row_number` case in the
PR's stated goal remains unfixed. This is distinct from the earlier
explicit-ROWS aggregate thread: those user-written frames need to preserve an
order key, while default ROWS ranking windows should still be able to prune, or
this rule needs to know whether the ROWS frame was user-written. Please add
coverage for `row_number() over(par
tition by a order by a)` and keep the PartitionTopN SortInfo empty for that
case.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]