xiedeyantu opened a new pull request, #21362: URL: https://github.com/apache/datafusion/pull/21362
## Which issue does this PR close?

- Closes #21361.

## Rationale for this change

This PR adds functional-dependency-based simplification for `ORDER BY` clauses. When an earlier sort key already functionally determines a later key, the later key is redundant and can be removed without changing query semantics. This reduces unnecessary sorting work and avoids carrying extra sort keys through planning and execution.

## What changes are included in this PR?

This PR extends the existing functional dependency utilities with a helper for pruning redundant sort keys, and wires that helper into `eliminate_duplicated_expr` so `Sort` nodes can be simplified during optimization. It also adds regression coverage for both the positive case, where a trailing sort key is removed, and the negative case, where sort order prevents pruning.

## Are these changes tested?

Yes. I added unit tests covering:

- removal of a functionally redundant trailing `ORDER BY` key
- preservation of ordering when the dependent column appears before its determinant

I also ran `cargo test -p datafusion-optimizer eliminate_duplicated_expr -- --nocapture` successfully, and `cargo fmt --all` passes.

## Are there any user-facing changes?

Yes, but only in query planning behavior. Some queries with redundant `ORDER BY` keys may produce simpler plans and run more efficiently. There are no public API changes.
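To illustrate the pruning rule from the rationale, here is a minimal standalone sketch. It is not the code from this PR and does not use DataFusion's types: `FuncDep` and `prune_sort_keys` are hypothetical names, sort keys are modeled as plain column names, and sort direction and null ordering are ignored.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified functional dependency: `determinant -> dependent`.
/// (Assumption for illustration only; not DataFusion's representation.)
struct FuncDep {
    determinant: Vec<&'static str>,
    dependent: &'static str,
}

/// Walks the sort keys left to right and drops every key that is already
/// determined by the keys before it.
fn prune_sort_keys(keys: &[&'static str], deps: &[FuncDep]) -> Vec<&'static str> {
    let mut kept: Vec<&'static str> = Vec::new();
    // Columns known to be determined by the keys kept so far.
    let mut determined: HashSet<&'static str> = HashSet::new();

    for &key in keys {
        // A key is redundant if it is already determined (this covers exact
        // duplicates), or if some dependency targets it and all of that
        // dependency's determinant columns are already determined.
        let redundant = determined.contains(&key)
            || deps.iter().any(|fd| {
                fd.dependent == key
                    && fd.determinant.iter().all(|col| determined.contains(col))
            });

        if !redundant {
            kept.push(key);
        }
        // Kept or pruned, the column is determined by the kept prefix.
        determined.insert(key);
    }
    kept
}
```

With the single dependency `id -> name`, this sketch reduces `ORDER BY id, name` to `ORDER BY id`, but leaves `ORDER BY name, id` unchanged, because `name` appears before its determinant and does not determine `id`. These mirror the positive and negative cases described under testing.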
