kosiew opened a new pull request, #22817:
URL: https://github.com/apache/datafusion/pull/22817

   ## Which issue does this PR close?
   
   Closes #22684
   
   ## Rationale for this change
   
   `SqlToRel::aggregate` contained several implementations of the same 
post-aggregate workflow across SELECT, HAVING, QUALIFY, ORDER BY, and DISTINCT 
ON:
   
   1. Rebase expressions against aggregate output expressions.
   2. Validate rebased expressions at the aggregate boundary.
   3. Apply clause-specific diagnostics via 
`CheckColumnsMustReferenceAggregatePurpose`.
   
   Maintaining this logic in multiple places increases the risk of behavioral 
drift during future planner changes. This change consolidates the shared 
mechanics into private helper functions while preserving existing SQL 
semantics, alias handling, sort behavior, and clause-specific error messages. 
   
   ## What changes are included in this PR?
   
   * Added private helper functions for post-aggregate expression processing:
   
     * `rebase_and_validate_post_aggregate_exprs`
     * `rebase_and_validate_optional_post_aggregate_expr`
     * `rebase_and_validate_post_aggregate_sort_exprs`
     * `order_by_select_alias_expr`
   * Migrated SELECT projection post-aggregate rebasing and validation to 
shared helper logic.
   * Migrated HAVING and QUALIFY post-aggregate rebasing and validation to 
shared helper logic.
   * Migrated DISTINCT ON post-aggregate rebasing and validation to shared 
helper logic while continuing to validate against aggregate outputs plus valid 
SELECT aliases.
   * Migrated ORDER BY post-aggregate rebasing and validation to a dedicated 
helper that:
   
     * Preserves existing SELECT-alias remapping behavior.
     * Preserves sort direction and NULL ordering via `SortExpr::with_expr`.
     * Continues validating with ORDER BY-specific aggregate-boundary 
diagnostics.
   * Simplified aggregate planning code by removing duplicated rebasing and 
validation logic. 
   
   ## Are these changes tested?
   
   Yes.
   
   Added SQLLogicTest coverage in `aggregate.slt` for:
   
   * SELECT projection aggregate-boundary validation errors.
   * HAVING aggregate-boundary validation errors.
   * QUALIFY aggregate-boundary validation errors.
   * ORDER BY aggregate-boundary validation errors.
   * DISTINCT ON aggregate-boundary validation errors.
   * ORDER BY using a top-level SELECT aggregate alias after aggregation.
   * Preservation of ORDER BY sort direction and NULL ordering when remapping a 
SELECT alias.
   * DISTINCT ON using a top-level SELECT aggregate alias after aggregation. 
   
   ## Are there any user-facing changes?
   
   No intended user-facing behavior changes.
   
   This PR is a refactor that preserves existing aggregate planning semantics 
while adding regression coverage for clause-specific validation diagnostics and 
alias-handling behavior. 
   
   ## LLM-generated code disclosure
   
   This PR includes LLM-generated code and comments. All LLM-generated content 
has been manually reviewed and tested.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to