timsaucer opened a new issue, #21411:
URL: https://github.com/apache/datafusion/issues/21411

   ### Describe the bug
   
   The ResolveGroupingFunction analyzer rule does not recognize grouping() 
calls when they are wrapped in an Alias node. This means that 
grouping(col).alias("name") fails at physical planning with:
   
   ```
   This feature is not implemented: physical plan is not yet implemented for 
GROUPING aggregate function
   ```
   
   The same query succeeds when the grouping() expression is not aliased.
   
   ### To Reproduce
   
   Using SQL this works correctly because the SQL planner applies the alias at 
a different stage:
   
   ```sql
   SELECT a, SUM(b) AS s, GROUPING(a) AS g
   FROM t
   GROUP BY ROLLUP(a)
   ```
   
   But when constructing the equivalent logical plan programmatically (e.g., 
via the DataFrame API), wrapping the `grouping()` expression in 
`Expr::Alias(...)` before passing it to `LogicalPlanBuilder::aggregate()` 
causes the `ResolveGroupingFunction` rule to skip it. The rule appears to 
pattern-match on `Expr::AggregateFunction` but does not recurse into 
`Expr::Alias(Alias { expr: Expr::AggregateFunction(...), .. })`.
   
   This test should pass:
   
   ```rust
       #[tokio::test]
       async fn test_grouping_function_alias() -> Result<()> {
           let ctx = SessionContext::default();
           let rb = record_batch!(("a", Int32, [1, 1, 2]), ("b", Int32, [10, 
20, 30]))?;
           let df = ctx.read_batch(rb)?;
   
           fn check_results(results: &Vec<RecordBatch>) {
               // We have no guarantee on ordering of the batches
               for result in results {
                   let s_array = 
result.column(1).as_any().downcast_ref::<Int64Array>().unwrap();
                   let expected_val = match s_array.value(0) {
                       30 => { 0 },
                       60 => { 1 },
                       _ => {
                           panic!("unexpected value {}", s_array.value(0)) }
                   };
                   let expected = create_array!(Int32, [expected_val]) as 
ArrayRef;
                   assert_eq!(&expected, result.column(2));
               }
           }
   
           let results = df.clone().aggregate(vec![
               Expr::GroupingSet(GroupingSet::Rollup(vec![col("a")])),
           ], vec![
               sum(col("b")).alias("s"),
               grouping(col("a"))
           ])?.collect().await?;
           check_results(&results);
   
   
           let results = df.clone().aggregate(vec![
               Expr::GroupingSet(GroupingSet::Rollup(vec![col("a")])),
           ], vec![
               sum(col("b")).alias("s"),
               grouping(col("a")).alias("g")
           ])?.collect().await?;
           check_results(&results);
           
           Ok(())
       }
   ```
   
   ### Expected behavior
   
   `grouping(col).alias("name")` should work identically to an unaliased 
grouping(col) — the analyzer should unwrap Alias nodes when searching for 
grouping() calls to rewrite.
   
   ### Additional context
   
   The relevant rule is ResolveGroupingFunction in 
`datafusion/optimizer/src/analyzer/resolve_grouping_function.rs`. The fix 
likely involves matching on Expr::Alias(Alias { expr, .. }) and recursing into 
the inner expression, then re-wrapping the rewritten result in the alias.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to