timsaucer opened a new issue, #21411:
URL: https://github.com/apache/datafusion/issues/21411
### Describe the bug
The ResolveGroupingFunction analyzer rule does not recognize grouping()
calls when they are wrapped in an Alias node. This means that
grouping(col).alias("name") fails at physical planning with:
```
This feature is not implemented: physical plan is not yet implemented for
GROUPING aggregate function
```
The same query succeeds when the grouping() expression is not aliased.
### To Reproduce
Using SQL this works correctly because the SQL planner applies the alias at
a different stage:
```sql
SELECT a, SUM(b) AS s, GROUPING(a) AS g
FROM t
GROUP BY ROLLUP(a)
```
But when constructing the equivalent logical plan programmatically (e.g.,
via the DataFrame API), wrapping the `grouping()` expression in
`Expr::Alias(...)` before passing it to `LogicalPlanBuilder::aggregate()`
causes the `ResolveGroupingFunction` rule to skip it. The rule appears to
pattern-match on `Expr::AggregateFunction` but does not recurse into
`Expr::Alias(Alias { expr: Expr::AggregateFunction(...), .. })`.
This test should pass:
```rust
#[tokio::test]
async fn test_grouping_function_alias() -> Result<()> {
let ctx = SessionContext::default();
let rb = record_batch!(("a", Int32, [1, 1, 2]), ("b", Int32, [10,
20, 30]))?;
let df = ctx.read_batch(rb)?;
fn check_results(results: &Vec<RecordBatch>) {
// We have no guarantee on ordering of the batches
for result in results {
let s_array =
result.column(1).as_any().downcast_ref::<Int64Array>().unwrap();
let expected_val = match s_array.value(0) {
30 => { 0 },
60 => { 1 },
_ => {
panic!("unexpected value {}", s_array.value(0)) }
};
let expected = create_array!(Int32, [expected_val]) as
ArrayRef;
assert_eq!(&expected, result.column(2));
}
}
let results = df.clone().aggregate(vec![
Expr::GroupingSet(GroupingSet::Rollup(vec![col("a")])),
], vec![
sum(col("b")).alias("s"),
grouping(col("a"))
])?.collect().await?;
check_results(&results);
let results = df.clone().aggregate(vec![
Expr::GroupingSet(GroupingSet::Rollup(vec![col("a")])),
], vec![
sum(col("b")).alias("s"),
grouping(col("a")).alias("g")
])?.collect().await?;
check_results(&results);
Ok(())
}
```
### Expected behavior
`grouping(col).alias("name")` should work identically to an unaliased
grouping(col) — the analyzer should unwrap Alias nodes when searching for
grouping() calls to rewrite.
### Additional context
The relevant rule is ResolveGroupingFunction in
`datafusion/optimizer/src/analyzer/resolve_grouping_function.rs`. The fix
likely involves matching on Expr::Alias(Alias { expr, .. }) and recursing into
the inner expression, then re-wrapping the rewritten result in the alias.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]