Copilot commented on code in PR #21438:
URL: https://github.com/apache/datafusion/pull/21438#discussion_r3045064247
##########
datafusion/core/tests/dataframe/mod.rs:
##########
@@ -6852,3 +6852,38 @@ async fn
test_duplicate_state_fields_for_dfschema_construct() -> Result<()> {
Ok(())
}
+
+/// Regression test for https://github.com/apache/datafusion/issues/21411
+/// grouping() should work when wrapped in an alias via the DataFrame API.
+///
+/// This bug only manifests through the DataFrame API because `.alias()` wraps
+/// the `grouping()` call in an `Expr::Alias` node at the aggregate expression
+/// level. The SQL planner handles aliasing separately (via projection), so the
+/// `ResolveGroupingFunction` analyzer rule never sees an `Expr::Alias` wrapper
+/// around the aggregate function in SQL queries — making SQL-based tests
+/// insufficient to cover this case.
+#[tokio::test]
+async fn test_grouping_with_alias() -> Result<()> {
+ use datafusion_functions_aggregate::expr_fn::grouping;
+
+ let df = create_test_table("test")
+ .await?
+ .aggregate(vec![col("a")], vec![grouping(col("a")).alias("g")])?
+ .sort(vec![Sort::new(col("a"), true, false)])?;
Review Comment:
The new regression test covers a single alias, but the analyzer change also
enables `is_grouping_function` to match nested aliases. It would be good to add
a case like `grouping(col("a")).alias("x").alias("g")` to ensure nested alias
wrappers are handled correctly (and to prevent silent drop of the aggregate
expression).
##########
datafusion/optimizer/src/analyzer/resolve_grouping_function.rs:
##########
@@ -110,6 +110,26 @@ fn replace_grouping_exprs(
column.name,
)));
}
+ Expr::Alias(Alias {
+ expr: inner_expr,
+ relation,
+ name,
+ ..
+ }) if is_grouping_function(&expr) => {
+ if let Expr::AggregateFunction(function) = inner_expr.as_ref()
{
+ let grouping_expr = grouping_function_on_id(
+ function,
+ &group_expr_to_bitmap_index,
+ is_grouping_set,
+ )?;
+ // Preserve the user-provided alias
+ projection_exprs.push(Expr::Alias(Alias::new(
+ grouping_expr,
+ relation.clone(),
+ name.clone(),
+ )));
+ }
Review Comment:
The `Expr::Alias(...)` branch can silently drop expressions when the inner
expression is itself another `Expr::Alias` (e.g.,
`grouping(col("a")).alias("x").alias("y")`). `is_grouping_function(&expr)` will
be true (due to recursive alias handling), but `if let
Expr::AggregateFunction(function) = inner_expr.as_ref()` will fail, resulting
in neither `projection_exprs` nor `new_agg_expr` being updated for this
aggregate column (schema mismatch / wrong plan). Consider unwrapping aliases
recursively until reaching the underlying `Expr::AggregateFunction`, and if it
still isn’t one, return an error rather than doing nothing.
```suggestion
let mut unwrapped_expr = inner_expr.as_ref();
while let Expr::Alias(Alias { expr, .. }) = unwrapped_expr {
unwrapped_expr = expr.as_ref();
}
let Expr::AggregateFunction(function) = unwrapped_expr else {
return plan_err!(
"Expected grouping function alias to wrap an
aggregate function, found: {:?}",
unwrapped_expr
);
};
let grouping_expr = grouping_function_on_id(
function,
&group_expr_to_bitmap_index,
is_grouping_set,
)?;
// Preserve the user-provided alias
projection_exprs.push(Expr::Alias(Alias::new(
grouping_expr,
relation.clone(),
name.clone(),
)));
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]