neilconway commented on code in PR #21058:
URL: https://github.com/apache/datafusion/pull/21058#discussion_r3047128518
##########
datafusion/optimizer/src/analyzer/resolve_grouping_function.rs:
##########
@@ -207,17 +207,21 @@ fn grouping_function_on_id(
Expr::Literal(ScalarValue::from(value as u64), None)
}
};
-
let grouping_id_column =
Expr::Column(Column::from(Aggregate::INTERNAL_GROUPING_ID));
- // The grouping call is exactly our internal grouping id
if args.len() == group_by_expr_count
&& args
.iter()
.rev()
.enumerate()
.all(|(idx, expr)| group_by_expr.get(expr) == Some(&idx))
{
- return Ok(cast(grouping_id_column, DataType::Int32));
+ let n = group_by_expr_count;
+ // Mask the ordinal bits above position `n` so only the semantic
bitmask is visible.
+ // (1 << n) - 1 masks the low n bits.
+ let semantic_mask: u64 = if n >= 64 { u64::MAX } else { (1u64 << n) -
1 };
Review Comment:
`n >= 64` is rejected earlier, right? I wonder if it would be cleaner to
just assert `n < 64` and then remove the conditional.
##########
datafusion/optimizer/src/analyzer/resolve_grouping_function.rs:
##########
@@ -207,17 +207,21 @@ fn grouping_function_on_id(
Expr::Literal(ScalarValue::from(value as u64), None)
}
};
Review Comment:
It's a little unfortunate that we'll undercount the required bit width here
if there are duplicates in `group_by_expr`. Would it be easy to calculate the
true width required, including dups?
##########
datafusion/physical-plan/src/aggregates/mod.rs:
##########
@@ -1972,6 +2026,8 @@ pub fn evaluate_group_by(
group_by: &PhysicalGroupBy,
batch: &RecordBatch,
) -> Result<Vec<Vec<ArrayRef>>> {
+ let max_ordinal = max_duplicate_ordinal(&group_by.groups);
+ let mut ordinal_per_pattern: HashMap<&Vec<bool>, usize> = HashMap::new();
Review Comment:
`HashMap<&[bool], usize>`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]