Dandandan opened a new pull request, #21595: URL: https://github.com/apache/datafusion/pull/21595
## Which issue does this PR close? Related to improving GROUP BY performance on large datasets (e.g., ClickBench). ## Rationale for this change The `AggregateExec(Partial)` accumulates ALL input rows into a single hash table before emitting. For high-cardinality GROUP BY queries (e.g., `GROUP BY URL` with 15M distinct values), the hash table grows far beyond CPU cache capacity, causing poor cache performance during hash table lookups. ## What changes are included in this PR? **Two-generation early emission:** When the partial aggregate's hash table exceeds a configurable size threshold (default: 4MB), it uses a generational scheme: 1. **Hot table fills up** → emit the **cold batch** (previous generation) downstream 2. **Promote** current hot table state → cold batch 3. **Reset** hot hash table and continue reading input This gives recurring groups a second chance to be merged locally before being sent downstream. At end-of-input, the remaining hot state and cold batch are concatenated and emitted together. **Benefits:** - Hash table stays small (~4MB) → fits in L3 cache → fewer cache misses - Two generations means groups appearing across batch windows get merged locally - Low-cardinality GROUP BY: threshold never exceeded → no change in behavior - High-cardinality GROUP BY: hash table stays cache-friendly, still aggregates (unlike `skip_partial_aggregation` which stops aggregating entirely) **New config:** `datafusion.execution.partial_aggregation_max_table_size` (default: 4MB, 0 to disable) ## Are these changes tested? All existing aggregate SLT tests pass with the default threshold of 4MB enabled. The feature is designed to be transparent — correctness is maintained because `FinalPartitioned` correctly merges multiple partial emissions. ## Are there any user-facing changes? New configuration option `datafusion.execution.partial_aggregation_max_table_size`. The default (4MB) is enabled automatically. Set to 0 to disable. 🤖 Generated with [Claude Code](https://claude.com/claude-code) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
