wirybeaver commented on issue #12080: URL: https://github.com/apache/pinot/issues/12080#issuecomment-2041746962
I did' have bandwidth to fully interpret the Druid groupBy algorithm. But here is the headline version: let say there are 8 workers. The segments are distributed across 8 workers. Each worker has a local hash table which use linear probing to resolve hash collision. Each worker is in charge of storing aggregated result for one partition. If a worker find out the aggregated key doesn't belong to the local partition, the worker would insert the aggregated row to other worker's hash table. If any one local hash table doesn't trigger the disk spilled condition, then there's no need to merge partition. However, if any of table trigger the disk spilled condition, then each worker would not distribute the aggregated row anymore. Each worker just do pre-aggregation internally. In the combination phase, the result would be merged on the fly, combining the result of each hash table's in-memory and on-disk result. The DuckDB's has similar idea even though there are nuances: https://duckdb.org/2024/03/29/external-aggregation.html Each thread do pre-aggregation without distributing result at all and then combine result. They also user linear probing with salt. linear probing is more disk friendly. To support resize efficiently, they implement a two-part aggregate hash table https://duckdb.org/2022/03/07/aggregate-hashtable The Pinot groupBy currently doesn't support external aggregation and all threads write into a in-memory single concurrent hash map. The bustub implements a disk based hash table and use latch crabbing to increase concurrency. https://15445.courses.cs.cmu.edu/spring2024/project2/ @chenboat For your question, I would say achieve low-memory usage purpose for groupby. To be general, each operator can support disk based mode, e.g., external sorting. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org