walterddr opened a new pull request, #10006:
URL: https://github.com/apache/pinot/pull/10006
Refactor combine operator API changes.
Goal
===
The goal of this change is to make it easy to plug streaming combine operators
into the mix. This is useful for both the GRPCServer and the leaf-stage
operator chain in the multistage engine.
API changes
===
There are two parts to the API changes:
1. `CombineOperator`
Splitting the `BaseCombineOperator` into:
- CombineOperator
- BaseCombineOperator
- BaseSingleBlockCombineOperator
- [Aggregate/Distinct/Selection]CombineOperator
- [GroupBy/MinMax]SingleBlockCombineOperator
- BaseStreamingCombineOperator
- SelectionOnlyCombineStreamingOperator
- [Selection/Aggregate/Distinct]StreamingOperator
2. `CombineFunction`
Splitting the merge methods of the `Aggregate/Distinct/Selection`
CombineOperators out into their own `CombineFunction`, which only has the
mergeResult-related APIs.
- CombineFunction
- [Selection/Aggregate/Distinct]CombineFunctions
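To illustrate the idea of pulling merge logic out of the operators, here is a minimal sketch. The generic `CombineFunction<T>` shape, the `merge` method name, and the two example implementations are assumptions made for illustration only; they are not the signatures used in this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a merge-only API; not Pinot's actual signature. */
interface CombineFunction<T> {
  /** Merges {@code next} into {@code merged} and returns the combined result. */
  T merge(T merged, T next);
}

/** Illustrative selection-style merge: append rows until a row limit is hit. */
final class SelectionLikeCombineFunction implements CombineFunction<List<Integer>> {
  private final int limit;

  SelectionLikeCombineFunction(int limit) {
    this.limit = limit;
  }

  @Override
  public List<Integer> merge(List<Integer> merged, List<Integer> next) {
    List<Integer> out = new ArrayList<>(merged);
    for (Integer row : next) {
      if (out.size() >= limit) {
        break; // enough rows collected, drop the rest
      }
      out.add(row);
    }
    return out;
  }
}

/** Illustrative aggregation-style merge: add two partial counts. */
final class CountLikeCombineFunction implements CombineFunction<Long> {
  @Override
  public Long merge(Long merged, Long next) {
    return merged + next;
  }
}
```

With this split, an operator only needs to know how to schedule work and hand blocks to a `CombineFunction`, while the query-type-specific merge rules live in one place.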
Details
===
1. `CombineOperator` is an interface that requires implementations to provide
`processSegments()` and `mergeResults()`
2. `BaseCombineOperator` implements the parallel task dispatch and future
reduce mechanism
3. `BaseSingleBlockCombineOperator` merges all blocks into one before
returning; it uses `CombineFunction` with a standard implementation of the
`processSegments()` and `mergeResults()` methods
3.a The GroupBy and MinMax combine operators don't use a linked blocking queue
to communicate between the parallel tasks and the main thread, so they have
their own implementations that extend directly from `BaseCombineOperator`
4. The not-yet-implemented `BaseStreamingCombineOperator` will return
multiple blocks until `getNextBlock()` returns a metadata-only data table,
which indicates that all tasks have finished processing or that early
termination has been reached.
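
The difference between the single-block and streaming styles described above can be sketched as follows. This is a simplified illustration, not the code in this PR: the `CombineSketch` class, the `long[]` blocks, and the `METADATA_ONLY` marker are hypothetical stand-ins, and only the `getNextBlock()` and `mergeResults()` names are borrowed from the description. Early termination is left out.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical, simplified combine over per-segment partial sums. */
final class CombineSketch {
  /** Marker playing the role of the metadata-only block that ends a stream. */
  static final long[] METADATA_ONLY = new long[0];

  private final BlockingQueue<long[]> queue = new LinkedBlockingQueue<>();
  private final int numSegments;
  private int numReceived = 0;

  /** Dispatches numTasks parallel tasks; each enqueues one block per segment. */
  CombineSketch(List<long[]> segments, int numTasks, ExecutorService executor) {
    numSegments = segments.size();
    for (int t = 0; t < numTasks; t++) {
      final int taskId = t;
      executor.execute(() -> {
        // Task taskId handles segments taskId, taskId + numTasks, ...
        for (int i = taskId; i < segments.size(); i += numTasks) {
          long sum = 0;
          for (long v : segments.get(i)) {
            sum += v;
          }
          queue.add(new long[] {sum});
        }
      });
    }
  }

  /** Streaming style: one block per call, then the metadata-only marker. */
  long[] getNextBlock() {
    if (numReceived == numSegments) {
      return METADATA_ONLY; // every segment's block has been handed out
    }
    try {
      long[] block = queue.take(); // blocks until a task produces a result
      numReceived++;
      return block;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for a block", e);
    }
  }

  /** Single-block style: drain every block and merge them into one. */
  long[] mergeResults() {
    long total = 0;
    for (long[] b = getNextBlock(); b != METADATA_ONLY; b = getNextBlock()) {
      total += b[0];
    }
    return new long[] {total};
  }

  public static void main(String[] args) {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
      List<long[]> segments = List.of(new long[] {1, 2}, new long[] {3}, new long[] {4, 5});
      System.out.println(new CombineSketch(segments, 2, executor).mergeResults()[0]);
    } finally {
      executor.shutdown();
    }
  }
}
```

A caller that wants streaming behaviour loops on `getNextBlock()` until it sees the marker, while a single-block caller invokes `mergeResults()` once; both share the same dispatch logic, which is the reuse the base class is meant to provide.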