walterddr opened a new pull request, #10006:
URL: https://github.com/apache/pinot/pull/10006

   Refactor the combine operator API. 
   
   Goal
   ===
   The goal of this change is to make it easy to plug stream combine operators 
into the mix. This is useful for both the GRPCServer and the leaf-stage 
operator chain in the multistage engine. 
   
   API changes
   ===
   The API changes come in two parts.
   1. `CombineOperator`
   Split `BaseCombineOperator` into the following hierarchy:
     - CombineOperator
       - BaseCombineOperator
         - BaseSingleBlockCombineOperator
           - [Aggregate/Distinct/Selection]CombineOperator
         - [GroupBy/MinMax]SingleBlockCombineOperator
         - BaseStreamingCombineOperator
           - SelectionOnlyCombineStreamingOperator
           - [Selection/Aggregate/Distinct]StreamingOperator
   
   2. `CombineFunction`
   Move the merge methods of the `Aggregate/Distinct/Selection` CombineOperators 
into their own `CombineFunction`, which exposes only the `mergeResults()`-related 
APIs. 
     - CombineFunction
       - [Selection/Aggregate/Distinct]CombineFunctions
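
To illustrate the split, here is a minimal sketch of a merge-only contract that an operator can call without knowing the merge logic. Every name and signature below (`CombineFunctionSketch`, `SumCombineFunction`, `LimitedSelectionCombineFunction`, `combine`) is hypothetical and chosen for illustration. The blocks are plain arrays and lists, not Pinot's result block types, and this is not the code in this PR.

```java
import java.util.List;

// Hypothetical sketch: names and signatures are illustrative, not the PR's actual API.
final class CombineFunctionSketch {

  /** Merge-only contract: folds one partial result into another. */
  interface CombineFunction<T> {
    T mergeResults(T merged, T toMerge);
  }

  /** Aggregation-style merge: element-wise sum of partial aggregates. */
  static final class SumCombineFunction implements CombineFunction<long[]> {
    @Override
    public long[] mergeResults(long[] merged, long[] toMerge) {
      for (int i = 0; i < merged.length; i++) {
        merged[i] += toMerge[i];
      }
      return merged;
    }
  }

  /** Selection-style merge: appends rows and stops adding once the limit is reached. */
  static final class LimitedSelectionCombineFunction implements CombineFunction<List<String>> {
    private final int limit;

    LimitedSelectionCombineFunction(int limit) {
      this.limit = limit;
    }

    @Override
    public List<String> mergeResults(List<String> merged, List<String> toMerge) {
      for (String row : toMerge) {
        if (merged.size() >= limit) {
          break;
        }
        merged.add(row);
      }
      return merged;
    }
  }

  /** The caller reduces a non-empty list of blocks; the first block must be mutable. */
  static <T> T combine(List<T> blocks, CombineFunction<T> function) {
    T merged = blocks.get(0);
    for (int i = 1; i < blocks.size(); i++) {
      merged = function.mergeResults(merged, blocks.get(i));
    }
    return merged;
  }
}
```

With this shape, the same `combine` loop serves aggregation, distinct, and selection queries, and only the function passed in changes.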
   
   
   Details
   ===
   1. `CombineOperator` is an interface that requires implementations to provide 
`processSegments()` and `mergeResults()`.
   2. `BaseCombineOperator` implements the parallel task dispatch and the future 
reduce mechanism.
   3. `BaseSingleBlockCombineOperator` merges all blocks into one before 
returning. It uses `CombineFunction` together with standard implementations of 
`processSegments()` and `mergeResults()`.
     3.a The GroupBy and MinMax combine operators don't use a linked blocking 
queue to communicate between the parallel tasks and the main thread, so they 
have their own implementations that extend `BaseCombineOperator` directly.
   4. The not-yet-implemented `BaseStreamingCombineOperator` will return 
multiple blocks until `getNextBlock()` returns a metadata-only data table, 
which indicates that all tasks have finished processing or that early 
termination has been reached. 
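
The two consumption styles described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation in this PR: the class `CombineOperatorSketch`, the `END_OF_STREAM` marker (standing in for the metadata-only data table), and `getMergedBlock` are hypothetical names. A "segment" is modeled as an `int[]` and a "block" as a `List<Integer>`. Timeouts, error blocks, and early termination are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.BinaryOperator;

// Hypothetical sketch: a "segment" is an int[] and a "block" is a List<Integer>.
final class CombineOperatorSketch {
  /** Stand-in for the metadata-only block that marks the end of the stream. */
  static final List<Integer> END_OF_STREAM = new ArrayList<>();

  private final BlockingQueue<List<Integer>> queue = new LinkedBlockingQueue<>();
  private final int numSegments;
  private int numBlocksReceived = 0;

  CombineOperatorSketch(List<int[]> segments, ExecutorService executor) {
    numSegments = segments.size();
    // Parallel task dispatch: one task per segment, each publishing one block.
    for (int[] segment : segments) {
      executor.execute(() -> queue.add(processSegment(segment)));
    }
  }

  /** Turns one segment into one result block. */
  static List<Integer> processSegment(int[] segment) {
    List<Integer> block = new ArrayList<>();
    for (int value : segment) {
      block.add(value);
    }
    return block;
  }

  /** Streaming style: one block per call, then the end-of-stream marker. */
  List<Integer> getNextBlock() {
    if (numBlocksReceived == numSegments) {
      return END_OF_STREAM;
    }
    try {
      List<Integer> block = queue.take();
      numBlocksReceived++;
      return block;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for a block", e);
    }
  }

  /** Single-block style: drain every block and merge them into one before returning. */
  List<Integer> getMergedBlock(BinaryOperator<List<Integer>> mergeResults) {
    List<Integer> merged = new ArrayList<>();
    List<Integer> block;
    // Identity comparison: an empty result block is still a distinct object.
    while ((block = getNextBlock()) != END_OF_STREAM) {
      merged = mergeResults.apply(merged, block);
    }
    return merged;
  }
}
```

A caller that wants a single result uses `getMergedBlock`, while a streaming caller (for example a gRPC response writer) loops on `getNextBlock()` and forwards each block until it sees the end marker.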
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

