walterddr opened a new pull request, #9064: URL: https://github.com/apache/pinot/pull/9064
context https://github.com/apache/pinot/issues/8966

Description
==
- This PR fixes the multi-stage engine's treatment of ERROR, END_OF_STREAM, and other special data blocks.
- This PR also leverages the newly available DataSchema in StagePlanNode for early exception detection.

Details
==
When an exception occurs or a particular stream of data comes to an end, the multi-stage engine mailbox needs to know so that it can properly close the connection stream. Previously this was handled ad hoc in different parts of the code. This PR unifies the approach.

1. Regardless of whether execution is in a leaf or an intermediate stage, all mailbox transfer is done via TransferableBlock operators.
   - For a leaf stage, it transfers a dataBlock (if applicable) and an end-of-stream block, with or without exceptions.
   - For an intermediate stage, each `BaseOperator.getNextBlock()` call returns a dataBlock, and it is guaranteed that no more data will be transported once an end-of-stream block is reached.
2. When an error block is encountered, all current processing halts and the error block is repackaged and returned to the caller.
   - In the case of mailbox transport, it is broadcast to all downstream receiving mailboxes, regardless of the desired distribution method.
3. The dataSchema is pre-parsed from the StageNode, so that if any dataBlock contains a dataSchema that is incompatible with the desired output, it will error out.
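The broadcast rule in item 2 can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `MailboxSendSketch`, `Block`, `BlockType`, and the in-memory receiver lists are hypothetical stand-ins for the real mailbox classes, and the hash-based routing is just one example of a distribution method. Treating an end-of-stream block the same way as an error block, and rejecting sends after the stream is closed, are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative, not Pinot's actual classes.
final class MailboxSendSketch {
  enum BlockType { DATA, ERROR, END_OF_STREAM }

  static final class Block {
    final BlockType type;
    final String payload; // row data for DATA, message for ERROR, unused otherwise

    Block(BlockType type, String payload) {
      this.type = type;
      this.payload = payload;
    }
  }

  // Each inner list stands in for one downstream receiving mailbox.
  private final List<List<Block>> receivers = new ArrayList<>();
  private boolean closed = false;

  MailboxSendSketch(int receiverCount) {
    for (int i = 0; i < receiverCount; i++) {
      receivers.add(new ArrayList<>());
    }
  }

  void send(Block block) {
    if (closed) {
      // Assumption: once a terminal block has been sent, nothing else is transported.
      throw new IllegalStateException("stream already closed");
    }
    if (block.type == BlockType.DATA) {
      // Normal path: follow the configured distribution (hash routing in this sketch).
      int target = Math.floorMod(block.payload.hashCode(), receivers.size());
      receivers.get(target).add(block);
      return;
    }
    // ERROR (and, by assumption, END_OF_STREAM): ignore the distribution method
    // and deliver the block to every receiver so each one can close its stream.
    for (List<Block> receiver : receivers) {
      receiver.add(block);
    }
    closed = true;
  }

  List<Block> received(int index) {
    return receivers.get(index);
  }
}
```

With three receivers, sending one data block followed by an error block leaves the data block in exactly one receiver and the error block as the last entry in all three, which is the behavior each downstream mailbox needs in order to terminate cleanly.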