gortiz commented on code in PR #14507: URL: https://github.com/apache/pinot/pull/14507#discussion_r1895507908
########## pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/exchange/BlockExchange.java: ########## @@ -84,8 +92,19 @@ public boolean send(TransferableBlock block) if (block.isSuccessfulEndOfStreamBlock()) { // Send metadata to only one randomly picked mailbox, and empty EOS block to other mailboxes int numMailboxes = _sendingMailboxes.size(); - int mailboxIdToSendMetadata = ThreadLocalRandom.current().nextInt(numMailboxes); - assert block.getQueryStats() != null; + int mailboxIdToSendMetadata; + if (block.getQueryStats() != null) { + mailboxIdToSendMetadata = ThreadLocalRandom.current().nextInt(numMailboxes); + if (LOGGER.isTraceEnabled()) { + LOGGER.trace("Sending EOS metadata. Only mailbox #{} will get stats", mailboxIdToSendMetadata); + } + } else { + if (LOGGER.isTraceEnabled()) { + LOGGER.trace("Sending EOS metadata. No stat will be sent"); + } + // this may happen when the block exchange is itself used as a sending mailbox, like when using spools + mailboxIdToSendMetadata = -1; Review Comment: I think it is easier to see with a diagram. This is how data is distributed: ```mermaid flowchart BT SendOperator --> BlockExchange BlockExchange -- broadcast --> ActualSender1 BlockExchange -- broadcast --> ActualSender2 ActualSender1 -- distributionSpecific --> ReceiveOperator1.w1 ActualSender1 -- distributionSpecific --> ReceiveOperator1.w2 ActualSender2 -- distributionSpecific --> ReceiveOperator2.w1 ActualSender2 -- distributionSpecific --> ReceiveOperator2.w2 ``` This is how stats are distributed: ```mermaid flowchart BT SendOperator --> BlockExchange BlockExchange -- random (single) --> ActualSender1 BlockExchange -- random (single) --> ActualSender2 ActualSender1 -- random (single) --> ReceiveOperator1.w1 ActualSender1 -- random (single) --> ReceiveOperator1.w2 ActualSender2 -- random (single) --> ReceiveOperator2.w1 ActualSender2 -- random (single) --> ReceiveOperator2.w2 ``` So only 1 worker will receive the stats. For example: ```mermaid flowchart BT SendOperator --> BlockExchange BlockExchange -- no stats --> ActualSender1 BlockExchange -- stats --> ActualSender2 ActualSender1 -- no stats --> ReceiveOperator1.w1 ActualSender1 -- no stats --> ReceiveOperator1.w2 ActualSender2 -- stats --> ReceiveOperator2.w1 ActualSender2 -- no stats --> ReceiveOperator2.w2 linkStyle 0 stroke-width:2px,fill:none,stroke:red; linkStyle 2 stroke-width:2px,fill:none,stroke:red; linkStyle 5 stroke-width:2px,fill:none,stroke:red; ``` This is required because stats are later aggregated to be presented to the user as a tree where worker information is aggregated (usually added). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org