gortiz commented on code in PR #14507:
URL: https://github.com/apache/pinot/pull/14507#discussion_r1900919744


##########
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/physical/colocated/GreedyShuffleRewriteVisitor.java:
##########
@@ -209,24 +209,43 @@ public Set<ColocationKey> visitMailboxSend(MailboxSendNode node, GreedyShuffleRe
 
     boolean canSkipShuffleBasic = colocationKeyCondition(oldColocationKeys, distributionKeys);
     // If receiver is not a join-stage, then we can determine distribution type now.
-    if (!context.isJoinStage(node.getReceiverStageId())) {
+    Iterable<Integer> receiverStageIds = node.getReceiverStageIds();
+    if (noneIsJoin(receiverStageIds, context)) {
       Set<ColocationKey> colocationKeys;
-      if (canSkipShuffleBasic && areServersSuperset(node.getReceiverStageId(), node.getStageId())) {
+      if (canSkipShuffleBasic && allAreSuperSet(receiverStageIds, node)) {

Review Comment:
   Otherwise we cannot apply the shuffle optimization.
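
As a rough illustration of the condition this line expresses, the two helpers could be sketched as below. The class name, the `IntPredicate` parameter, and the `serversByStage` map are hypothetical stand-ins for the planner context, and the superset semantics (every receiver's servers contain all of the sender's servers) is an assumption, not the PR's actual implementation.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.IntPredicate;

// Hypothetical sketch; not the actual GreedyShuffleRewriteVisitor code.
final class ReceiverChecksSketch {
  private ReceiverChecksSketch() {
  }

  // True only if no receiver stage is a join stage.
  static boolean noneIsJoin(Iterable<Integer> receiverStageIds, IntPredicate isJoinStage) {
    for (int receiverStageId : receiverStageIds) {
      if (isJoinStage.test(receiverStageId)) {
        return false;
      }
    }
    return true;
  }

  // True only if every receiver's server set contains all of the sender's servers (assumed semantics).
  static boolean allAreSuperSet(Iterable<Integer> receiverStageIds, int senderStageId,
      Map<Integer, Set<String>> serversByStage) {
    Set<String> senderServers = serversByStage.getOrDefault(senderStageId, Set.of());
    for (int receiverStageId : receiverStageIds) {
      Set<String> receiverServers = serversByStage.getOrDefault(receiverStageId, Set.of());
      if (!receiverServers.containsAll(senderServers)) {
        return false;
      }
    }
    return true;
  }
}
```

With a single receiver this reduces to the old per-receiver check; with several receivers, one failing receiver is enough to disable the optimization for the sender.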
   
   This means that if we find two equivalent stages where one can be optimized with a colocated join while the other cannot, we need to decide whether to apply spooling or the colocated optimization.
   
   Which one is better? I'm not sure. We will probably need data to understand the difference. In theory, if we don't apply spooling, we end up executing the sender stage twice. One execution skips the shuffle, but the other shuffles anyway. Therefore the asymptotic cost will be the same. If we apply spooling, the same amount of data is shuffled, but we end up doing less work because the sender stage is executed only once.
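
To make the trade-off above concrete, here is a toy cost model, only a sketch under the assumptions stated in this comment. The class name, method names, and cost units are hypothetical; real numbers would have to come from measurements.

```java
// Toy cost model; the class name and cost units are hypothetical.
final class SpoolCostSketch {
  private SpoolCostSketch() {
  }

  // Without spooling: the sender stage runs once per receiver (twice here),
  // and only the non-colocated execution pays for a shuffle.
  static long withoutSpooling(long senderCost, long shuffleCost) {
    return 2 * senderCost + shuffleCost;
  }

  // With spooling: the sender stage runs once; the shuffled volume is assumed
  // to be the same as in the non-spooled plan.
  static long withSpooling(long senderCost, long shuffleCost) {
    return senderCost + shuffleCost;
  }
}
```

Under these assumptions the two plans differ by exactly one execution of the sender stage, so the saving from spooling grows with how expensive that stage is.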



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

