neilconway opened a new issue, #21349:
URL: https://github.com/apache/datafusion/issues/21349

   ### Describe the bug
   
   From the plan for TPC-H Q22, in the #21240 branch:
   
   ```
     ScalarSubqueryExec: subqueries=1
       RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1, maintains_sort_order=true
         SortPreservingMergeExec: [cntrycode@0 ASC NULLS LAST]
           SortExec: expr=[cntrycode@0 ASC NULLS LAST], preserve_partitioning=[true]
             ...
   ```
   
   `EnforceDistribution` inserts a `RepartitionExec: partitioning=RoundRobinBatch(N)` on an already-sorted, single-partition output, followed immediately by a `SortPreservingMergeExec` that merges the partitions back together. This is wasted work: the data is split and then immediately re-merged.
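
To illustrate why the split-then-merge pair is redundant, here is a toy model in plain Python. It is not DataFusion code: `round_robin_batches` and `sort_preserving_merge` are hypothetical stand-ins for the two operators, and the row values and batch size are made up. It only sketches the behavior described above, under the assumption that the input is a single, already-sorted partition.

```python
import heapq
from itertools import chain

def round_robin_batches(batches, n):
    """Toy stand-in for a round-robin repartition: deal batches to n partitions in turn."""
    parts = [[] for _ in range(n)]
    for i, batch in enumerate(batches):
        parts[i % n].append(batch)
    return parts

def sort_preserving_merge(parts):
    """Toy stand-in for a sort-preserving merge: k-way merge of sorted partitions."""
    streams = [chain.from_iterable(part) for part in parts]
    return list(heapq.merge(*streams))

# A single, already-sorted input partition (made-up values), cut into batches of 2 rows.
rows = ["13", "17", "18", "23", "29", "30", "31", "44", "51"]
batches = [rows[i:i + 2] for i in range(0, len(rows), 2)]

# Each output partition stays sorted because batches are dealt out in arrival order.
parts = round_robin_batches(batches, 4)
merged = sort_preserving_merge(parts)

# The round trip reproduces the input exactly, so the pair did no useful work.
print(merged == rows)
```

In this model the split followed by the merge is an identity operation on the row order, which is the sense in which the extra operators are wasted work.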
   
   This might be related to, or a subset of, #4368; I'm not sure exactly.
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

