gene-bordegaray commented on code in PR #22207:
URL: https://github.com/apache/datafusion/pull/22207#discussion_r3250129734


##########
datafusion/physical-plan/src/repartition/mod.rs:
##########
@@ -600,6 +600,11 @@ impl BatchPartitioner {
                     num_input_partitions,
                 ))
             }
+            Partitioning::Expr(_) => {
+                not_impl_err!(
+                    "Expression partitioning is not supported by RepartitionExec"
+                )
+            }

Review Comment:
   My intent wasn't for `ExprPartitioning` to be an efficient execution format for 
physically repartitioning rows. I was thinking of it as a way for 
sources/plans that already have a known partitioning to declare it, so that it is 
preserved in the plan and can unlock optimizations.
   
   In follow-ups:
   - add explicit compatibility/satisfaction APIs around this metadata so we can 
ask structured questions without doing row-wise linear scans. This would 
eliminate unneeded repartitions in cases where different partitioning types 
satisfy one another.
   - keep hash repartitioning as the preferred general execution path when 
DataFusion needs to repartition arbitrary input, unless we later add a more 
specialized repartitioning strategy.
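
   To make the first follow-up concrete, here is a rough sketch of what a 
satisfaction check could look like. Everything in it is hypothetical: 
`DeclaredPartitioning`, `Requirement`, `satisfies` and `needs_repartition` are 
simplified stand-ins, not DataFusion's actual types or APIs, and it assumes each 
partition expression is a function of the listed columns and that partitions are 
disjoint.

```rust
/// Hypothetical, simplified stand-in for partitioning metadata declared by a plan.
/// This is NOT DataFusion's `Partitioning` enum.
#[derive(Debug, Clone, PartialEq)]
enum DeclaredPartitioning {
    /// Rows were hashed on these columns.
    Hash { columns: Vec<String> },
    /// Each partition is described by an expression over these columns.
    /// Assumption: rows with equal values in these columns share a partition.
    Expr { columns: Vec<String> },
    /// Nothing is known about how rows map to partitions.
    Unknown,
}

/// Hypothetical description of what an operator needs from its input.
#[derive(Debug, Clone, PartialEq)]
enum Requirement {
    /// Rows with equal values for all of these keys must be co-located.
    KeyColocated(Vec<String>),
    /// Any partitioning is acceptable.
    Any,
}

impl DeclaredPartitioning {
    /// Answers the question from metadata alone, without looking at any rows.
    fn satisfies(&self, req: &Requirement) -> bool {
        match (self, req) {
            (_, Requirement::Any) => true,
            (
                DeclaredPartitioning::Hash { columns }
                | DeclaredPartitioning::Expr { columns },
                Requirement::KeyColocated(keys),
            ) => {
                // If the partitioning columns are a subset of the required keys,
                // rows that agree on the keys also agree on the partitioning
                // columns, so they already sit in the same partition.
                !columns.is_empty() && columns.iter().all(|c| keys.contains(c))
            }
            (DeclaredPartitioning::Unknown, Requirement::KeyColocated(_)) => false,
        }
    }
}

/// A planner could skip inserting a repartition when this returns false.
fn needs_repartition(declared: &DeclaredPartitioning, req: &Requirement) -> bool {
    !declared.satisfies(req)
}
```

   With something like this, a source declaring `Expr { columns: ["region"] }` 
would satisfy an operator that requires co-location on `["region", "day"]`, so 
no repartition would be needed, while an `Unknown` input would still go through 
hash repartitioning.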
   
   Let me know your thoughts on that 👍
   


