gene-bordegaray commented on code in PR #22207:
URL: https://github.com/apache/datafusion/pull/22207#discussion_r3250129734
##########
datafusion/physical-plan/src/repartition/mod.rs:
##########
@@ -600,6 +600,11 @@ impl BatchPartitioner {
num_input_partitions,
))
}
+ Partitioning::Expr(_) => {
+ not_impl_err!(
+ "Expression partitioning is not supported by RepartitionExec"
+ )
+ }
Review Comment:
My intent wasn't for `ExprPartitioning` to be an efficient execution format for
physically repartitioning rows. I was thinking of it as a way for sources/plans
that already have a known partitioning to declare it, so that it is preserved
in the plan and can unlock optimizations.
In follow-ups:
- add explicit compatibility/satisfaction APIs around this metadata so we can
ask structured questions without doing row-wise linear scans. This would
eliminate unneeded repartitions in cases where different partitioning types
satisfy one another.
- keep hash repartitioning as the preferred general execution path when
DataFusion needs to repartition arbitrary input, unless we later add a more
specialized repartitioning strategy.
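To make the first follow-up more concrete, here is a rough sketch of what a metadata-only satisfaction check could look like. Everything in it is hypothetical: `PartitioningMeta`, `Requirement`, `satisfies`, and `cols` are illustrative stand-ins, not DataFusion's actual types or APIs. It also assumes that an expression partitioning is a deterministic function of its listed key columns, so each key value lands in exactly one partition.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified partitioning metadata (not DataFusion's real enum).
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum PartitioningMeta {
    /// Rows are hash-distributed on `keys`.
    Hash { keys: Vec<String>, partitions: usize },
    /// Partitioning declared by a source/plan. Assumption: the partition of a
    /// row is fully determined by the values of `keys`.
    Expr { keys: Vec<String>, partitions: usize },
    /// Nothing is known beyond the partition count.
    Unknown { partitions: usize },
}

/// Hypothetical requirement an operator could place on its input.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Requirement {
    /// Any distribution is acceptable.
    Any,
    /// Rows with equal values in these columns must share a partition.
    Colocated(Vec<String>),
}

impl PartitioningMeta {
    fn partition_count(&self) -> usize {
        match self {
            PartitioningMeta::Hash { partitions, .. }
            | PartitioningMeta::Expr { partitions, .. }
            | PartitioningMeta::Unknown { partitions } => *partitions,
        }
    }

    /// Answers the question from metadata alone; no rows are inspected.
    fn satisfies(&self, req: &Requirement) -> bool {
        match req {
            Requirement::Any => true,
            Requirement::Colocated(required) => {
                // A single partition trivially co-locates everything.
                if self.partition_count() == 1 {
                    return true;
                }
                match self {
                    PartitioningMeta::Hash { keys, .. }
                    | PartitioningMeta::Expr { keys, .. } => {
                        // If the partitioning keys are a non-empty subset of
                        // the required columns, rows equal on the required
                        // columns are equal on the keys, so they land in the
                        // same partition.
                        let required: HashSet<&str> =
                            required.iter().map(String::as_str).collect();
                        !keys.is_empty()
                            && keys.iter().all(|k| required.contains(k.as_str()))
                    }
                    PartitioningMeta::Unknown { .. } => false,
                }
            }
        }
    }
}

/// Small helper to build owned column lists.
fn cols(names: &[&str]) -> Vec<String> {
    names.iter().map(|n| n.to_string()).collect()
}
```

With something along these lines, a source that declares an `Expr` partitioning on `region` would satisfy a requirement to co-locate on `(region, day)` without a repartition, while a requirement on `day` alone would still fall back to hash repartitioning.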
Let me know your thoughts on that 👍