NGA-TRAN commented on PR #21832: URL: https://github.com/apache/datafusion/pull/21832#issuecomment-4316619820
@adriangb @alamb @gene-bordegaray: Let me know if this approach is feasible:

1. Add a property on each join (boolean or integer), plus a global-planning property indicating whether the build and probe sides share identical partitioning.
2. At the start of planning, the global-planning property defaults to false and is set to true only when both data sources are already repartitioned and partition preservation is enabled (the Datadog use case).
3. When a join is created, its property is set from the global-planning property.
4. When a hash repartition is introduced, the global-planning property is set to false.
5. During join execution, if the join's property is true, map dynamic filters by partition index; otherwise, use today's behavior.

I call it a global-planning property, but it is really just a property of the plan, so it could have a better name. My assumption is that this would be a simple approach that requires minimal code changes.
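The five steps above can be sketched as a small state machine. This is only an illustration of the proposal, not DataFusion code: `PlanningState`, `JoinNode`, `FilterRouting`, and every method name here are hypothetical and do not correspond to existing DataFusion types or APIs.

```rust
/// Hypothetical planner-wide state (the "global-planning property").
/// Not a DataFusion type; names are assumptions for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PlanningState {
    partitions_aligned: bool,
}

/// Hypothetical per-join property, copied from the planner state at creation.
#[derive(Debug, Clone, Copy, PartialEq)]
struct JoinNode {
    partitions_aligned: bool,
}

/// How a join would route dynamic filters at execution time.
#[derive(Debug, PartialEq)]
enum FilterRouting {
    /// Build and probe sides share partitioning: map filters by partition index.
    ByPartitionIndex,
    /// Fall back to the existing behavior.
    Current,
}

impl PlanningState {
    /// Step 2: false by default; true only when both sources are already
    /// partitioned and partition preservation is enabled.
    fn new(sources_prepartitioned: bool, preserve_partitions: bool) -> Self {
        Self {
            partitions_aligned: sources_prepartitioned && preserve_partitions,
        }
    }

    /// Step 3: a new join snapshots the current planner state.
    fn create_join(&self) -> JoinNode {
        JoinNode {
            partitions_aligned: self.partitions_aligned,
        }
    }

    /// Step 4: introducing a hash repartition clears the flag.
    fn on_hash_repartition(&mut self) {
        self.partitions_aligned = false;
    }
}

impl JoinNode {
    /// Step 5: choose the filter routing from the join's own property.
    fn filter_routing(&self) -> FilterRouting {
        if self.partitions_aligned {
            FilterRouting::ByPartitionIndex
        } else {
            FilterRouting::Current
        }
    }
}
```

In this sketch a join keeps the value it was created with, so a join planned before a hash repartition is introduced still routes by partition index, while joins created afterwards use the current behavior. Whether the flag should be tracked per plan or per subtree is left open here, as it is in the proposal.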
