NGA-TRAN commented on PR #21832:
URL: https://github.com/apache/datafusion/pull/21832#issuecomment-4316619820

   @adriangb @alamb @gene-bordegaray: Let me know if this approach is feasible:
   
   1. Add a property on each join (boolean or integer) plus a global‑planning 
property indicating whether the build and probe sides share identical 
partitioning.
   2. At the start of planning, the global property defaults to false and is 
set to true only when both data sources are already repartitioned and 
partition‑preservation is enabled (the Datadog use case).
   3. When a join is created, its property is set from the global‑planning 
property.
   4. When a hash repartition is introduced, the global‑planning property is 
set to false.
   5. During join execution, if the property is true, map dynamic filters by 
partition index; otherwise, use today’s behavior.
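
To make the five steps concrete, here is a minimal sketch of the bookkeeping. Every name in it (`PlanningState`, `JoinNode`, `FilterRouting`, `on_hash_repartition`) is hypothetical and chosen only for illustration; none of them are existing DataFusion types or APIs, and the real planner integration would look different.

```rust
/// How a join routes its dynamic filters at execution time (step 5).
/// Hypothetical enum, not part of DataFusion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterRouting {
    /// Build and probe sides share partitioning: map filters by partition index.
    ByPartitionIndex,
    /// Fall back to today's behavior.
    Current,
}

/// Plan-level ("global-planning") property tracked while the plan is built.
/// Hypothetical struct, not part of DataFusion.
#[derive(Debug)]
pub struct PlanningState {
    inputs_co_partitioned: bool,
}

impl PlanningState {
    /// Step 2: the property is true only when both sources are already
    /// partitioned and partition preservation is enabled; otherwise false.
    pub fn new(sources_pre_partitioned: bool, preserve_partitioning: bool) -> Self {
        Self {
            inputs_co_partitioned: sources_pre_partitioned && preserve_partitioning,
        }
    }

    /// Step 4: introducing a hash repartition clears the property.
    pub fn on_hash_repartition(&mut self) {
        self.inputs_co_partitioned = false;
    }

    /// Step 3: a new join copies the property's current value.
    pub fn new_join(&self) -> JoinNode {
        JoinNode {
            co_partitioned: self.inputs_co_partitioned,
        }
    }
}

/// Step 1: the per-join property (a boolean in this sketch).
#[derive(Debug)]
pub struct JoinNode {
    co_partitioned: bool,
}

impl JoinNode {
    /// Step 5: choose the filter mapping from the per-join property.
    pub fn filter_routing(&self) -> FilterRouting {
        if self.co_partitioned {
            FilterRouting::ByPartitionIndex
        } else {
            FilterRouting::Current
        }
    }
}
```

In this sketch a join created before a hash repartition keeps its flag, while any join created afterwards falls back to today's behavior.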
   
   I call it a global-planning property, but it is really just a property of 
the plan, so it could have a better name.
   
   My assumption is that this would be a simple approach requiring minimal code 
changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

