Re: [PR] perf: simplify HashJoinExec dynamic filter, drop CASE routing [datafusion]

via GitHub Sat, 02 May 2026 13:15:30 -0700


Dandandan commented on PR #21931:
URL: https://github.com/apache/datafusion/pull/21931#issuecomment-4364635439


   > is slower than just probing more hash tables
   
   I am very skeptical that probing every per-partition hash table for every 
row is efficient, but I can see that it can still be faster than evaluating a 
long nested expression that grows with the number of partitions, as DF doesn't 
have special knowledge of the "routing" (and if everything matches, it is pure 
overhead).
   
   I feel like doing the `% partition` in the physical expression and routing 
each row to the correct expression / probe should be faster and more selective 
than just probing all tables (which also doesn't scale well to many partitions)?
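
   A minimal sketch of the two strategies being compared, written as 
standalone Rust rather than DataFusion code. Everything in it is an assumption 
made for illustration: `partition_of`, `build_tables`, `probe_all` and 
`probe_routed` are hypothetical names, `HashSet<i64>` stands in for the 
per-partition build-side hash tables, and `DefaultHasher` stands in for 
whatever hash the repartitioning actually uses. Routing only works if the 
probe side applies the same hash and modulo as the build side.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the hash partitioning of the build side:
/// hash the join key, then take it modulo the number of partitions.
fn partition_of(key: i64, num_partitions: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % num_partitions as u64) as usize
}

/// Build one key set per partition (a toy model of per-partition hash tables).
fn build_tables(keys: &[i64], num_partitions: usize) -> Vec<HashSet<i64>> {
    let mut tables = vec![HashSet::new(); num_partitions];
    for &key in keys {
        tables[partition_of(key, num_partitions)].insert(key);
    }
    tables
}

/// Strategy 1: probe every table. Up to `tables.len()` lookups per row.
fn probe_all(tables: &[HashSet<i64>], key: i64) -> bool {
    tables.iter().any(|table| table.contains(&key))
}

/// Strategy 2: compute the partition first, then probe exactly one table.
fn probe_routed(tables: &[HashSet<i64>], key: i64) -> bool {
    tables[partition_of(key, tables.len())].contains(&key)
}

fn main() {
    let build_keys: Vec<i64> = (0..1000).step_by(3).collect();
    let tables = build_tables(&build_keys, 8);

    // With exact key sets both strategies agree on every probe key;
    // they differ only in how many tables are touched per row.
    for key in -5..1005 {
        assert_eq!(probe_all(&tables, key), probe_routed(&tables, key));
    }
    println!("probe_all and probe_routed agree on all probe keys");
}
```

   In this toy model the routed probe does one hash plus one lookup per row 
regardless of the partition count, while the probe-all variant does up to one 
lookup per partition for rows that match nothing. Because the key sets here are 
exact, the sketch cannot show any difference in selectivity; it only 
illustrates the difference in lookups per row.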
   
   That said, I like removing the big `CASE`-based routing, as it seems hard to 
make that performant.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

