Re: [PR] Share per-chunk JoinLeftData across right partitions in NLJ memory-limited fallback [datafusion]

via GitHub Mon, 22 Jun 2026 09:51:52 -0700


viirya commented on PR #22038:
URL: https://github.com/apache/datafusion/pull/22038#issuecomment-4770746715


   @kosiew Thanks for the careful review — both points were spot on.
   
   **Multi-chunk coverage.** You're right that the existing SLT cases only 
exercised the single-chunk path: `generate_series(1, 5000)` arrives as one 
8192-row batch, so `load_one_chunk` accepts the single over-budget batch and 
the coordinator never loads a second chunk. The `spill_count` there comes from 
the right side, not from left chunking. I couldn't reliably target multi-chunk 
from SLT (a memory limit tight enough to split the left side OOMs other 
operators first), so I added Rust regression tests instead: 
`test_nlj_memory_limited_multi_partition_multi_chunk_{left,full}_join`. The 
left input is one row per batch under a tight memory limit, which I verified 
loads multiple distinct chunks with `spill_count > 0`. They assert exact row 
sets, so a regression in per-chunk `JoinLeftData` sharing (duplicate unmatched 
rows) or in the cross-chunk global right-unmatched accumulation (FULL) would 
fail.
   
   One thing worth noting: these tests collect the four output partitions 
**concurrently**. Sequential collection deadlocks on the multi-chunk path — a 
chunk isn't released until every partition finishes probing it (the 
`probe_threads_counter` reaching zero), and no partition can advance to the 
next chunk until the current one is released. So if only one partition is 
driven at a time, it blocks waiting for a release that can never happen. 
Concurrent collection mirrors how partitions actually run under the runtime; I 
documented this in the helper.
   
   **Leader-only stream init.** Good catch — moved the lazy 
stream/schema/reservation initialization into the leader-claim branch (under 
the `loader_in_flight` guard) so waiters no longer open a throwaway spill 
stream that the leader overwrites.
   
   Pushed both as separate commits. Ready for another look when you have a 
chance.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Share per-chunk JoinLeftData across right partitions in NLJ memory-limited fallback [datafusion]

Reply via email to