pantShrey opened a new pull request, #22230:
URL: https://github.com/apache/datafusion/pull/22230

   ## Note: This PR depends on #21882 (pluggable SpillFile trait) and cannot be 
merged before it. Opening in parallel per @alamb's suggestion for easier 
review. The `SpillFile` trait used here is defined in that base PR. To 
review locally, apply #21882 first and then stack this branch on top.
   
   
   ## Which issue does this PR close?
   
   - Depends on #21882; both PRs together close #21215.
   
   ## Rationale for this change
   
   `materializing_stream.rs` and `bitwise_stream.rs` were reading spilled 
batches via `open_sync_reader` / direct `File::open` calls, bypassing the 
`SpillFile` abstraction introduced in #21882. This PR migrates both to use 
`SpillManager::read_spill_as_stream`, allowing custom backends (Postgres 
BufFile, object storage) to handle spill reads without requiring an OS file 
path.
   
   ## What changes are included in this PR?
   
   - `materializing_stream.rs`: Eagerly restores spilled `BufferedBatches` via 
async streams before freezing, avoiding new state machine variants.
   - `bitwise_stream.rs`: Replaces sync reads with an async `poll_next_unpin` 
loop, caching the stream to survive `Poll::Pending`.
   - `spill_file.rs`: Removes `open_sync_reader` from the `SpillFile` trait (no 
longer needed).
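
The "cache the stream to survive `Poll::Pending`" idea from the `bitwise_stream.rs` item can be sketched roughly as follows. This is a std-only illustration, not the code in this PR: `BatchStream`, `ScriptedStream`, `SpillReader`, `open_stream`, `poll_restore`, and `drive` are hypothetical names, `u32` stands in for a record batch, and the hand-rolled trait stands in for the `futures` stream that `poll_next_unpin` would poll in the real code.

```rust
use std::collections::VecDeque;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for an async stream of spilled batches.
trait BatchStream {
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<u32>>;
}

/// Toy stream driven by a script: `Some(n)` yields batch `n`,
/// `None` simulates the backend not being ready yet.
struct ScriptedStream {
    script: VecDeque<Option<u32>>,
}

impl BatchStream for ScriptedStream {
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<u32>> {
        match self.script.pop_front() {
            Some(Some(batch)) => Poll::Ready(Some(batch)),
            Some(None) => {
                // Not ready: request another poll and yield control.
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            // Script exhausted: end of stream.
            None => Poll::Ready(None),
        }
    }
}

/// Keeps the in-flight stream in a field so that a `Pending` result
/// does not lose the read position or reopen the spill.
struct SpillReader {
    open_count: usize,
    cached: Option<Box<dyn BatchStream>>,
    restored: Vec<u32>,
}

impl SpillReader {
    fn new() -> Self {
        SpillReader { open_count: 0, cached: None, restored: Vec::new() }
    }

    /// Hypothetical opener; a real reader would ask the spill backend.
    fn open_stream(&mut self) -> Box<dyn BatchStream> {
        self.open_count += 1;
        let script = VecDeque::from(vec![Some(1), None, Some(2), None, Some(3)]);
        Box::new(ScriptedStream { script })
    }

    /// Drains the stream, returning `Pending` whenever the stream does.
    fn poll_restore(&mut self, cx: &mut Context<'_>) -> Poll<Vec<u32>> {
        if self.cached.is_none() {
            let stream = self.open_stream();
            self.cached = Some(stream);
        }
        loop {
            let stream = self.cached.as_mut().expect("stream cached above");
            match stream.poll_next(cx) {
                Poll::Ready(Some(batch)) => self.restored.push(batch),
                Poll::Ready(None) => {
                    // Finished: drop the cached stream and hand back the batches.
                    self.cached = None;
                    return Poll::Ready(std::mem::take(&mut self.restored));
                }
                // The stream stays cached, so the next poll resumes here.
                Poll::Pending => return Poll::Pending,
            }
        }
    }
}

/// Waker that does nothing; enough for a busy-polling demo driver.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the reader to completion and counts how often it was pending.
fn drive(reader: &mut SpillReader) -> (Vec<u32>, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut pending = 0;
    loop {
        match reader.poll_restore(&mut cx) {
            Poll::Ready(batches) => return (batches, pending),
            Poll::Pending => pending += 1,
        }
    }
}
```

Calling `drive(&mut SpillReader::new())` in this sketch polls through two simulated `Pending` results while opening the stream only once. Had the stream been a local variable inside `poll_restore` instead of a field, each `Pending` would have discarded it and the next poll would have started the read over.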
   
   ## Are these changes tested?
   
   Covered by existing sort-merge join (SMJ) tests. No new tests were added: the 
behavioral change is internal (sync → async I/O path) and observable only 
through custom backends, which are not yet in tree.
   
   ## Are there any user-facing changes?
   
   No. Removing `open_sync_reader` from the `SpillFile` trait is a breaking 
API change for anyone implementing the trait, but the trait was introduced in 
#21882, which has not merged yet, so there are no external implementors.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

