nathanb9 opened a new pull request, #23222:
URL: https://github.com/apache/datafusion/pull/23222

   ## Which issue does this PR close?
   
   Closes #23208.
   
   ## Rationale for this change
   
   Single-column primitive sorts can produce many small sorted runs before spilling, increasing merge fan-in and merge overhead.
   
   ## What changes are included in this PR?
   
   This enables in-memory run coalescing on the sort spill path for single primitive-column sorts, bounded by `sort_in_place_threshold_bytes`. Multi-column and non-primitive sorts keep the existing behavior.
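
As a rough illustration of the idea only (not the code in this PR), the sketch below coalesces adjacent sorted runs of `i64` values as long as the merged run stays within a byte budget. The names `merge_two` and `coalesce_runs` and the `threshold_bytes` parameter are hypothetical stand-ins. The real change operates on Arrow batches inside DataFusion's sort operator and may choose which runs to merge differently.

```rust
use std::mem::size_of;

/// Merge two already-sorted runs into one sorted run (standard two-way merge).
fn merge_two(a: &[i64], b: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            out.push(a[i]);
            i += 1;
        } else {
            out.push(b[j]);
            j += 1;
        }
    }
    // At most one of these tails is non-empty.
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}

/// Hypothetical helper: greedily fold each incoming sorted run into the
/// previous output run while the combined size fits in `threshold_bytes`.
/// Assumption: the budget is checked against the raw size of the values.
fn coalesce_runs(runs: Vec<Vec<i64>>, threshold_bytes: usize) -> Vec<Vec<i64>> {
    let mut out: Vec<Vec<i64>> = Vec::new();
    for run in runs {
        let fits = out.last().map_or(false, |last| {
            (last.len() + run.len()) * size_of::<i64>() <= threshold_bytes
        });
        if fits {
            // Replace the last output run with the merged, still-sorted run.
            let last = out.pop().unwrap();
            out.push(merge_two(&last, &run));
        } else {
            // Over budget (or first run): start a new output run.
            out.push(run);
        }
    }
    out
}
```

With a 32-byte budget (four `i64` values), four two-element runs collapse into two four-element runs, which halves the fan-in of the final merge. A budget of zero leaves the runs untouched, which mirrors how multi-column and non-primitive sorts keep the existing behavior.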
   
   ## Are these changes tested?
   
   Existing focused sort tests pass:
   
   - `test_in_mem_sort_coalesced_runs`
   - `test_sort_spill`
   - `test_sort_spill_utf8_strings`
   
   Also ran:
   
   - `cargo fmt --all`
   - `cargo clippy --all-targets --all-features -- -D warnings`
   - `git diff --check`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
