nathanb9 opened a new pull request, #23222: URL: https://github.com/apache/datafusion/pull/23222
## Which issue does this PR close?

Closes #23208.

## Rationale for this change

Single-column primitive sorts can produce many small sorted runs before spilling, increasing merge fan-in and merge overhead.

## What changes are included in this PR?

This enables in-memory run coalescing on the sort spill path for single primitive-column sorts, bounded by `sort_in_place_threshold_bytes`. Multi-column and non-primitive sorts keep the existing behavior.

## Are these changes tested?

Existing focused sort tests pass:

- `test_in_mem_sort_coalesced_runs`
- `test_sort_spill`
- `test_sort_spill_utf8_strings`

Also ran:

- `cargo fmt --all`
- `cargo clippy --all-targets --all-features -- -D warnings`
- `git diff --check`
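
For readers unfamiliar with the idea, here is a rough, standalone sketch of what "coalescing sorted runs under a byte budget" can look like. It is not the code in this PR and does not use any DataFusion or Arrow types: `merge_two` and `coalesce_runs` are hypothetical helpers, runs are plain `Vec<i64>`, and `threshold_bytes` merely stands in for `sort_in_place_threshold_bytes`. The greedy policy of merging only adjacent runs is also an assumption made for illustration.

```rust
use std::mem::size_of;

/// Merge two ascending runs into a single ascending run (standard two-way merge).
fn merge_two(a: &[i64], b: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            out.push(a[i]);
            i += 1;
        } else {
            out.push(b[j]);
            j += 1;
        }
    }
    // At most one of these tails is non-empty.
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}

/// Hypothetical helper: greedily fold each incoming sorted run into the
/// previous output run as long as the combined run stays within
/// `threshold_bytes`. Fewer, larger runs mean lower fan-in for a later merge.
fn coalesce_runs(runs: Vec<Vec<i64>>, threshold_bytes: usize) -> Vec<Vec<i64>> {
    let mut out: Vec<Vec<i64>> = Vec::new();
    for run in runs {
        // Would the previous output run plus this run still fit the budget?
        let fits = out.last().map_or(false, |last| {
            (last.len() + run.len()) * size_of::<i64>() <= threshold_bytes
        });
        if fits {
            let last = out.pop().expect("`fits` implies a previous run exists");
            out.push(merge_two(&last, &run));
        } else {
            out.push(run);
        }
    }
    out
}
```

With a 32-byte budget (four `i64` values), calling `coalesce_runs` on the four runs `[1, 4]`, `[2, 3]`, `[0, 9]`, `[5, 6]` yields two runs, `[1, 2, 3, 4]` and `[0, 5, 6, 9]`, so a subsequent merge would see a fan-in of 2 instead of 4.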
