Kevin-Li-2025 opened a new pull request, #23066:
URL: https://github.com/apache/datafusion/pull/23066

   ## Which issue does this PR close?
   
   - Closes #22848.
   
   ## Rationale for this change
   
   External sort merge phases currently decide how many spill files to merge in one pass based only on the memory reservation. With many small spill files, a single phase can open enough files at once to exceed the process file-descriptor limit.
   
   ## What changes are included in this PR?
   
   - Add `datafusion.runtime.max_spill_merge_fan_in` (`0` preserves the current unlimited behavior).
   - Clamp non-zero values to at least 2 during merge selection so each pass makes progress.
   - Support builder configuration and dynamic SQL `SET` / `RESET` / `SHOW`.
   - Add unit, runtime SQL, SQLLogicTest, information schema, and generated documentation coverage.
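
The clamping rule and its effect on merge selection can be sketched as below. This is a minimal illustration, not the code in this PR: `effective_fan_in`, `select_for_merge`, and the per-file `file_costs` model are hypothetical names and assumptions, and the sketch does not handle the case where fewer than two files fit in memory.

```rust
/// Hypothetical helper: turn the configured limit into an effective cap.
/// `0` means "no cap"; any other value is raised to at least 2 so that a
/// merge pass always combines two or more files and therefore makes progress.
fn effective_fan_in(configured: usize) -> Option<usize> {
    match configured {
        0 => None,
        n => Some(n.max(2)),
    }
}

/// Hypothetical selection step: take spill files in order while they fit in
/// the memory budget, stopping early once the fan-in cap is reached.
/// `file_costs` is the assumed memory needed to merge each file.
/// Returns how many files would be opened in this pass.
fn select_for_merge(file_costs: &[usize], memory_budget: usize, configured: usize) -> usize {
    let cap = effective_fan_in(configured).unwrap_or(usize::MAX);
    let mut used = 0usize;
    let mut selected = 0usize;
    for &cost in file_costs {
        // Stop when the cap is reached or the next file would not fit.
        if selected >= cap || used.saturating_add(cost) > memory_budget {
            break;
        }
        used += cost;
        selected += 1;
    }
    selected
}
```

Under these assumptions, 100 small files that all fit in memory are all selected when the setting is `0`, while a setting of `8` stops the pass at 8 open files and a setting of `1` is treated as `2`.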
   
   ## Are there any user-facing changes?
   
   Users can cap the number of spill files opened in one external merge pass. The default remains unchanged.
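
As a rough way to reason about choosing a cap, the sketch below counts merge passes under a simplified model in which each pass replaces up to `fan_in` spill files with one merged file. The model and the `merge_passes` name are assumptions for illustration only; the actual pass structure in the implementation may differ.

```rust
/// Hypothetical model: count passes needed to reduce `files` spill files to
/// one, when each pass merges at most `fan_in` files into a single file.
/// `fan_in == 0` is treated as unlimited; other values are clamped to >= 2.
fn merge_passes(mut files: usize, fan_in: usize) -> usize {
    let k = if fan_in == 0 { usize::MAX } else { fan_in.max(2) };
    let mut passes = 0;
    while files > 1 {
        // Merge up to `k` files; they are replaced by one output file.
        let merged = files.min(k);
        files = files - merged + 1;
        passes += 1;
    }
    passes
}
```

In this model, 100 spill files need one pass with no cap and 11 passes with a cap of 10, so a lower cap trades fewer simultaneously open files for more passes.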
   
   ## How was this change tested?
   
   - `cargo test -p datafusion-execution test_max_spill_merge_fan_in_builder_and_dynamic_update --lib`
   - `cargo test -p datafusion-physical-plan spill_merge_fan_in --lib`
   - `cargo test -p datafusion --test core_integration test_max_spill_merge_fan_in_runtime_config`
   - `cargo test -p datafusion-sqllogictest --test sqllogictests -- set_variable.slt`
   - `cargo check -p datafusion`
   - `cargo clippy -p datafusion-execution -p datafusion-physical-plan -p datafusion --lib -- -D warnings`
   - `cargo fmt --all -- --check`
   - `dev/update_config_docs.sh`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

