zhuqi-lucas commented on code in PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#discussion_r3035087740
##########
datafusion/physical-plan/src/sorts/sort_preserving_merge.rs:
##########
@@ -366,7 +366,7 @@ impl ExecutionPlan for SortPreservingMergeExec {
.map(|partition| {
let stream =
self.input.execute(partition,
Arc::clone(&context))?;
- Ok(spawn_buffered(stream, 1))
+ Ok(spawn_buffered(stream, 16))
Review Comment:
Thanks @alamb for this concern, this change is directly related to this PR
due to performance consideration:
When PushdownSort eliminates SortExec on each partition (Exact path), SPM
loses SortExec's in-memory buffer — previously SPM read from SortExec's fully
buffered output (zero I/O wait), now it reads directly from DataSourceExec I/O
streams. With buffer=1, SPM stalls waiting on I/O between every batch.
Increasing to 16 allows pipelining I/O prefetch with merge computation,
recovering the throughput lost by eliminating SortExec.
Benchmarked with TPC-DS: buf=16 has no regressions (buf=32 caused tpcds
query regressions due to memory overhead).
But i also concern about it because it will affect all cases, i am also
investigating the best solution for the performance,.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]