zhuqi-lucas commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182101292

   Update: found the root cause of the Q1/Q3 regression, and a fix for it.
   
   **Root cause**: `SortPreservingMergeExec` uses `spawn_buffered(stream, 1)`, so
only 1 batch is prefetched per partition. With SortExec, all data is already
buffered in memory, so SPM reads never wait on I/O. Without SortExec (our
optimization), SPM pulls directly from DataSourceExec and hits Parquet I/O on
each poll, so the merge loop stalls waiting for I/O.
   
   **Fix**: increase SPM buffer from 1 to 32. This lets background tasks 
prefetch more batches, decoupling I/O from the merge computation.
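
To illustrate why the buffer size matters, here is a minimal sketch of a bounded prefetch buffer. It is not DataFusion's `spawn_buffered`: it assumes plain `std` threads and `sync_channel` in place of async tasks, and the names `Batch`, `spawn_buffered_sketch`, and `prefetch_capacity` are hypothetical.

```rust
use std::sync::mpsc::{sync_channel, Receiver, TrySendError};
use std::thread;

/// Hypothetical stand-in for a record batch.
type Batch = Vec<i32>;

/// Spawn a background producer that pushes batches into a bounded channel.
/// `buffer` plays the role of the second argument to `spawn_buffered`
/// (assumption: the real function is async; this sketch uses threads).
fn spawn_buffered_sketch(batches: Vec<Batch>, buffer: usize) -> Receiver<Batch> {
    let (tx, rx) = sync_channel(buffer);
    thread::spawn(move || {
        for batch in batches {
            // Blocks once `buffer` batches are queued and not yet consumed.
            if tx.send(batch).is_err() {
                break; // the consumer dropped the receiver
            }
        }
    });
    rx
}

/// Count how many batches can be queued before the consumer polls at all.
/// This is the amount of I/O that can run ahead of the merge loop.
fn prefetch_capacity(buffer: usize, available: usize) -> usize {
    // Keep the receiver alive so the channel stays connected.
    let (tx, _rx) = sync_channel::<Batch>(buffer);
    let mut queued = 0;
    for i in 0..available {
        match tx.try_send(vec![i as i32]) {
            Ok(()) => queued += 1,
            // Buffer is full: the producer would now have to wait.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    queued
}
```

In this sketch, `prefetch_capacity(1, 100)` queues a single batch before the producer has to wait, while `prefetch_capacity(32, 100)` queues 32, which is the decoupling the fix relies on.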
   
   Local results (release, 16 partitions):
   
   | Query | Main (buf=1) | PR (buf=1) | PR (buf=32) |
   |-------|-------------|------------|-------------|
   | Q1 full scan | 110ms | 180ms | **80ms** |
   | Q2 LIMIT 100 | 9ms | 3ms | **3ms** |
   | Q3 SELECT * | 239ms | 305ms | **197ms** |
   | Q4 LIMIT 100 | 35ms | 7ms | **7ms** |
   
   With buf=32, all four queries are faster than main, with no regressions. All tests pass. Pushing now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

