zhuqi-lucas commented on issue #21381: URL: https://github.com/apache/datafusion/issues/21381#issuecomment-4190427355
Thanks for the context! I think there might be a slight misunderstanding, so I've updated the issue description to clarify. This is specifically about the case **after SortExec is eliminated** via sort pushdown (#21182). When sort elimination removes SortExec entirely, there is no Stage 1 merge. SPM becomes the only merge operator, reading directly from I/O-bound DataSourceExec partitions. In that scenario, a parallel merge within SPM itself (splitting N input streams into groups and merging in parallel) could help reduce I/O stalls. The DuckDB parallel k-way merge link is great, thanks for sharing!
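To make the "split N input streams into groups and merge in parallel" idea concrete, here is a rough sketch of a two-level merge. It is not DataFusion code: `merge_sorted` and `grouped_parallel_merge` are hypothetical names, the inputs are plain in-memory `Vec<i64>` values standing in for sorted partition streams, and OS threads stand in for whatever task scheduling SPM would actually use. A real implementation would need to be streaming and async rather than materializing each group's output.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::thread;

/// Plain k-way merge of already-sorted inputs using a min-heap.
/// (Hypothetical helper; stands in for a single-threaded merge.)
fn merge_sorted(inputs: Vec<Vec<i64>>) -> Vec<i64> {
    let mut iters: Vec<_> = inputs.into_iter().map(|v| v.into_iter()).collect();
    let mut heap = BinaryHeap::new();

    // Seed the heap with the first element of every non-empty input.
    for (idx, it) in iters.iter_mut().enumerate() {
        if let Some(v) = it.next() {
            heap.push(Reverse((v, idx)));
        }
    }

    let mut out = Vec::new();
    // Pop the smallest element, then refill from the input it came from.
    while let Some(Reverse((v, idx))) = heap.pop() {
        out.push(v);
        if let Some(next) = iters[idx].next() {
            heap.push(Reverse((next, idx)));
        }
    }
    out
}

/// Two-level merge: split the inputs into groups of `group_size`,
/// merge each group on its own thread, then merge the group results.
/// (Hypothetical sketch of the grouping idea, not an SPM design.)
fn grouped_parallel_merge(inputs: Vec<Vec<i64>>, group_size: usize) -> Vec<i64> {
    let group_size = group_size.max(1);

    // Partition the inputs into consecutive groups.
    let mut groups: Vec<Vec<Vec<i64>>> = Vec::new();
    let mut iter = inputs.into_iter().peekable();
    while iter.peek().is_some() {
        groups.push(iter.by_ref().take(group_size).collect());
    }

    // First level: one merge per group, running concurrently.
    let handles: Vec<_> = groups
        .into_iter()
        .map(|group| thread::spawn(move || merge_sorted(group)))
        .collect();
    let partials: Vec<Vec<i64>> = handles
        .into_iter()
        .map(|h| h.join().expect("merge thread panicked"))
        .collect();

    // Second level: merge the per-group outputs into the final order.
    merge_sorted(partials)
}
```

For example, calling `grouped_parallel_merge(partitions, 2)` on four sorted partitions runs two first-level merges concurrently before the final merge. The point of the grouping is that a stall on one partition's I/O only blocks its own group, while the other groups keep making progress.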
