andygrove opened a new pull request, #1623:
URL: https://github.com/apache/datafusion-ballista/pull/1623

   # Which issue does this PR close?
   
   Closes #.
   
   # Rationale for this change
   
   The sort-based shuffle writer was added as an opt-in alternative to the 
hash-based writer. It produces `2 × N` files instead of `N × M` (where `N` is 
the number of input partitions and `M` is the number of output partitions), 
coalesces small batches before writing, and bounds shuffle memory via 
spill-to-disk. With the improvements that landed in #1615 (byte-copy spill 
files and block-IO transport for remote reads), the sort-based writer is the 
better default for most workloads, especially those with non-trivial partition 
fan-out, where the hash writer's `N × M` file count becomes a problem.
   
   This PR makes sort-based shuffle the default. The hash-based writer is still 
available for narrow shuffles where the buffering and merging overhead of the 
sort-based writer is unnecessary.
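
   To make the file-count argument concrete, here is a minimal sketch of the 
arithmetic. The function names are hypothetical and exist only for this 
illustration; they are not part of Ballista. The formulas are the `N × M` and 
`2 × N` counts quoted above.

```rust
/// Files written by the hash-based writer: `N × M`
/// (hypothetical helper, for illustration only).
fn hash_shuffle_files(n_inputs: u64, m_outputs: u64) -> u64 {
    n_inputs * m_outputs
}

/// Files written by the sort-based writer: `2 × N`, independent of `M`
/// (hypothetical helper, for illustration only).
fn sort_shuffle_files(n_inputs: u64) -> u64 {
    2 * n_inputs
}

fn main() {
    // (N input partitions, M output partitions): narrow, medium, and wide fan-out.
    for (n, m) in [(4u64, 1u64), (64, 200), (512, 1024)] {
        println!(
            "N={n} M={m}: hash-based={} files, sort-based={} files",
            hash_shuffle_files(n, m),
            sort_shuffle_files(n)
        );
    }
}
```

   For a narrow shuffle such as `N=4, M=1` the hash-based writer produces 
fewer files, which is consistent with keeping it available for that case. Once 
`M` exceeds 2, the sort-based count is the smaller of the two.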
   
   # What changes are included in this PR?
   
   - Flip the default of `ballista.shuffle.sort_based.enabled` from `false` to 
`true` in `ballista/core/src/config.rs`.
   - Update test expectations:
     - `context_checks.rs` EXPLAIN and EXPLAIN ANALYZE plans now show 
`SortShuffleWriterExec` for hash-partitioned stages (and the additional 
`spill_*` metrics).
     - `execution_graph_dot.rs` dot-graph fixtures show `SortShuffleWriter` for 
hash-partitioned stages; terminal stages with `partitioning: None` continue to 
use `ShuffleWriter`, since the planner only switches to sort-based when 
partitioning is `Hash(...)`.
     - `planner.rs` `distributed_window_plan` downcasts to 
`SortShuffleWriterExec`.
     - `sort_shuffle.rs` `create_hash_shuffle_context` now sets the flag 
explicitly to `false` since it can no longer rely on the default.
   - Update `docs/source/user-guide/tuning-guide.md` and 
`docs/source/user-guide/configs.md` to mark sort-based shuffle as the default 
and document how to switch back to the hash-based writer.
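
   The writer-selection rule reflected in the updated test expectations can be 
sketched as follows. This is a simplified model, not the planner code: the 
`Partitioning` and `WriterKind` types and the `choose_writer` function are 
hypothetical stand-ins, and only the rule itself (sort-based when the flag is 
on and partitioning is `Hash(...)`, otherwise the existing writer) comes from 
the description above.

```rust
/// Hypothetical stand-in for a stage's output partitioning scheme.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Partitioning {
    /// Hash partitioning into the given number of output partitions.
    Hash(usize),
}

/// Hypothetical label for the shuffle writer a stage ends up with.
#[derive(Debug, PartialEq, Eq)]
enum WriterKind {
    SortShuffleWriter,
    ShuffleWriter,
}

/// Pick a writer: sort-based only when the flag is enabled AND the stage is
/// hash-partitioned; every other combination keeps the existing writer.
fn choose_writer(sort_based_enabled: bool, partitioning: Option<Partitioning>) -> WriterKind {
    match partitioning {
        Some(Partitioning::Hash(_)) if sort_based_enabled => WriterKind::SortShuffleWriter,
        _ => WriterKind::ShuffleWriter,
    }
}

fn main() {
    // New default (flag = true): hash-partitioned stage gets the sort-based writer.
    println!("{:?}", choose_writer(true, Some(Partitioning::Hash(8))));
    // Terminal stage with `partitioning: None` keeps the existing writer.
    println!("{:?}", choose_writer(true, None));
    // Flag explicitly set to false: hash-partitioned stage keeps the existing writer.
    println!("{:?}", choose_writer(false, Some(Partitioning::Hash(8))));
}
```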
   
   # Are there any user-facing changes?
   
   Yes. Existing clusters that do not set `ballista.shuffle.sort_based.enabled` 
will switch from the hash-based writer to the sort-based writer on upgrade. The 
sort-based writer has different memory and disk-IO characteristics: it buffers 
per-output-partition batches in memory up to 
`ballista.shuffle.sort_based.memory_limit` (default 256 MiB) and begins 
spilling to disk once usage passes `spill_threshold` (default 80%). The 
hash-based writer remains available by setting 
`ballista.shuffle.sort_based.enabled=false`.
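
   As a rough guide for capacity planning, the sketch below estimates where 
spilling would begin under the defaults. It assumes that `spill_threshold` is 
applied as a fraction of the memory limit; that interpretation, and the 
`spill_trigger_bytes` helper, are assumptions made for this example rather 
than something taken from the implementation.

```rust
const MIB: u64 = 1024 * 1024;

/// Approximate number of buffered bytes at which spilling would start.
/// ASSUMPTION: the threshold is a fraction (0.0..=1.0) of the memory limit.
fn spill_trigger_bytes(memory_limit_bytes: u64, spill_threshold: f64) -> u64 {
    // Truncates any fractional byte.
    (memory_limit_bytes as f64 * spill_threshold) as u64
}

fn main() {
    // Defaults quoted above: 256 MiB memory limit, 80% spill threshold.
    let limit = 256 * MIB;
    let trigger = spill_trigger_bytes(limit, 0.8);
    println!(
        "memory limit: {limit} bytes, spilling starts near {trigger} bytes ({:.1} MiB)",
        trigger as f64 / MIB as f64
    );
}
```

   Under that assumption the defaults put the spill point at roughly 204.8 MiB 
of buffered shuffle data.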


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

