andygrove opened a new pull request, #1631:
URL: https://github.com/apache/datafusion-ballista/pull/1631

   # Which issue does this PR close?
   
   Closes #.
   
   # Rationale for this change
   
   The current `Benchmarking` section in the contributor development guide is a 
single paragraph that links out to `benchmarks/README.md`. The README is 
comprehensive on setup (TPC-H data generation, docker-compose, Spark 
comparison) but does not address the day-to-day contributor workflow: which 
benchmark to reach for when working on a specific change, how to read the 
metrics each one prints, and how to capture a flame graph when something is 
unexpectedly slow. The standalone `shuffle_bench` binary added in #1600 is also 
not documented anywhere in the user-visible docs.
   
   # What changes are included in this PR?
   
   - New page `docs/source/contributors-guide/benchmarking.md` with sections 
covering:
     - A decision guide for picking between the TPC-H runner, NYC taxi binary, 
the standalone `shuffle_bench`, and the `sort_shuffle` Criterion bench
     - TPC-H input generation with `tpchgen-rs`
     - Running each benchmark, including the Criterion `--save-baseline` / 
`--baseline` flow
     - Reading the shuffle metrics (`input_rows`, `write_time`, `spill_count`, 
`repart_time`)
     - Profiling with `cargo flamegraph` and `samply`
     - A short note on when to add a Criterion bench vs a binary
   - The `Benchmarking` section in `contributors-guide/development.md` now 
points at the new page and keeps the link to the comprehensive 
`benchmarks/README.md`
   - The new page is registered in the `index.rst` toctree under Contributors 
Guide
   
   # Are there any user-facing changes?
   
   Documentation only. No code or API changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

