alamb opened a new issue, #21937:
URL: https://github.com/apache/datafusion/issues/21937

   ### Is your feature request related to a problem or challenge?
   
   - part of https://github.com/apache/datafusion/issues/21706
   
   @Omega359 added a sweet new benchmark runner in 
   -  https://github.com/apache/datafusion/pull/21707 
   
   You run it like
   ```
   BENCH_NAME=tpch cargo bench --bench sql
   ```
   
   This is useful because it produces output via criterion, which integrates with 
standard Rust tooling.
   
   
   However, using criterion is a pain: it insists on running multiple 
iterations, and it makes it harder (for me) to run individual commands.
   
   ### Describe the solution you'd like
   
   I would also like to be able to run the benchmark scripts directly without 
having to use criterion, especially while
   iterating during performance work.
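   
   To illustrate what "running directly" could mean, here is a minimal sketch 
   of a timing loop that runs a workload a caller-chosen number of times 
   (including just once) with no statistical machinery. The `time_runs` helper 
   and the stand-in workload are hypothetical and are not taken from the 
   existing benchmark code.
   
   ```rust
   use std::time::{Duration, Instant};

   /// Run `work` exactly `iterations` times and return each wall-clock duration.
   /// Hypothetical helper: the name and signature are assumptions for this sketch.
   fn time_runs<F: FnMut()>(iterations: usize, mut work: F) -> Vec<Duration> {
       (0..iterations)
           .map(|_| {
               let start = Instant::now();
               work();
               start.elapsed()
           })
           .collect()
   }

   fn main() {
       // Stand-in workload; a real runner would execute a benchmark query here.
       let timings = time_runs(1, || {
           let sum: u64 = (0..1_000_000u64).sum();
           std::hint::black_box(sum);
       });
       for (i, t) in timings.iter().enumerate() {
           println!("run {i}: {t:?}");
       }
   }
   ```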
   
   
   I think we could / should take inspiration from the duckdb runner: 
https://github.com/duckdb/duckdb/tree/main/benchmark
   
   
   ### Describe alternatives you've considered
   
   I would love to get a standalone binary (that uses the same code as the 
criterion runner, but has a different way to invoke it)
   
   For example, something like a `benchmark_runner` binary
   ```
   # list all benchmarks
   cargo run --profile=profiling --bin benchmark_runner -- list
   ```
   
   Run the benchmarks for the tpch queries:
   ```
   cargo run --profile=profiling --bin benchmark_runner -- run "tpch"
   ```
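   
   As a rough sketch of how the two invocations above might be dispatched, 
   the following uses only the standard library. The `Command` enum and 
   `parse_command` function are hypothetical names for illustration; a real 
   implementation might well use an argument-parsing crate instead.
   
   ```rust
   /// Commands the hypothetical `benchmark_runner` binary could accept.
   #[derive(Debug, PartialEq)]
   enum Command {
       /// `list`: print every known benchmark name.
       List,
       /// `run <name>`: run the benchmark (or suite) with that name.
       Run { name: String },
   }

   /// Parse the arguments that follow the binary name.
   fn parse_command(args: &[String]) -> Result<Command, String> {
       let parts: Vec<&str> = args.iter().map(String::as_str).collect();
       match parts.as_slice() {
           ["list"] => Ok(Command::List),
           ["run", name] => Ok(Command::Run { name: (*name).to_string() }),
           _ => Err("usage: benchmark_runner <list | run NAME>".to_string()),
       }
   }

   fn main() {
       let args: Vec<String> = std::env::args().skip(1).collect();
       match parse_command(&args) {
           // Placeholders: the actual listing and running logic is out of scope here.
           Ok(Command::List) => println!("(would list benchmarks here)"),
           Ok(Command::Run { name }) => println!("(would run benchmark '{name}' here)"),
           Err(msg) => {
               eprintln!("{msg}");
               std::process::exit(2);
           }
       }
   }
   ```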
   
   We can start simple, but eventually I would love to have parameters similar 
to those of the duckdb runner: 
https://github.com/duckdb/duckdb/blob/793b055a4f9d5194e339c670733b660be0123901/benchmark/benchmark_runner.cpp#L204-L223
   
   Some examples for consideration
   
   ```
   --target-partitions (e.g.  n threads)
   --memory-limit
   --info
   --query
   ```
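   
   A minimal sketch of parsing these options, using only the standard 
   library. The semantics are assumptions, since they are not specified 
   above: `--target-partitions` is taken to be an integer, `--memory-limit` 
   an uninterpreted string, `--info` a boolean flag, and `--query` a value 
   selecting a single query. `RunOptions` and `parse_options` are 
   hypothetical names.
   
   ```rust
   /// Options for a hypothetical `benchmark_runner run` invocation.
   /// Field meanings are assumptions made for this sketch.
   #[derive(Debug, Default, PartialEq)]
   struct RunOptions {
       target_partitions: Option<usize>,
       memory_limit: Option<String>,
       info: bool,
       query: Option<String>,
   }

   /// Parse option flags; value-taking flags consume the next argument.
   fn parse_options(args: &[&str]) -> Result<RunOptions, String> {
       let mut opts = RunOptions::default();
       let mut iter = args.iter();
       while let Some(&flag) = iter.next() {
           match flag {
               "--info" => opts.info = true,
               "--target-partitions" | "--memory-limit" | "--query" => {
                   let value = *iter
                       .next()
                       .ok_or_else(|| format!("{flag} requires a value"))?;
                   match flag {
                       "--target-partitions" => {
                           let n = value
                               .parse::<usize>()
                               .map_err(|e| format!("invalid {flag} value '{value}': {e}"))?;
                           opts.target_partitions = Some(n);
                       }
                       "--memory-limit" => opts.memory_limit = Some(value.to_string()),
                       // Only `--query` remains in this arm.
                       _ => opts.query = Some(value.to_string()),
                   }
               }
               other => return Err(format!("unknown option: {other}")),
           }
       }
       Ok(opts)
   }

   fn main() {
       let opts = parse_options(&["--target-partitions", "8", "--memory-limit", "2G", "--info"]);
       println!("{opts:?}");
   }
   ```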
   
   ### Additional context
   
   If I were implementing this I would do it in a few PRs:
   1. Add the `benchmark_runner` binary and the `list` command
   2. Wire up the running logic
   3. Add in environment variables, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

