## Motivation
Currently, TVM lacks an up-to-date and reproducible benchmark. The only 
existing benchmark is hosted at 
[tvm/apps/benchmark](https://github.com/apache/incubator-tvm/tree/main/apps/benchmark).
 However, it is outdated and has several flaws.
1. The results were obtained 2 years ago.
2. The deep learning models are old. It does not include newer models (e.g., 
BERT, EfficientNet).
3. The input format is TVM's internal Relay format. It does not accept formats 
from high-level frameworks (e.g., PyTorch, MXNet) or an open exchange format 
(e.g., ONNX).
4. It does not cover Intel CPUs.
5. It only provides pre-tuned configurations from 
[tophub](https://github.com/tlc-pack/tophub), but does not provide the scripts 
used to generate these configurations.

This RFC aims to build a new open, reproducible benchmark for TVM. Once the 
new benchmark is ready, we can run the evaluation nightly and run auto-tuning 
weekly or monthly.

## Approach
As the first step, we target three models, three hardware platforms, and four 
code generation strategies.
To make comparison with other frameworks easier, we choose ONNX as the 
input model format.

- Models: ResNet-50, MobileNet V2, and BERT, all with batch size 1
- Hardware platforms: NVIDIA GPU, Intel CPU, ARM CPU
- Code generation strategies: AutoTVM, auto-scheduler, TVM + manual library, 
ONNX Runtime

All logs generated during auto-tuning will be uploaded for future 
reference.
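For concreteness, here is a minimal sketch of what a single benchmark measurement could look like, assuming an untuned run on an Intel CPU with the default `llvm` target. The file name `resnet50.onnx` and the input name `data` are placeholders, and some module names differ slightly across TVM versions (e.g., `graph_runtime` vs. `graph_executor` in older releases); the actual benchmark scripts will live in the repo below.

```python
# Sketch: load an ONNX model, build it with Relay, and measure latency.
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Placeholder model file and input shape (batch size 1, as in this RFC).
onnx_model = onnx.load("resnet50.onnx")
shape_dict = {"data": (1, 3, 224, 224)}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Build for a local CPU; the real benchmark would also apply tuning logs here.
target = "llvm"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

dev = tvm.cpu(0)
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("data", np.random.uniform(size=(1, 3, 224, 224)).astype("float32"))

# Measure end-to-end latency with TVM's time evaluator.
timer = module.module.time_evaluator("run", dev, number=100, repeat=3)
print("mean latency: %.2f ms" % (timer().mean * 1000))
```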

I have created a [tlc-bench](https://github.com/tlc-pack/tlc-bench) repo and 
opened a [roadmap](https://github.com/tlc-pack/tlc-bench/issues/1#roadmap) issue. I 
am looking for contributors who are interested in helping.




