This RFC outlines a high-level roadmap towards what we might consider a
standalone µTVM (Micro TVM). By "standalone," we mean a
cohesive set of features that enables a few end-user goals, one of them
being standalone execution of optimized TVM models on-device.
In the coming weeks, I'll be posting RFCs (as they're written) for work that
enables these goals. This RFC is meant to serve as context for those RFCs, as
well as an overall place to discuss high-level goals of µTVM. I'm definitely
interested in everyone's thoughts on this overall direction for µTVM.
## Goals
This roadmap aims to enable these potential end-user goals:
1. Test simple models on supported hardware without writing any microcontroller
code.
Here, "simple models" means models without conditional execution that fit
wholly in device flash and can be evaluated without reusing RAM.
A user should be able to view execution time (timed accurately on device), code
size, memory consumption, and model output. It should be feasible to test
performance under different SoC configurations, though this may involve writing
microcontroller firmware or e.g. tweaking RTOS settings.
2. Easily package a tested model (from #1) into a C library with the following
properties:
- BYO memory allocator, plus a provided standard allocator with BYO buffer
- no malloc() calls (outside of calls to the internal memory allocator)
- graph-based runtime, but the graph can be fixed/compiled AOT
- can be configured to use the same supporting library functions as are
used during autotuning/eval. Specifically, this means that the same TVMBackend
functions invoked during autotuning are also invoked in production.
3. Easily autotune supported operators without having to write too much TVM
code beyond the model definition. You shouldn't have to understand how TVM
works to try autotuning.
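
As a rough illustration of the allocator properties in goal #2 (a provided allocator working from a BYO buffer, with no malloc() calls), here is a minimal C sketch of a bump allocator. All names (`Arena`, `arena_init`, `arena_alloc`, `arena_reset`) are hypothetical and are not part of the TVM C runtime; the actual design is left to the RFCs that follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is NOT the TVM CRT allocator API. */
typedef struct {
  uint8_t* base;   /* caller-supplied buffer (the "BYO buffer") */
  size_t capacity; /* size of that buffer in bytes */
  size_t offset;   /* next free byte, relative to base */
} Arena;

/* Bind the arena to a caller-owned buffer; the heap is never touched. */
void arena_init(Arena* a, void* buffer, size_t capacity) {
  assert(a != NULL && buffer != NULL);
  a->base = (uint8_t*)buffer;
  a->capacity = capacity;
  a->offset = 0;
}

/* Bump-allocate `size` bytes aligned to `align` (must be a power of two).
 * Returns NULL when the buffer is exhausted instead of calling malloc(). */
void* arena_alloc(Arena* a, size_t size, size_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  uintptr_t start = (uintptr_t)a->base + a->offset;
  uintptr_t aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
  size_t padding = (size_t)(aligned - start);
  size_t remaining = a->capacity - a->offset;
  if (padding > remaining) return NULL;
  if (size > remaining - padding) return NULL;
  a->offset += padding + size;
  return (void*)aligned;
}

/* Release everything at once, e.g. between inference runs. */
void arena_reset(Arena* a) { a->offset = 0; }
```

A firmware engineer would typically back the arena with a statically sized array (for example `static uint8_t workspace[4096];`) so that the worst-case memory footprint is visible at link time.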
## Projects
We think these projects are the right ones to pursue in order to enable these
goals. More detail for each of these projects will be given in RFCs to follow
this one.
1. **µTVM On-Device RPC Server
([PoC](https://github.com/areusch/incubator-tvm/tree/utvm-runtime))**
[u]Description[/u]: Following the [RPC modularization
PR](https://github.com/apache/incubator-tvm/pull/5484), we propose to port the
[TVM C
Runtime](https://github.com/apache/incubator-tvm/tree/master/src/runtime/crt)
to bare metal targets, and use the
[MinRPC](https://github.com/apache/incubator-tvm/blob/master/src/runtime/rpc/minrpc/minrpc_server.h)
server to implement a (limited) TVM RPC server on-device using any pipe-like
transport (e.g. UART, Ethernet, USB, or semihosting). This implements only
the C++ RPCEndpoint on device, not other features implemented behind
PackedFuncs, such as LoadModule, GraphRuntime, etc.
[u]Rationale[/u]: TVM currently encodes device-specific memory layouts in
the repository. In addition, TVM needs some way to specify the SoC
configuration (e.g. oscillator, caches, power modes) to reliably reproduce
results. In order to scale past a few devices, effectively use flash, and take
advantage of platform efforts such as Zephyr, mBED, Mynewt, and others, TVM
should adopt a more portable µTVM compilation/linking strategy.
2. **µTVM CI in TVM.**
[u]Description[/u]: Write a CI test for the On-Device Runtime against x86
and potentially simulated bare-metal implementations (e.g. QEMU or other device
emulators). Run the CI as part of the TVM pre-submit. We don't intend to
include real hardware in the TVM pre-submit. Outside of the pre-submit, we'd
like to encourage use of the CI test to validate implementations of the
on-device runtime on real hardware.
[u]Rationale[/u]: Some CI test is needed to protect against breakages in
the CRT on bare metal. The CI should be executable by all TVM contributors,
since it will be in the pre-submit. The same test as is used in the pre-submit
should be sufficient to validate real hardware.
3. **Enable AutoTVM using the on-device runtime.**
[u]Description[/u]: Modify the AutoTVM build process to create µTVM
On-Device Runtime binaries and flash them as appropriate for the target
platform.
[u]Rationale[/u]: AutoTVM needs to evaluate performance in scenarios that
exactly mimic real-world device configuration.
4. **Place Model Weights in Flash.**
[u]Description[/u]: Modify C codegen to output supplied model weights as
const arrays, possibly with a user-specified section.
[u]Rationale[/u]: Allows for more realistic use of device memory and allows
larger models to fit.
5. **Graph Runtime on bare metal.**
[u]Description[/u]: Make the graph runtime or full-model execution work on
bare metal with a firmware-friendly interface. Without this project, models
still need to be driven end-to-end by a connected TVM "supervisor" instance
containing the GraphRuntime. This change enables firmware engineers to
integrate TVM models into production applications.
[u]Rationale[/u]: Supports goal #2, and allows us a chance to ensure that
the on-device runtime executes graphs in the same way both during AutoTVM and
during production.
6. **Export stats from the on-device runtime.**
[u]Description[/u]: Provide RPC calls for stats such as execution time and
memory usage.
[u]Rationale[/u]: Supports goal #1. Allows firmware engineers to better
evaluate TVM model output and collaborate with other engineers/data scientists
involved with model development.
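
To make project #4 more concrete, here is a hedged sketch of what generated C with weights emitted as const arrays in a user-specified section might look like. The section name `.rodata.model_weights`, the array `dense_weights`, its values, and the function `dense_dot` are all made up for illustration; this is not actual TVM C codegen output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name; a real project would match its linker script.
 * The attribute is a GCC/Clang extension, so it is compiled out elsewhere
 * (and on Apple targets, which use a different section naming scheme). */
#if defined(__GNUC__) && !defined(__APPLE__)
#define MODEL_WEIGHTS_SECTION __attribute__((section(".rodata.model_weights")))
#else
#define MODEL_WEIGHTS_SECTION
#endif

#define DENSE_LEN 4

/* `const` allows the linker to keep the data in read-only memory (flash on
 * most microcontrollers) instead of copying it into RAM at startup.
 * The values are placeholders, not real model weights. */
MODEL_WEIGHTS_SECTION
const int8_t dense_weights[DENSE_LEN] = {1, -2, 3, -4};

/* Reads the weights in place; only the input and accumulator live in RAM. */
int32_t dense_dot(const int8_t* input, size_t len) {
  assert(input != NULL && len <= DENSE_LEN);
  int32_t acc = 0;
  for (size_t i = 0; i < len; ++i) {
    acc += (int32_t)input[i] * (int32_t)dense_weights[i];
  }
  return acc;
}
```

Whether the array really ends up in flash depends on the target's linker script, which is one reason a user-specified section is useful.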
## Proof of Concept
Parts of project #1 work to some degree
[here](https://github.com/areusch/incubator-tvm/tree/utvm-runtime). The
short-term plan is to split this PoC into a couple of pieces, each with its own
RFC, and discuss/merge piece by piece.
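
For readers unfamiliar with the "pipe-like transport" idea in project #1, the following C sketch shows one way server code could be written against an abstract byte pipe. Everything here (`PipeTransport`, `Loopback`, `pipe_write_all`) is a hypothetical illustration and is not the MinRPC or PoC interface; an in-memory loopback stands in for a real UART or USB driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interface: NOT the MinRPC or TVM CRT API. */
typedef struct {
  void* ctx; /* transport-specific state (UART handle, socket, ...) */
  /* Each callback returns the number of bytes actually moved (may be short). */
  size_t (*write)(void* ctx, const uint8_t* data, size_t len);
  size_t (*read)(void* ctx, uint8_t* data, size_t len);
} PipeTransport;

/* In-memory loopback used here in place of real hardware. */
typedef struct {
  uint8_t buf[64];
  size_t head; /* next byte to read */
  size_t tail; /* next free slot to write */
} Loopback;

size_t loopback_write(void* ctx, const uint8_t* data, size_t len) {
  Loopback* lb = (Loopback*)ctx;
  size_t room = sizeof(lb->buf) - lb->tail;
  size_t n = len < room ? len : room;
  memcpy(lb->buf + lb->tail, data, n);
  lb->tail += n;
  return n;
}

size_t loopback_read(void* ctx, uint8_t* data, size_t len) {
  Loopback* lb = (Loopback*)ctx;
  size_t avail = lb->tail - lb->head;
  size_t n = len < avail ? len : avail;
  memcpy(data, lb->buf + lb->head, n);
  lb->head += n;
  return n;
}

/* Server-side helper that only sees the abstract pipe: keep calling write()
 * until every byte is accepted or the transport stops making progress.
 * Returns 0 on success and -1 if the transport accepts no more data. */
int pipe_write_all(const PipeTransport* t, const uint8_t* data, size_t len) {
  assert(t != NULL && t->write != NULL);
  while (len > 0) {
    size_t n = t->write(t->ctx, data, len);
    if (n == 0) return -1;
    data += n;
    len -= n;
  }
  return 0;
}
```

Because the server logic touches only the two callbacks, swapping the loopback for a UART or semihosting backend would not require changes to the code that frames and sends messages.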
## Next Steps
This roadmap is just an initial concept and we'd definitely like to work with
the community to make sure this direction is useful for others. We intend to
drive some of this work from OctoML, but there are a lot of tasks and there are
plenty of ways to get involved.
More immediately we'd love feedback on the overall direction. The On-Device RPC
Server (project #1) underpins most of the rest of the work, so we'd welcome
review on our initial implementation (RFCs and PRs to come soon). Once that
lands in the CI, it should be much easier to collaborate on the rest of this
effort.
We'll also have a [µTVM-focused
meetup](https://discuss.tvm.ai/t/utvm-embedded-focus-online-meetup/6908) on
Wednesday 6/17 9am PDT if you'd like to discuss in a higher-bandwidth setting.
We'll post any followup points for discussion on the forum.