This RFC outlines a high-level roadmap towards what we might consider a 
standalone µTVM (Micro TVM). By "standalone," we mean a cohesive set of 
features that enables a few end-user goals, one of which is standalone 
execution of optimized TVM models on-device.

In the coming weeks, I'll be posting RFCs (as they're written) for work that 
enables these goals. This RFC is meant to serve as context for those RFCs, as 
well as an overall place to discuss high-level goals of µTVM. I'm definitely 
interested in everyone's thoughts on this overall direction for µTVM.

## Goals

This roadmap aims to enable these potential end-user goals:

1. Test simple models on supported hardware without writing any microcontroller 
code.
Here, "simple models" means models without conditional execution that fit 
entirely in device flash and can be evaluated without reusing RAM.
A user should be able to view execution time (timed accurately on device), code 
size, memory consumption, and model output. It should be feasible to test 
performance under different SoC configurations, though this may involve writing 
microcontroller firmware or e.g. tweaking RTOS settings.
2. Easily package a tested model (from #1) into a C library with the following 
properties:
    - BYO memory allocator, plus a provided standard allocator with BYO buffer
    - no malloc() calls (outside of calls to the internal memory allocator)
    - graph-based runtime, but the graph can be fixed/compiled AOT
    - can be configured to use the same supporting library functions as are 
used during autotuning/eval. Specifically, this means that the same TVMBackend 
functions invoked during autotuning are also invoked in production.
3. Easily autotune supported operators without having to write too much TVM 
code beyond the model definition. You shouldn't have to understand how TVM 
works to try autotuning.
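
To make the "BYO buffer, no malloc()" property in goal #2 concrete, here is a minimal sketch of a bump allocator that hands out memory only from a buffer the application supplies. All names (`example_arena_t`, `example_arena_alloc`, etc.) are hypothetical and chosen for illustration; this is not the TVM CRT allocator API, and a real allocator would likely also need per-block free support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is NOT the TVM CRT allocator API. */
typedef struct {
  uint8_t *base; /* start of the caller-supplied buffer */
  size_t size;   /* total bytes available in the buffer */
  size_t used;   /* bytes handed out so far, including alignment padding */
} example_arena_t;

/* Bind the arena to a buffer the application owns (e.g. a static array). */
void example_arena_init(example_arena_t *arena, void *buffer, size_t size) {
  assert(arena != NULL && buffer != NULL);
  arena->base = (uint8_t *)buffer;
  arena->size = size;
  arena->used = 0;
}

/* Bump-allocate nbytes aligned to `align` (must be a power of two).
 * Returns NULL when the buffer cannot satisfy the request. */
void *example_arena_alloc(example_arena_t *arena, size_t nbytes, size_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  uintptr_t start = (uintptr_t)arena->base + arena->used;
  uintptr_t aligned = (start + (align - 1)) & ~((uintptr_t)align - 1);
  size_t padding = (size_t)(aligned - start);
  size_t remaining = arena->size - arena->used;
  if (padding > remaining || nbytes > remaining - padding) {
    return NULL;
  }
  arena->used += padding + nbytes;
  return (void *)aligned;
}

/* Release everything at once, e.g. between inferences. */
void example_arena_reset(example_arena_t *arena) { arena->used = 0; }
```

A firmware application would typically call `example_arena_init` once with a statically allocated array and route all runtime allocations through `example_arena_alloc`, so the worst-case memory footprint is fixed at link time.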

## Projects

We think these projects are the right ones to pursue in order to enable these 
goals. More detail for each of these projects will be given in RFCs to follow 
this one. 

1. **µTVM On-Device RPC Server 
([PoC](https://github.com/areusch/incubator-tvm/tree/utvm-runtime))**

    [u]Description[/u]: Following the [RPC modularization 
PR](https://github.com/apache/incubator-tvm/pull/5484), we propose to port the 
[TVM C 
Runtime](https://github.com/apache/incubator-tvm/tree/master/src/runtime/crt) 
to bare metal targets, and use the 
[MinRPC](https://github.com/apache/incubator-tvm/blob/master/src/runtime/rpc/minrpc/minrpc_server.h)
 server to implement a (limited) TVM RPC server on-device using any pipe-like 
transport (e.g. UART, Ethernet, USB, or semihosting). This implements only 
the C++ RPCEndpoint on device, not other features implemented behind 
PackedFuncs, such as LoadModule or GraphRuntime.

    [u]Rationale[/u]: TVM currently encodes device-specific memory layouts in 
the repository. In addition, TVM needs some way to specify the SoC 
configuration (e.g. oscillator, caches, power modes) to reliably reproduce 
results. In order to scale past a few devices, effectively use flash, and take 
advantage of platform efforts such as Zephyr, mBED, Mynewt, and others, TVM 
should adopt a more portable µTVM compilation/linking strategy.
2. **µTVM CI in TVM.**

    [u]Description[/u]: Write a CI test for the On-Device Runtime against x86 
and potentially simulated bare-metal implementations (e.g. QEMU or other device 
emulators). Run the CI as part of the TVM pre-submit. We don't intend to 
include real hardware in the TVM pre-submit. Outside of the pre-submit, we'd 
like to encourage use of the CI test to validate implementations of the 
on-device runtime on real hardware.

    [u]Rationale[/u]: Some CI test is needed to protect against breakages in 
the CRT on bare metal. The CI should be executable by all TVM contributors, 
since it will be in the pre-submit. The same test as is used in the pre-submit 
should be sufficient to validate real hardware.
3. **Enable AutoTVM using the on-device runtime.** 

    [u]Description[/u]: Modify the AutoTVM build process to create µTVM 
On-Device Runtime binaries and flash them as appropriate for the target 
platform.

    [u]Rationale[/u]: AutoTVM needs to evaluate performance in scenarios that 
exactly mimic real-world device configuration. 
4. **Place Model Weights in Flash.**

    [u]Description[/u]: Modify C codegen to output supplied model weights as 
const arrays, possibly with a user-specified section.

    [u]Rationale[/u]: Allows for more realistic use of device memory and allows 
larger models to fit.
5. **Graph Runtime on bare metal.**

    [u]Description[/u]: Make the graph runtime or full-model execution work on 
bare metal with a firmware-friendly interface. Without this project, models 
still need to be driven end-to-end by a connected TVM "supervisor" instance 
containing the GraphRuntime. This change enables firmware engineers to 
integrate TVM models into production applications.

    [u]Rationale[/u]: Supports goal #2, and allows us a chance to ensure that 
the on-device runtime executes graphs in the same way both during AutoTVM and 
during production.
6. **Export stats from the on-device runtime.**

    [u]Description[/u]: Provide RPC calls for stats such as execution time 
and memory usage.

    [u]Rationale[/u]: Supports goal #1. Allows firmware engineers to better 
evaluate TVM model output and collaborate with other engineers/data scientists 
involved with model development.
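
As an illustration of project #4, the sketch below shows roughly what "weights as const arrays, possibly with a user-specified section" could look like in generated C. The section name, the array name, and the weight values are all made up for this example, and the actual codegen output may differ. On typical microcontroller toolchains, `const` data like this is placed in flash by the linker script, but that depends on the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name; a real project would match its linker script.
 * The attribute is applied only on GCC/Clang ELF targets. */
#if defined(__GNUC__) && defined(__ELF__)
#define EXAMPLE_WEIGHT_SECTION __attribute__((section(".rodata.example_weights")))
#else
#define EXAMPLE_WEIGHT_SECTION
#endif

/* Made-up values standing in for a parameter tensor emitted by codegen. */
EXAMPLE_WEIGHT_SECTION
const int8_t example_dense_weight[2][3] = {
    {1, -2, 3},
    {4, 5, -6},
};

/* Computes y = W * x for the tiny layer above. The weights are read in
 * place from the const array and are never copied into a RAM buffer. */
void example_dense(const int8_t x[3], int32_t y[2]) {
  assert(x != NULL && y != NULL);
  for (size_t row = 0; row < 2; ++row) {
    int32_t acc = 0;
    for (size_t col = 0; col < 3; ++col) {
      acc += (int32_t)example_dense_weight[row][col] * (int32_t)x[col];
    }
    y[row] = acc;
  }
}
```

Keeping the section name behind a macro lets a firmware engineer redirect the weights to whichever flash region their linker script defines without touching the generated arrays.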

## Proof of Concept

Parts of project #1 work to some degree 
[here](https://github.com/areusch/incubator-tvm/tree/utvm-runtime). The 
short-term plan is to split this PoC into a couple of pieces, each with its own 
RFC, and discuss/merge piece by piece.

## Next Steps

This roadmap is just an initial concept and we'd definitely like to work with 
the community to make sure this direction is useful for others. We intend to 
drive some of this work from OctoML, but there are a lot of tasks and there are 
plenty of ways to get involved.

More immediately we'd love feedback on the overall direction. The On-Device RPC 
Server (project #1) underpins most of the rest of the work, so we'd welcome 
review on our initial implementation (RFCs and PRs to come soon). Once that 
lands in the CI, it should be much easier to collaborate on the rest of this 
effort.

We'll also have a [µTVM-focused 
meetup](https://discuss.tvm.ai/t/utvm-embedded-focus-online-meetup/6908) on 
Wednesday 6/17 9am PDT if you'd like to discuss in a higher-bandwidth setting. 
We'll post any followup points for discussion on the forum.




