Great, a few final clarifications.
> The Module Library Format seems not fully finalized yet :wink: That's fine. I
> will generate the structure as per your RFC proposal (no crt), and we can
> refine it from there. This is a minor detail.

It is somewhat of a living standard, but it's versioned. If you have tests for your implementation, we will run them as we make changes and bump the Model Library Format version.

One clarification we do need to make here: Model Library Format is generated with the function `tvm.micro.export_model_library_format`, and the generated directory tree is given as an argument in [Project API](https://discuss.tvm.apache.org/t/rfc-tvm-project-api/9449) to `generate_project`. I think you should just need to modify your codegen to consume Model Library Format rather than also making a generator for it. Sorry if that was unclear, and let me know if something seems fundamentally broken with that approach. Right now, Model Library Format includes graph executor configuration and so suggests the executor that should be used. I think you can just ignore that piece and/or use it to drive your codegen.

With all this said, we just have a [PoC](https://github.com/areusch/incubator-tvm/commit/b86d40a66894c08e74c952f42fd600efbe351625) of Project API we're developing now. Currently there is just a [demo](https://github.com/areusch/incubator-tvm/commit/b86d40a66894c08e74c952f42fd600efbe351625#diff-54b391a3e85f1bac817634088c77742416fbc7c8bf551d823524f80cd577464d) of an implementation for the host C runtime. The remaining items before committing the PoC are:

- Develop the Zephyr API implementation
- Migrate apps/bundle_deploy to use Project API

I'll try to post the Zephyr implementation as a (loose) example (e.g. the Zephyr impl would not do runtime generation nor memory pinning) of what I'm thinking for STM32 codegen by end-of-week. Let me know what you think of this approach. We could expand the content of Model Library Format if that was necessary for an STM32 implementation. The benefit of doing this is that autotuning is going to use Project API to drive the build/flash/timing pipeline, so it would be a more natural shift as we move towards that.

There is one additional detail not yet ironed out: the code you would want to generate for autotuning is very different from that you'd want to generate for inference. My vision for this was to have two different project generators (e.g. `apps/microtvm/stm32/inference` and `apps/microtvm/stm32/autotune`). In this proposal, the `inference` project would essentially be implemented as you guys have done now, and `autotune` would need to include the TVM RPC server and logic to drive the RPC transport over e.g. UART, USB, etc. Let me know what you think of this idea.

> I propose to start with the STM32 code emitter now and work together with
> the TIR-based AoT on converging to a common understanding. This will pave the
> way for us to move to the TIR-based code generator. We can perhaps also
> contribute to its development.

Great, that sounds good. Let's discuss the API convergence in a follow-on RFC. I'm not sure I see exact unification on naming across frameworks, but I agree that the structure of our API is a bit divergent from the other embedded AI platforms. The API change will affect many, so we'll need to have a focused discussion and loop in quite a few others.

@giuseros @ramana-arm, possible to give an update on the AOT progress?
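To make the Model Library Format → Project API hand-off above a bit more concrete, here is a rough sketch of the Python side using a toy Relay model. The `generate_project` entry point, the `apps/microtvm/stm32/inference` template path, and the `board` option are illustrative and modeled on the PoC, not a finalized API:

```python
import os

import numpy as np
import tvm
from tvm import relay

# Toy Relay model standing in for the real network (illustrative only).
x = relay.var("x", shape=(1, 8), dtype="float32")
w = relay.const(np.random.rand(8, 8).astype("float32"))
relay_mod = tvm.IRModule.from_expr(relay.Function([x], relay.nn.dense(x, w)))

# Build for a bare-metal C target; the exact target string depends on the board.
with tvm.transform.PassContext(opt_level=3, config={"tir.disable_vectorize": True}):
    module = relay.build(relay_mod, target="c -runtime=c")

# Export Model Library Format (depending on the version this is a .tar or a
# directory tree). This artifact is what the STM32 codegen would consume,
# rather than generating its own layout.
os.makedirs("build", exist_ok=True)
tvm.micro.export_model_library_format(module, "build/model.tar")

# Hand the exported artifact to a template project via Project API. This call
# is modeled on the PoC and is illustrative only.
tvm.micro.generate_project(
    "apps/microtvm/stm32/inference",   # proposed template project directory
    module,
    "build/stm32_demo_project",        # where the standalone project is written
    options={"board": "stm32f746g_disco"},
)
```

The idea is that the STM32 emitter runs inside the template project's `generate_project` hook, consuming the exported tree rather than producing its own copy of it.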
>> When we do tensor pinning, I think it's likely I'll propose to add some
>> tensor_id (note: different from storage_id, as storage_id could contain
>> multiple tensor_id) to TVMBackendAllocWorkspace, and a lookup table could
>> just return a pointer into the pre-allocated memory pool.
>> TVMBackendFreeWorkspace would become a no-op. Will that work for you guys?
>
> That is good. Just keep in mind that these memory pools should be open to a
> static allocation as a section via a link script, to a static allocation as a
> table from the main application (.data), and to the dynamic allocation via
> whatever allocator the application may choose.

Yeah, this is all part of that. In particular, some accelerators may need a subset of parameters to live in a memory pool placed at a fixed address for faster loading at startup.

> > * consider removing the need to use PackedFunc looked-up by string name,
> > and instead provide more natural C wrappers around those functions
>
> Already the case.
>
> - We will add an API method for such lookup implementing the mapping.

Here, my goal is just to implement simpler code generation for `tir.call_packed` nodes that avoids a string lookup at inference time (i.e. avoids calling `TVMBackendGetFuncFromEnv`).

> ### Actions for us:
>
> Re-submit the PR with this:
>
> 1. Move to generating Module Library Format (as it is for now).
> 2. Provide the docker and a test application for the sanity CI.
> 3. Move to Project API on the demo side (structure + `microtvm_api_server.py`)
>    implementing the Standalone Demo Project Generator based on your
>    [PoC](https://github.com/areusch/incubator-tvm/commit/b86d40a66894c08e74c952f42fd600efbe351625).
>
> We continue discussion on the C runtime API, how to involve the AoT people?
> We can contribute to the development if necessary.
>
> Does this work for you?

Aside from (1), which I think can be generated with `tvm.micro.export_model_library_format`, that seems like a great plan to me! I've tagged the AOT implementers; hopefully they can give a status update here.
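For item (3), here is a rough skeleton of what an STM32 `microtvm_api_server.py` could look like, following the structure of the PoC. The base class, hook names, and `ServerInfo` fields below are taken from the PoC and may change before it is committed, so treat this as a sketch rather than a reference implementation:

```python
# microtvm_api_server.py -- sketch of a Standalone Demo Project Generator for
# STM32, modeled on the Project API PoC. All names are illustrative.
import pathlib
import shutil
import subprocess

from tvm.micro.project_api import server


class Handler(server.ProjectAPIHandler):
    def server_info_query(self, tvm_version):
        # Describe this template project; user-facing options (e.g. the target
        # board) would be declared in project_options.
        return server.ServerInfo(
            platform_name="stm32",
            is_template=True,
            model_library_format_path="",
            project_options=[],
        )

    def generate_project(self, model_library_format_path, standalone_crt_dir, project_dir, options):
        # Copy in Model Library Format and run the STM32 emitter against it.
        project_dir = pathlib.Path(project_dir)
        project_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(model_library_format_path, project_dir / "model.tar")
        # ... invoke the STM32 codegen on the extracted Model Library Format tree ...

    def build(self, options):
        subprocess.run(["make", "-C", "build"], check=True)

    def flash(self, options):
        subprocess.run(["make", "-C", "build", "flash"], check=True)

    def open_transport(self, options):
        # Only needed for the autotune flavor, which hosts the TVM RPC server.
        raise NotImplementedError()

    def close_transport(self):
        raise NotImplementedError()

    def read_transport(self, n, timeout_sec):
        raise NotImplementedError()

    def write_transport(self, data, timeout_sec):
        raise NotImplementedError()


if __name__ == "__main__":
    server.main(Handler())
```

In the two-generator proposal above, the `inference` flavor could leave the transport hooks unimplemented, while the `autotune` flavor would implement them to drive the RPC transport over e.g. UART or USB.

-Andrew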