We have built the infrastructure for Bring-Your-Own-Codegen (BYOC). For
demonstration purposes, a simple CSourceModule-style codegen and runtime is
used for ccompiler and dnnl (now called oneDNN). The CSourceModule runtime
works reasonably well on small examples and is easy to understand. However, it
poses several challenges for the development and deployment of relatively
large models or models with relatively large inputs.
- Serialization is cumbersome because it normally works on a per-operator
basis and emits a wrapper to invoke the library.
- Handling large constants is difficult. We currently have to either introduce
countless assignments or allocate a large chunk of memory in the static
segment. Both approaches may significantly increase the compilation time.
- For certain backends, like TRT and dnnl, CSourceModule complicates the use of
their execution engines or even makes it impossible.
This RFC proposes a JSON runtime, paired with a JSON serializer, for BYOC to
address the problems above. In addition, this type of runtime is more familiar
to the community because the graph runtime is more or less in this style, and
we have already implemented a minimal example runtime. This RFC extends the
minimal example and generalizes it to all backends with an execution engine.
- JSON nodes and code generator/serializer
- Data structures to represent the nodes and entries in a JSON runtime.
The serializer converts a Relay program into JSON format.
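
As a rough illustration of the node and entry idea, the sketch below models a subgraph as a flat list of nodes whose inputs point at other nodes' outputs, and prints it as JSON. The names `NodeEntry`, `Node`, and `ToJSON`, the field layout, and the output format are all assumptions made for this example, not the proposed classes; string escaping is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical: a reference to one output of another node in the graph.
struct NodeEntry {
  uint32_t node_id;  // index of the producing node
  uint32_t index;    // which output of that node
};

// Hypothetical: one node of the serialized subgraph.
struct Node {
  std::string op_type;            // e.g. "input", "const", or "kernel"
  std::string name;               // operator or variable name
  std::vector<NodeEntry> inputs;  // data dependencies
};

// Emit the node list as a JSON array (no string escaping in this sketch).
std::string ToJSON(const std::vector<Node>& nodes) {
  std::ostringstream os;
  os << "[";
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    const Node& n = nodes[i];
    if (i != 0) os << ",";
    os << "{\"op\":\"" << n.op_type << "\",\"name\":\"" << n.name
       << "\",\"inputs\":[";
    for (std::size_t j = 0; j < n.inputs.size(); ++j) {
      if (j != 0) os << ",";
      os << "[" << n.inputs[j].node_id << "," << n.inputs[j].index << "]";
    }
    os << "]}";
  }
  os << "]";
  return os.str();
}
```

A real serializer would walk the Relay expression and append one such node per call, variable, or constant; here the node list is simply built by hand.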
```c++
class JSONGraphNodeEntry {};
class JSONGraphNode {};
// Serialize a Relay program into JSON format. The graph and params
// should be saved in the same artifact.
class JSONSerializer : public ExprVisitor {};
```
- JSONRuntimeDriver
- Deserialize the artifact and manage the initialization and invocation
of the runtime.
- Cache the engine when loading the library
```c++
class JSONRuntimeDriver : public ModuleNode {
void Deserialize(); // Deserialize the artifact and engines
PackedFunc GetFunction(); // Invoke a subgraph using its symbol
static Module LoadFromBinary(); // Load the JSON binary
void SaveToBinary(); // Save the module
};
```
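
To make the dispatch-by-symbol and engine caching idea concrete, here is a minimal standalone sketch. It does not use TVM types: `Driver`, `Subgraph`, `Add`, and `Get` are hypothetical names, `std::function<void()>` stands in for `PackedFunc`, and building the engine lazily on first call (rather than at load time) is an assumption of this example.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a deserialized subgraph plus its cached engine.
struct Subgraph {
  std::string json;     // serialized graph for this symbol
  int build_count = 0;  // how many times the engine was built
  int run_count = 0;    // how many times the subgraph was invoked
};

// Hypothetical driver: owns one subgraph per symbol and dispatches by name.
class Driver {
 public:
  // Register the JSON for a symbol, as deserializing the artifact might do.
  void Add(const std::string& symbol, const std::string& json) {
    subgraphs_[symbol].json = json;
  }

  // Return a callable for the symbol. The engine is built once and reused.
  // The callable holds a pointer into the map, so it must not outlive the
  // driver.
  std::function<void()> GetFunction(const std::string& symbol) {
    auto it = subgraphs_.find(symbol);
    if (it == subgraphs_.end()) {
      throw std::out_of_range("unknown symbol: " + symbol);
    }
    Subgraph* sg = &it->second;
    return [sg]() {
      if (sg->build_count == 0) ++sg->build_count;  // build and cache once
      ++sg->run_count;                              // invoke the engine
    };
  }

  const Subgraph& Get(const std::string& symbol) const {
    return subgraphs_.at(symbol);
  }

 private:
  std::map<std::string, Subgraph> subgraphs_;
};
```

Calling the returned function twice for the same symbol builds the engine only once, which is the behavior the caching bullet asks for.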
- JSONRuntimeBase
- The base for handling a graph. It will be extended by the concrete
backends, like TRT, dnnl, and other accelerators.
```c++
class JSONRuntimeBase : public ModuleNode {
virtual void Run() = 0; // Invoke an engine
virtual void Init() = 0; // Build an engine
// Utilities to save and load a json graph.
};
```
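
The `Init`/`Run` split can be illustrated with a toy backend. Everything in this sketch is hypothetical: `RuntimeBase` and `ToyRuntime` are simplified stand-ins that do not derive from `ModuleNode`, the "graph" is just a list of op names, and the "engine" is a list of scalar kernels rather than a real TRT or dnnl engine.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base mirroring the Init/Run split.
class RuntimeBase {
 public:
  explicit RuntimeBase(std::vector<std::string> ops) : ops_(std::move(ops)) {}
  virtual ~RuntimeBase() = default;
  virtual void Init() = 0;  // build an engine from the graph
  virtual void Run() = 0;   // invoke the engine

 protected:
  std::vector<std::string> ops_;  // stand-in for the parsed JSON nodes
};

// Hypothetical toy backend: its engine is a chain of float -> float kernels.
class ToyRuntime : public RuntimeBase {
 public:
  using RuntimeBase::RuntimeBase;

  // Translate each op name into a kernel once, so Run() does no lookups.
  void Init() override {
    engine_.clear();
    for (const auto& op : ops_) {
      if (op == "relu") {
        engine_.push_back([](float v) { return v > 0 ? v : 0.0f; });
      } else if (op == "negative") {
        engine_.push_back([](float v) { return -v; });
      } else {
        throw std::invalid_argument("unsupported op: " + op);
      }
    }
  }

  // Apply the kernels in order to the single scalar "tensor".
  void Run() override {
    for (auto& kernel : engine_) data = kernel(data);
  }

  float data = 0.0f;  // input before Run(), output after

 private:
  std::vector<std::function<float(float)>> engine_;
};
```

A concrete backend would follow the same pattern: parse the JSON nodes in `Init`, construct the vendor engine once, and only bind inputs and execute in `Run`.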
- Open questions
- Symbolic representation of op attributes, e.g. `Expr start` and `Expr end`
in the `arange` op. Normally, we should not offload this type of node to
accelerators, but how can we serialize them if we do want to support them,
given that some of them may not be data-dependent?
- It's intuitive for BYOC to be used along with uTVM. How will this JSON
runtime be connected with other runtimes like uTVM?
@tqchen @thierry @matt-arm @masahi @comaniac @manupa-arm @jonso @ramana-arm