Frameworks and state-of-the-art models are moving more and more toward 
dynamism, where the shapes of tensors in a model are computed at runtime, 
either from the shapes of inputs or from the values of inputs.

There are a number of efforts underway in TVM to better support dynamic models, 
including the TensorFlow importer and the Relay VM. In order to align the 
various frontends, we'd like to find a unified approach to dynamism in the core 
Relay ops.

Two possible approaches are A0, which merges dynamic and static ops, 
and A1, which separates them:

A0. Demonstrated by @lixiaoquan's work on Symbolic Reshape 
([https://github.com/apache/incubator-tvm/pull/5429](https://github.com/apache/incubator-tvm/pull/5429)),
 one approach is to make certain op attributes Optional, increase the 
argument count, and add logic to a number of passes to use either the static 
attribute, if it is defined, or the new dynamic input.
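
To make this concrete, here is a rough sketch of how a merged op might look at 
the Python frontend, in the spirit of that PR (the exact signature is 
illustrative, not a settled API):

```python
from tvm import relay

x = relay.var("x", shape=(2, 6), dtype="float32")

# Static case: the `newshape` attribute is set at compile time.
y_static = relay.reshape(x, newshape=(3, 4))

# Dynamic case: the attribute is left unset and the target shape arrives
# as an extra tensor input computed at runtime.
shape = relay.var("shape", shape=(2,), dtype="int64")
y_dynamic = relay.reshape(x, shape)
```

Passes then have to branch on whether the attribute or the input carries the 
shape.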

A1. Another approach would be to introduce a new dynamic namespace with a set 
of dynamic versions of ops, keep the two versions separate in passes, and 
eventually make the dynamic ops the default.
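
Under A1, the same two cases would route to distinct ops, e.g. (the 
`relay.dyn` namespace below is the proposed naming, not an existing one):

```python
from tvm import relay

x = relay.var("x", shape=(2, 6), dtype="float32")
shape = relay.var("shape", shape=(2,), dtype="int64")

# Static and dynamic versions are separate ops with a clear boundary:
y_static = relay.reshape(x, newshape=(3, 4))  # shape fixed as an attribute
y_dynamic = relay.dyn.reshape(x, shape)       # shape supplied as an input tensor
```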

This RFC seeks to spawn discussion on the various advantages and disadvantages 
of these two approaches.

As a starting point, we see:
A0.

Pros:

- Operators with the same semantics live in one place, which avoids potential 
fragmentation.

Cons:

- Definition, understanding, and optimization of those operators become more 
complicated, and a number of passes may need rework to respect the potential 
dynamism.

A1.

Pros:

- There is a clear boundary between dynamic and static ops.
- Passes are easier to reason about.

Cons:

- Operators can fragment over time.
- More changes to APIs.

Either approach will involve changes to frontend APIs and the Relay IR. To 
limit the impact on runtimes, we'd like to propose two features around dynamic 
shapes:

1) A compile-time check to ensure we only run fully static models with the 
graph runtime. This will help prevent opaque memory-allocation errors in the 
graph runtime.
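
A minimal sketch of what that check could look like, assuming dynamic 
dimensions appear as `Any` after type inference (the function name and error 
handling here are illustrative, not an existing TVM API):

```python
import tvm
from tvm import relay

def check_fully_static(mod):
    """Reject modules that contain a tensor with a dynamic (Any) dimension."""
    mod = relay.transform.InferType()(mod)

    def visit(node):
        # Only expressions with an inferred tensor type are interesting.
        if isinstance(node, (relay.Var, relay.Call)):
            ttype = node.checked_type
            if isinstance(ttype, relay.TensorType):
                for dim in ttype.shape:
                    if isinstance(dim, tvm.tir.Any):
                        raise ValueError(
                            "graph runtime requires static shapes, found %s" % ttype)

    relay.analysis.post_order_visit(mod["main"], visit)
```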

2) A pass that can convert dynamic ops to static ops via a mixture of constant 
folding and rules that replace certain outputs with constants. Many models that 
use dynamic ops may actually be static, such as a model that calculates the 
shape of a statically-shaped tensor and then uses that calculated shape to 
perform a dynamic reshape. This pass would allow importers of dynamic 
frameworks, like ONNX and TensorFlow, to simply export dynamic graphs while 
still getting the performance benefits of static Relay models with the graph 
runtime.
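
For example, the following graph is dynamic in form but static in fact; the 
proposed pass (the pass name in the comment is hypothetical) could fold it back 
into a purely static program:

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(2, 6), dtype="float32")
# shape_of produces a tensor, so this reshape looks dynamic to the compiler
# even though the shape of x is fully known at compile time.
y = relay.reshape(x, relay.shape_of(x))
mod = tvm.IRModule.from_expr(relay.Function([x], y))

# Hypothetical pass: fold shape_of(x) to the constant (2, 6) and rewrite
# the dynamic reshape into its static, attribute-based form.
# mod = relay.transform.DynamicToStatic()(mod)
```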

Performance and optimization are important considerations for dynamic shapes, 
but they are mostly outside the scope of this RFC. Most kernel-tuning and 
compilation methods we have in TVM assume static input shapes. As we move 
toward more and more dynamic operations and models, the question of how we 
generate efficient code for multiple input shapes will become more pressing, so 
thoughts on that are appreciated.

@tqchen @jroesch @jwfromm @yongwww @haichen @kevinthesun




