Thank you @leandron @ekalda for the questions, and @zhiics, @slyubomirsky,
@Hzfengsy, @sunggg for the discussion!
As a long-term contributor since 2018, the pre-Relay era, and as the initiator
and one of the top two contributors of RAF
([https://github.com/awslabs/raf/](https://github.com/awslabs/raf/)), a
TVM-based training framework, I would love to share my perspective and a slight
concern about TVM development at this moment, in 2022.
While TVM is a decent auto-tuner for static-shape workloads, and the latest work
on auto-tensorization has further boosted its performance with microkernel
tuning, there has been strong demand from the community to allow TVM to do
more, which, as @YuchenJin listed, includes:
- Unified abstraction
- Dynamic shape support
- Dataflow block and first-class side effect handling
- Non-inference workloads
As a community, we do encourage everyone to understand different perspectives
and empower each other, and I believe this is the way for us to grow.
On the technical side, I just want to address a meta question here: why is it
less feasible to gradually upgrade Relay?
- Conflicting design philosophy: Relax follows a completely different design
than Relay with mutually conflicting assumptions and ideas. For example, having
two conflicting shape mechanisms in the system would effectively mean passes
have to handle both of them.
- Engineering challenge: the design differences create hurdles for incremental
updates. For example, if we want to move away from the assumption that the IR
is side effect-free, every pass written under the old assumption becomes
invalid or wrong, because that assumption no longer holds.
- Stability concern: Even if we make surgical, incremental enhancements to Relay
by introducing breaking changes piece by piece, there is still a stability
concern. Consider downstream vendors whose forks depend on upstream Relay: if
Relay’s assumptions break over time, maintaining those forks becomes less
stable for them.
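To make the engineering-challenge point concrete, here is a minimal sketch on a toy IR. It is not TVM's actual Relay or Relax data structures: `Binding`, `EFFECTFUL_OPS`, and both pass functions are hypothetical names chosen for illustration. It shows how a dead-code elimination pass that assumes purity silently becomes wrong once an operator may have side effects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    var: str     # name bound by this statement
    op: str      # operator name, e.g. "add" or "print"
    args: tuple  # names of the variables the operator reads


# Hypothetical set of operators with observable side effects.
EFFECTFUL_OPS = {"print", "store"}


def dce_assuming_pure(bindings, outputs):
    """Dead-code elimination that is only valid if every op is pure."""
    live = set(outputs)
    kept = []
    for b in reversed(bindings):
        if b.var in live:
            kept.append(b)
            live.update(b.args)
    return list(reversed(kept))


def dce_effect_aware(bindings, outputs):
    """Same pass, but effectful bindings are kept even if unused."""
    live = set(outputs)
    kept = []
    for b in reversed(bindings):
        if b.var in live or b.op in EFFECTFUL_OPS:
            kept.append(b)
            live.update(b.args)
    return list(reversed(kept))


program = [
    Binding("a", "add", ("x", "y")),
    Binding("_", "print", ("a",)),    # result unused, but has a side effect
    Binding("b", "mul", ("x", "y")),
    Binding("d", "sub", ("x", "y")),  # genuinely dead
    Binding("c", "add", ("b", "x")),
]

pure_result = [b.var for b in dce_assuming_pure(program, ["c"])]
aware_result = [b.var for b in dce_effect_aware(program, ["c"])]
print(pure_result)   # the "print" binding and its input "a" are dropped
print(aware_result)  # the "print" binding and "a" survive; "d" is still removed
```

In this sketch the fix is one extra condition, but every existing pass would need the same audit, which is the cost the bullet above describes.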
Alternatively, we believe having Relax as a separate code path is a cleaner and
more maintainable approach: gradually bringing over some of the passes from the
bottom is incremental from an engineering standpoint and guarantees that the
Relay code path always stays stable.
--
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/89#issuecomment-1222788202