Thank you, everyone, for the discussions here. Let us take a step back and look 
at the non-technical parts of the conversation. A lot of our discussions come 
from two goals:

G0: Maintain a stable, evolving solution for some of our common use cases.
G1: Welcome new improvements, land our technical commitments in a timely 
manner, continue to reinvent ourselves, and welcome new community members who 
have new use cases.

Both goals are very important. G0 ties to our ability to continuously support 
our current use cases. G1 is also essential to our viability as a solution, so 
we can grow as a community and stay competitive in a fast-evolving machine 
learning compilation landscape.

Enabling both has always been an important theme of long-lived projects. Deep 
learning frameworks are a common point of reference. In those projects, such 
evolution usually happens in roughly three stages:
S0: Introduction of a new feature/component as an optional module.
S1: Evolving the overall solutions to make use of the new component.
S2: Considering deprecation of some of the existing solutions, or evolving the 
solutions toward a consolidation point.

Each stage carries a different level of commitment and would normally entail a 
different level of gating criteria.

For example, PyTorch introduced TorchFX as an optional module that supports 
graph tracing and export. It had some overlapping capabilities with 
TorchScript. The PyTorch community is collectively evolving some of its 
compilation efforts (TorchDynamo) to make use of FX. As of now, the community 
has not yet announced S2.

Encouraging S0 and making it easy to do helps us enable G1. Too high a barrier 
here can discourage community contributions and result in mainline lacking the 
latest features and falling behind our competition. This is especially 
important given that the landscape of machine learning compilation remains 
open, and the ability to support symbolic shape and training in a timely manner 
helps bring in users and contributors who would otherwise turn to alternatives.

G0 is equally important here. In many cases, it boils down to making careful 
and informed decisions regarding evolution (S1 and S2). It also means making 
sure that, at the S0 stage, disruptive changes to the existing infrastructure 
are limited. Importantly, not every module/feature has to go through all 
stages, and in common practice, the decisions in each stage are usually not 
made at the same time.

We can find examples of S0 cases in TVM as well. For example, USMP was designed 
for specific cases like AOT. We welcomed these improvements to unblock needs in 
embedded settings early. Through USMP we found the need for tir.alloc_const, 
which relates to evolving the existing infra (S1). As a result, we had a more 
in-depth discussion. Additionally, we are working to further enable USMP in a 
broader setting as part of S1. At some point, we might consider consolidating 
all memory allocations as S2. Note that many community members are collectively 
working toward that goal, but we are not yet at a point to make such a 
decision. As another example, we enabled cascaders that are specifically 
designed for micro-NPU. They had some domain overlap with the arithmetic affine 
module, but were nevertheless brought in without consolidation because we 
believed that there was enough interest and maintenance support for the module. 
Finally, the unpacked_api was specifically enabled for extremely low-resource 
settings, and we enabled S0-level inclusion despite some inconsistency with the 
packed func API.

Of course, we do not want to enable arbitrary things in the codebase, which 
ties back to the maintenance overhead concern. One of the questions we want to 
ask here is whether the module has enough support from the community to allow 
continued maintenance. Additionally, we should consider the added engineering 
support we gain by welcoming additional community members who are interested in 
these needs and would otherwise look elsewhere.

Our overall thought process and the point in time of the decision can be 
different for each stage, and they should be, so that we can enable both G0 and 
G1. Nor do all modules have to go through all the stages.

For S0, we would ask whether there are enough champions in the community with a 
self-contained plan. For important features, we would expect, say, more than 
three committers who can champion the module and significant community support 
to maintain it. Additionally, S0 should be made as minimally disruptive (with 
respect to the current infrastructure) as possible. To encourage G1, we can 
overlook some level of duplication (as in the case of TorchFX and TorchScript, 
or of USMP and other allocators, when they land as S0), considering the 
additional community support we get to maintain them.

S1 and S2 would involve more careful discussion and coordination, with greater 
detail on some of the key points. They will likely also happen at a different 
point in time so that we can make informed decisions.

This particular RFC is at the S0 stage, and intentionally so. As the RFC 
states, there is no proposal to make S1/S2 decisions in this RFC. Many of our 
current discussions are around S1/S2, that is, the future evolution of the 
system. These discussions are extremely helpful for setting up the context and 
improving the design, but they are not necessarily decisions we have to make 
immediately. Let us think about the broader community members we can empower 
and bring in through enabling the S0 improvement.

Thank you, everyone, for the discussions so far, and let us work together to 
enable our community.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/89#issuecomment-1224114184