I'm a graduate researcher at UW and have worked as a full-time SDE at AWS AI for years, mostly around deep learning frameworks and libraries. I think we all agree that dynamic shapes are essential, so I won't spend more time emphasizing how important they are. I'm not a contributor to Relax, but I have been following it for a long time. I won't pretend to be neutral: I do think it is necessary to welcome Relax rather than just adding dynamic shape support to Relay.
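To make that contrast concrete, here is a minimal sketch (my own illustration, not taken from the RFC; the variable names and the shapes are made up) of how Relay expresses an unknown dimension today. The Relay part uses only the existing `relay.Any()` API; the Relax behavior is described in comments rather than code, because its surface syntax is still evolving.

```python
# Relay models an unknown dimension with Any(), which erases the relation
# between dimensions: every "?" is independent of every other "?".
import tvm
from tvm import relay

n = relay.Any()  # an unknown dimension, unrelated to anything else
x = relay.var("x", shape=(n, 128), dtype="float32")
w = relay.var("w", shape=(64, 128), dtype="float32")
y = relay.nn.dense(x, w)  # inferred type: Tensor[(?, 64), float32]

mod = tvm.IRModule.from_expr(relay.Function([x, w], y))
mod = relay.transform.InferType()(mod)
print(mod)  # the batch dimension prints as "?", so a later reshape to
            # (n * 64,) cannot be proven shape-correct at compile time.

# Relax, by contrast, keeps `n` as a first-class symbolic variable that flows
# through operators, so relations such as (n, 128) x (128, 64) -> (n, 64), or
# a reshape to (n * 64,), stay visible to later passes and to memory planning.
```

That difference in what the IR can see is, to me, why bolting more dynamic shape support onto Relay keeps hitting a ceiling.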
The main controversy in this thread is whether to upgrade Relay incrementally or develop a new IR, Relax. I understand that hardware companies appreciate stability, and CUDA is the canonical example: its interface has barely changed over the years, what a miracle! There must have been several attempts to develop new languages/compilers for NVIDIA GPUs, yet CUDA survived. The lesson is: design with a vision of the future in mind, then maintain to a high standard, improve incrementally, and stay customer-obsessed. That is the ideal story, but we should not ignore that although CUDA was invented before the DL era, there were already many high-performance computing workloads its designers could refer to. Fortunately, even in 2022, the operators used in DL still align closely with HPC ones and are actually simpler (it's a world of GEMM).

What about (computational) graph-level IRs? The dominant workloads in DL change over time, and they have caused a lot of headaches for framework and compiler designers: first CNNs/RNNs/LSTMs/Tree-LSTMs (structural dynamism is one of the challenges Relay set out to tackle, but unfortunately Tree-LSTMs ended up used nowhere), then Transformers/GNNs (not as hot as Transformers because of the hardware lottery, but who knows the future). Now we have entered a time when models converge but scale grows significantly: models become larger and larger, and engineers and researchers keep proposing compile-time optimizations for DL workloads (checkpointing and rematerialization, quantization, graph substitution, fusion and stitching, sparsification and mixture-of-experts, hybrid parallelism). I'm glad to see that many of them are built on TVM, because TVM's design stays up-to-date and supports new workloads quickly. However, Relay's current design cannot take full advantage of these new techniques, and the system shows signs of becoming fragile. Relax is a great opportunity for us to reconsider the graph-level IR design: prune the redundancies and add new functionality. It's exciting that we can unify different levels of optimization in [TVM Unity](https://github.com/apache/tvm-rfcs/pull/91) once Relax is accepted by the community. Refactoring makes things simpler, not more complex.

Whenever we found it was time to make a change, TVM has embraced new designs. This has happened several times in TVM's history. Prior to Relay there was NNVM, which was deprecated and completely replaced by Relay. Tensor Expression has limited expressiveness, and its schedule-tree data structure cannot support tensorization elegantly, so we built TensorIR, which is not only backward compatible but also opens opportunities for new dialects (Ruihang and I designed SparseTIR on top of it, and it works pretty well). AutoTVM cannot generate scheduling templates automatically, so we built Ansor and MetaSchedule. I would emphasize that **the most important parts of all these updates were upstreamed within several months**, without breaking backward compatibility, and the credit goes to our hard-working and open-minded contributors and reviewers. Committing to TVM helped these contributors become MLC experts; some of them are PMC members now. None of these refactors hurt TVM's reputation. On the contrary, they impressed people with how quickly TVM adapts to the future, and people are more willing to try TVM because it is open and driven by innovation.
I really don't understand what is different this time with Relax. We have a bigger community, which is awesome, and I definitely welcome your input and constructive suggestions on the future of this project. I view the [New Scoped Module RFC](https://discuss.tvm.apache.org/t/process-rfc-empowering-new-scoped-module-to-the-project/13617) as a contract between industrial developers and researchers/engineers like me who work on "toy prototypes": we promise not to touch anything that might affect the user experience, and in return we are not discouraged when our prototypes cannot be upstreamed and only live in some random GitHub repo as toys. I also think the new S0-S1-S2 process is already the most painless approach to delivering new designs, and its effect is equivalent to *incremental change*. If people take a look at the Relax repo, it already has a huge amount of code and well-written documentation (you can compare it with the official Relay documentation). I think it would be inappropriate to ignore these contributors' devotion, especially individual contributors such as @LeshengJin. TVM has a huge user base of researchers; they are an important part of the community, and they contribute high-quality code rather than just hacking.

Regarding the "lower standard than other communities" concern: TVM has high standards, and standards are not what is at issue here. If no fundamental changes were allowed in DL infrastructure, Google would have stayed at TF 1.0 and never developed JAX, and PyTorch would not have created so many different compiler infrastructures (I want to share [this slide](https://chips-compilers-mlsys-22.github.io/assets/slides/PyTorch%20Compilers%20(Compiler%20&%20Chips%20Symposium%202022).pdf) again).

It's 5 am in my timezone, I should get some sleep, and I'm still recovering from my recent illness. Opinions are my own and I don't speak for any group/organization.

Best,
Zihao