@Hzfengsy > Note that the reviews of each PR are brought to their own context, and we > anticipate grounded constructive actionable comments, such as adding test > coverage here/there. The module establishment part can be into subjective > discussions that are less grounded, and community empowerment becomes more > critical in such cases.
Taking a step back here: why do S0 modules need to live in `main`? All of our development actually happens on forks (i.e. you fork the repo on GitHub)--the question is: how much development goes on outside of `main` before we merge? Forks could technically live for years outside of `main` (as Relax has). So what are the benefits of merging: B0. Joins development of the S0 module to TVM's S1 and S2 modules, distributing the need to rebase and handle merge conflicts throughout the TVM community. B1. Solicits a wider audience to interact with the module by publicizing it as part of TVM's tree. B2. Fosters further development of the module inside of TVM's core forums. This also helps some community members to better engage by potentially removing hurdles (i.e. engineers at industry companies may be restricted by company policy to engaging with only specified OSS communities). B3. Grants access to TVM's CI resources and runs tests in the mainline regression suite. B4. Permits development of tighter integrations with the remainder of the project by removing merge conflicts. There are also some drawbacks: D0. Increases the load on TVM's CI. D1. Increases the scope of TVM's codebase, potentially making it harder to maintain. For example, changes to core components such as the test infrastructure, the runtime, the object system may require sweeping knowledge of the project. As the number of components in the project increases, the number of engineers with that knowledge decreases relative to the size of the codebase. If that number drops to 0, the codebase will rot. D2. Could contribute to effectively splitting the community by creating multiple arenas for development. While there are benefits to this in terms of exploring new techniques, allowing this as a permanent state means that effectively TVM is many projects with sub-communities rather than one. This may be a reasonable thing to strive for, but for D1 reason it is not clear that those projects should be in a monorepo. A theme across these is that some aspects are community-related (i.e. publicizing a flow, consolidating discussions, allowing us to explore multiple avenues to solving a problem) and some of these are essentially technical/logistical problems (i.e. maintaining a CI, maintaining the core infrastructure, etc). Weighing the community aspects of these benefits and drawbacks, I believe that the benefits are worth the drawbacks and I am in favor of allowing new contributions and approaches in TVM. However, I feel we are still not really addressing the technical side of things: there is no such thing as a truly S0 module, because we state it is > No disruptive change to the rest of the codebase The problem is that because we are in a monorepo, there are technical/logistical aspects to integrating a module where "a failure in one module is a failure for all of TVM," the most obvious example being CI: - A flaky test is checked in; someone is trying to make a change to an S1 module, but the CI fails sporadically due to the flaky test to impede progress. - Someone is trying to improve the way we shard tests with pytest; in the S1 modules, we have e.g. properly parameterized the tests, but in some S0 module, the parameterization was not properly followed. Now that engineer needs to go and learn about how the S0 module works, fix its tests, and find a reviewer for those improvements before they can proceed. - Someone adds a new CI image to test the S1 modules on a new hardware backend. The S0 module is excluded from that backend. Now a user comes along and tries to use the S0 module with the backend, and files a bug. Can the owner of the hardware backend truly claim they "support TVM" if they don't fix the S0 module bug? What mechanism have we put in place to ensure that users approaching TVM understand that they are using an S0 module? - Someone is modifying a core TVM API; an S0 module uses it in a novel way. Now the modifier has to go and understand how the S0 module works so that they can modify the API in a way that doesn't break it. While i agree with you that > The module establishment part can be into subjective discussions that are > less grounded, and community empowerment becomes more critical in such cases. I am concerned that the situations I elaborated above could similarly lead to people leaving the community if it becomes too fragmented. I also feel a bit like the perspective of maintaining the core APIs and test infrastructure is not adequately represented in the discourse so far, and may not be represented in the proposed voting mechanism. There is a lot of detail here about the establishment of an S0 module, down to even introducing a new voting process. When it comes to transitions between those states, there is really not much detail: > - S0 -> deprecation: When a S0 module no longer has an active set of > maintainers, the module will be deprecated. The removal of the module is easy > as they are contained in the respective folders, with no modules that come > and depend on them. > - S0 -> S1: When developers propose to incorporate a S0 module broadly into > existing flows/components. Each such composition would require its own RFC > and following the existing RFC process. For instance, suppose that yes there is a maintainer for an S0 module, but tests keep breaking or they are relatively unresponsive, dragging down any of the core maintenance tasks above? What is the bar after which it's reasonable for a core maintainer to propose deprecation? Proposing deprecation is a lot more unwelcoming than requesting clarifications to new code or saying "no for now," and is a lot more challenging because it might involve finding a new home for code that is already being used by some members of the TVM community, but for which we are finding it hard to keep maintained. I am not trying to be unwelcoming to new code here. But these considerations can often be caught at the RFC level, when questions about test strategy, modularization, and documentation can be discussed at a level that might be accessible to a broader section of the community. Developing the same level of understanding through reading PRs both takes considerably more time and energy than reading RFCs, and may come at a time when a module is half-landed. For this reason, it is not clear to me why we should relax the voting consensus requirements on S0 proposals only to land in a half-baked state that may deadlock on a committer. I would summarize my concern here in two areas: 1. There isn't a clear boundary in TVM between what is "S1" and what is "S0" right now. It may not be clear to core maintainers where their responsibilities end and the S0 maintainers' responsibilities begin. Modularizing TVM could help here, but we haven't done that yet. 2. Already today, "supported by TVM" is quite ambiguous--which flow, on which hardware, with which model? S1 modules here are simply those that are "not S0." How should a user know what these are? Allowing S0 modules is good, but there is currently no distinction in how they're represented to users. I'm concerned that this proposal could use more detail given that we are a monorepo with the problems I mentioned above. I don't know these concerns necessarily need to get addressed in this specific proposal (e.g. more clearly delineating the S1 modules), but I also don't feel that this proposal has adequate detail right now given the state of the project. -- Reply to this email directly or view it on GitHub: https://github.com/apache/tvm-rfcs/pull/95#issuecomment-1286821762 You are receiving this because you are subscribed to this thread. Message ID: <apache/tvm-rfcs/pull/95/c1286821...@github.com>