Thanks for the good discussion. I actually wasn't referring particularly to our conversations in github with respect of the refactors, but it's nice from you to bring them up. And it's ok to disagree in small things, hopefully we can align in the big ones.
I understand that for TVM you might have different constraints with how dynamic you want to be for mutating the graph and doing quick experimentation and research but please try to understand my perspectives coming from a software engineering background and helping maintain MXNet for thousands of users and teams using it in production. Let's also consider how many issues we have open and our bandwidth to deal with additional complexity. To your pros and cons I would like to add and emphasize that currently the heavy use of dynamic attributes in the graph using dmlc::any has two very negative consequences, at least for MXNet: 1 - Makes the data structures using dmlc::any almost impossible to debug, as they are just binary. 2 - Makes the code more difficult to understand because there's no declaration in a data structure of the data fields it uses and its responsibilities. We are basically shoving all kinds of stuff using dmlc::any. 3 - You get no help from the IDE to navigate and refactor as another consequence. I would really like you to give me solutions to these problems or at least acknowledge them and tell me why do we have to pay those tradeoffs instead of just dismissing them as engineering taste. The more I work with MXNet the more I wish debugging was easier, and reading and refactoring the code, and those fields would be declared and typed in their corresponding data structures, for MXNet I don't think this would affect anything in regards the python bindings since they go through the typed C API anyway. Maybe we can get some inspiration from LLVM as they have bindings for many languages to work with the AST and have very clean APIs for the compilation steps. It's also OK to have an initial dynamic codebase for research and experimentation and then "cure" them into a solid maintainable one with more types and more robust... Pedro. On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <[email protected]> wrote: > > A good infrastructure design has a long way to go and has a profound impact > on the project itself. That is why we always want to rethink if the interface > can be better done, and think about the next possible infrastructure to make > things better, Refactoring is certainly part of it. > > There are usually two types of refactoring we refers to : > 1) The major design change, in terms of class relations, data structures > (e.g. numpy support, adding compilation to new hardware) > 2) The specific choice of API, programming style(more types or type-erased > program) > > (1) affects the long term support of the project, introduces new features if > necessary and need a lot of thoughts into that. I believe the general IR, > compilation and numpy support belongs to that category. > > I would particularly like to talk about (2). > Because there is no unified correct answer in software engineering, different > developers may prefer different views on a certain problem. > Some of them have things to do with the taste developers. The change could > favor certain aspect of the project, but not necessarily another part. > Refactoring wrt these sometimes does require a more thoughtful conversation > and make a reasonable compromise. > > For example, we have a recent discussion about whether to introduce more > typing into the code base, to the extent that the base data structure could > be templatized. > - The Pros of this approach > - It introduces more typing and compile-time error message(instead of > runtime checking), could help developers to find problem earlier. > - The Cons of the approach: > - Having a template in the base data structure causes ABI problem(which > code generated by DLL A vs DLL B) and will have potential future issues. > - Template sometimes confuses some developers. > - For serialization, it is hard to anticipate all kinds of classes and it > is easier to have one class(any) that handles polymorphism. > - Because of most frontends(python) are dynamic, it is easier to interface > them with a type-erased API. > > As we can see there are pros and cons of bringing in more typing to the > change, and there is no unified answer. > One good example of a nice infrastructure design trade-off is DLPack > https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h > This is a base data structure adopted by MXNet, Pytorch, Chainer, and many > other frameworks unanimously. > It is a type-erased data structure that erases the data type, and memory > allocator from the data structure and is designed to exchange tensor(coming > from different memory allocators) across DLL boundaries. > As you can see this is a good example of type-erased data structures. > > When we are having this kind of questions. It is important to have a good > conversation. Sometimes we have to make tradeoffs rather than bend > everyone-else to our will. This is what open source is about. > I would also like to give some examples of conversations and how design > decisions are resolved. It comes from the TVM community's recent discussion > about VM design. > I directly paste the github issue conversation here for the sake of > clarity(note that all the conversations are also mirrored to dev@tvm). > The background is that the community want to bring a virtual machine that can > execute dynamic operations more effectively. > > - The initial proposal, made by one of the committers gave a detailed design > based on Stack VM https://github.com/dmlc/tvm/issues/2810 > - As you can see that there are quite some discussions about whether we > want to use a different set of design, in this case, a register-based version. > - The conversation evolves, and while the community members disagree on > some cases, also agrees with each other on the particular tradeoffs. > - After some discussions, the committers bring a tradeoff design that tries > to consolidate the needs of both sides and this is the final solution being > adopted https://github.com/dmlc/tvm/issues/2915 > I would like to particularly highlight the fact that: 1) there are > disagreements in the development process. 2) developers work together to > understand each others' needs and then make consensus on a perhaps better > design. > > There are two other particular conversations between Pedro and myself, which > are during his contributions. > - https://github.com/dmlc/tvm/pull/3037 In this case, I raised the concern > about API consistency, and Pedro brings up a reason why he thinks it is a > better idea, I agreed and we merged the PR > - https://github.com/dmlc/tvm/pull/3108 In this other case, there are > technical reasons for going both sides for the case of MXNet, we have listed > pros/cons about both sides and have a constructive conversation. Eventually, > I decided to not merge the PR after weighing in all the cases. > > I believe both are useful conversations, and while Pedro and I disagree > sometimes, we do agree on many other cases. The most crucial part is about > having a constructive conversation. > To summarize, I do refactoring and making things better is certainly > important to make the project better. And I do believe it is crucial to think > about all the technical consequences and make good tradeoff decisions. > Sometimes the decision may not make every developer mostly happy, but a good > technical compromise could move the project forward and help the community in > general. > > Tianqi > > On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <[email protected]> wrote: >> >> >> >> Am 31. Mai 2019 14:13:30 MESZ schrieb Pedro Larroy >> <[email protected]>: >> > I think Martin does a very good job explaining why >> >refactoring, >> >reducing developer frustration and internal improvement is a crucial >> >productivity multiplier which includes lower cost to ship features, >> >less >> >bugs and time spent debugging. >> >> There's one aspect that's special for open source projects: if a project >> wants to survive long term, it should make it easy for people to get started >> working on the project. In my experience, refactoring and cleanup play an >> important role in that. So thanks also for making recruiting of new >> contributers better. >> >> Isabel >> -- >> This message was sent with K-9 from a mobile device with swipe to type >> enabled. I'm sorry for any embarrassing typos that slipped through.
