To put a recent specific example and focus the discussion (there are many as there are attributes), the shapes in the graph are a vector of Shape set as an attribute using dmlc::any so this makes it very difficult to debug the shapes when you have a graph object. I would have it as a typed attribute to the graph, as we always need the vector of shapes and operate on it while doing shape inference.
On Tue, Jun 11, 2019 at 10:18 AM Pedro Larroy <[email protected]> wrote: > > Thanks for the good discussion. > > I actually wasn't referring particularly to our conversations in > github with respect of the refactors, but it's nice from you to bring > them up. And it's ok to disagree in small things, hopefully we can > align in the big ones. > > I understand that for TVM you might have different constraints with > how dynamic you want to be for mutating the graph and doing quick > experimentation and research but please try to understand my > perspectives coming from a software engineering background and helping > maintain MXNet for thousands of users and teams using it in > production. Let's also consider how many issues we have open and our > bandwidth to deal with additional complexity. > > To your pros and cons I would like to add and emphasize that currently > the heavy use of dynamic attributes in the graph using dmlc::any has > two very negative consequences, at least for MXNet: > > 1 - Makes the data structures using dmlc::any almost impossible to > debug, as they are just binary. > 2 - Makes the code more difficult to understand because there's no > declaration in a data structure of the data fields it uses and its > responsibilities. We are basically shoving all kinds of stuff using > dmlc::any. > 3 - You get no help from the IDE to navigate and refactor as another > consequence. > > I would really like you to give me solutions to these problems or at > least acknowledge them and tell me why do we have to pay those > tradeoffs instead of just dismissing them as engineering taste. > > The more I work with MXNet the more I wish debugging was easier, and > reading and refactoring the code, and those fields would be declared > and typed in their corresponding data structures, for MXNet I don't > think this would affect anything in regards the python bindings since > they go through the typed C API anyway. > > Maybe we can get some inspiration from LLVM as they have bindings for > many languages to work with the AST and have very clean APIs for the > compilation steps. It's also OK to have an initial dynamic codebase > for research and experimentation and then "cure" them into a solid > maintainable one with more types and more robust... > > Pedro. > > > > > > On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <[email protected]> wrote: > > > > A good infrastructure design has a long way to go and has a profound impact > > on the project itself. That is why we always want to rethink if the > > interface can be better done, and think about the next possible > > infrastructure to make things better, Refactoring is certainly part of it. > > > > There are usually two types of refactoring we refers to : > > 1) The major design change, in terms of class relations, data structures > > (e.g. numpy support, adding compilation to new hardware) > > 2) The specific choice of API, programming style(more types or type-erased > > program) > > > > (1) affects the long term support of the project, introduces new features > > if necessary and need a lot of thoughts into that. I believe the general > > IR, compilation and numpy support belongs to that category. > > > > I would particularly like to talk about (2). > > Because there is no unified correct answer in software engineering, > > different developers may prefer different views on a certain problem. > > Some of them have things to do with the taste developers. The change could > > favor certain aspect of the project, but not necessarily another part. > > Refactoring wrt these sometimes does require a more thoughtful conversation > > and make a reasonable compromise. > > > > For example, we have a recent discussion about whether to introduce more > > typing into the code base, to the extent that the base data structure could > > be templatized. > > - The Pros of this approach > > - It introduces more typing and compile-time error message(instead of > > runtime checking), could help developers to find problem earlier. > > - The Cons of the approach: > > - Having a template in the base data structure causes ABI problem(which > > code generated by DLL A vs DLL B) and will have potential future issues. > > - Template sometimes confuses some developers. > > - For serialization, it is hard to anticipate all kinds of classes and > > it is easier to have one class(any) that handles polymorphism. > > - Because of most frontends(python) are dynamic, it is easier to > > interface them with a type-erased API. > > > > As we can see there are pros and cons of bringing in more typing to the > > change, and there is no unified answer. > > One good example of a nice infrastructure design trade-off is DLPack > > https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h > > This is a base data structure adopted by MXNet, Pytorch, Chainer, and many > > other frameworks unanimously. > > It is a type-erased data structure that erases the data type, and memory > > allocator from the data structure and is designed to exchange tensor(coming > > from different memory allocators) across DLL boundaries. > > As you can see this is a good example of type-erased data structures. > > > > When we are having this kind of questions. It is important to have a good > > conversation. Sometimes we have to make tradeoffs rather than bend > > everyone-else to our will. This is what open source is about. > > I would also like to give some examples of conversations and how design > > decisions are resolved. It comes from the TVM community's recent discussion > > about VM design. > > I directly paste the github issue conversation here for the sake of > > clarity(note that all the conversations are also mirrored to dev@tvm). > > The background is that the community want to bring a virtual machine that > > can execute dynamic operations more effectively. > > > > - The initial proposal, made by one of the committers gave a detailed > > design based on Stack VM https://github.com/dmlc/tvm/issues/2810 > > - As you can see that there are quite some discussions about whether we > > want to use a different set of design, in this case, a register-based > > version. > > - The conversation evolves, and while the community members disagree on > > some cases, also agrees with each other on the particular tradeoffs. > > - After some discussions, the committers bring a tradeoff design that tries > > to consolidate the needs of both sides and this is the final solution being > > adopted https://github.com/dmlc/tvm/issues/2915 > > I would like to particularly highlight the fact that: 1) there are > > disagreements in the development process. 2) developers work together to > > understand each others' needs and then make consensus on a perhaps better > > design. > > > > There are two other particular conversations between Pedro and myself, > > which are during his contributions. > > - https://github.com/dmlc/tvm/pull/3037 In this case, I raised the concern > > about API consistency, and Pedro brings up a reason why he thinks it is a > > better idea, I agreed and we merged the PR > > - https://github.com/dmlc/tvm/pull/3108 In this other case, there are > > technical reasons for going both sides for the case of MXNet, we have > > listed pros/cons about both sides and have a constructive conversation. > > Eventually, I decided to not merge the PR after weighing in all the cases. > > > > I believe both are useful conversations, and while Pedro and I disagree > > sometimes, we do agree on many other cases. The most crucial part is about > > having a constructive conversation. > > To summarize, I do refactoring and making things better is certainly > > important to make the project better. And I do believe it is crucial to > > think about all the technical consequences and make good tradeoff decisions. > > Sometimes the decision may not make every developer mostly happy, but a > > good technical compromise could move the project forward and help the > > community in general. > > > > Tianqi > > > > On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <[email protected]> > > wrote: > >> > >> > >> > >> Am 31. Mai 2019 14:13:30 MESZ schrieb Pedro Larroy > >> <[email protected]>: > >> > I think Martin does a very good job explaining why > >> >refactoring, > >> >reducing developer frustration and internal improvement is a crucial > >> >productivity multiplier which includes lower cost to ship features, > >> >less > >> >bugs and time spent debugging. > >> > >> There's one aspect that's special for open source projects: if a project > >> wants to survive long term, it should make it easy for people to get > >> started working on the project. In my experience, refactoring and cleanup > >> play an important role in that. So thanks also for making recruiting of > >> new contributers better. > >> > >> Isabel > >> -- > >> This message was sent with K-9 from a mobile device with swipe to type > >> enabled. I'm sorry for any embarrassing typos that slipped through.
