Re: Does internal quality matters to users?

Tianqi Chen Tue, 11 Jun 2019 14:41:08 -0700

We thought very carefully when introducing type erasure, including
considering the concerns you raised, and nevertheless made the decision
that resulted in the current design, which strikes a balance between
type erasure and typing.
The original intention of the current design is to strike a balance between
the need for typing and the dynamism.
Type erasure brings the benefit of pluggable attributes (you don't have to
enumerate them beforehand) and, as a result, the pluggable operator and pass
optimization system.
The fact that it is map<str, any>, where any can be vector<T>, reflects the
need to use typed data structures such as vector<T> when necessary, while
only having to fetch them once.
Note that we did not go as far as vector<any>, so that most of the
processing can be typed.
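
As a rough sketch of the "fetch once, then stay typed" idea (not the actual
NNVM/MXNet code: std::any stands in for dmlc::any, and Graph, Shape, GetAttr
and TotalElements are hypothetical names used only for illustration):

```cpp
#include <any>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for the real graph types.
using Shape = std::vector<int64_t>;

struct Graph {
  // Type-erased attribute table: passes can attach new attributes
  // without this struct having to enumerate them beforehand.
  std::unordered_map<std::string, std::any> attrs;
};

// The single dynamic step: look the attribute up and cast it once.
// Throws std::out_of_range or std::bad_any_cast on a bad name or type.
template <typename T>
const T& GetAttr(const Graph& g, const std::string& name) {
  return std::any_cast<const T&>(g.attrs.at(name));
}

// Everything after the fetch works on a statically typed vector<Shape>.
int64_t TotalElements(const Graph& g) {
  const auto& shapes = GetAttr<std::vector<Shape>>(g, "shape");
  int64_t total = 0;
  for (const Shape& s : shapes) {
    int64_t n = 1;
    for (int64_t d : s) n *= d;
    total += n;
  }
  return total;
}

// Usage sketch: attach a shape attribute, then run the typed computation.
int64_t ExampleTotal() {
  Graph g;
  g.attrs["shape"] = std::vector<Shape>{{2, 3}, {4}};
  assert(g.attrs.count("shape") == 1);
  return TotalElements(g);
}
```

Only GetAttr touches the erased type; with vector<any> every element access
would need its own cast.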


The advantage of map<str, any> is that it enables unified processing of the
designs. You can find the same design in cases like DataFrame.

To inspect a dynamic type, you can likely create an auxiliary function to
do so, and many of them are still one-liners. Or just use
```DLOG(INFO) << PrintInfo(graph, fields)```
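
One possible shape for such a helper, as a sketch only: PrintInfo is not an
existing function here, its signature is an assumption (it takes the
attribute map rather than the whole graph), and std::any again stands in
for dmlc::any.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <typeinfo>
#include <unordered_map>
#include <vector>

using Shape = std::vector<int64_t>;
using AttrMap = std::unordered_map<std::string, std::any>;

// Hypothetical helper: render the requested attributes as text so they
// can be logged or called by hand from a debugger.
std::string PrintInfo(const AttrMap& attrs,
                      const std::vector<std::string>& fields) {
  std::ostringstream os;
  for (const std::string& name : fields) {
    os << name << ": ";
    auto it = attrs.find(name);
    if (it == attrs.end()) {
      os << "<missing>";
    } else if (const auto* shapes =
                   std::any_cast<std::vector<Shape>>(&it->second)) {
      for (const Shape& s : *shapes) {
        os << "(";
        for (std::size_t i = 0; i < s.size(); ++i) {
          os << (i ? "," : "") << s[i];
        }
        os << ")";
      }
    } else if (const auto* text = std::any_cast<std::string>(&it->second)) {
      os << *text;
    } else {
      // Unknown type: fall back to the implementation-defined type name.
      os << "<" << it->second.type().name() << ">";
    }
    os << "\n";
  }
  return os.str();
}

// Usage sketch: the returned string is what one would stream into a log.
std::string ExampleDump() {
  AttrMap attrs;
  attrs["shape"] = std::vector<Shape>{{2, 3}, {4}};
  attrs["name"] = std::string("net");
  std::string out = PrintInfo(attrs, {"shape", "name", "dtype"});
  assert(!out.empty());
  return out;
}
```

Each supported type costs one extra branch, which is the price of keeping
the attribute table open-ended.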

On the other hand, having a separate attribute field means we need separate
logic for serialization and manipulation of these data structures.
The general benefit of type erasure is to gain dynamism and the possibility
of backend registration.
While I totally understand where you are coming from, we do need to look
again at both sides of the spectrum. Data structure design, like any
infra design, is about trade-offs.

To sum up, I certainly agree some level of strong typing can be helpful and
we did do that in our codebase.
I do think it is wrong to dismiss dynamic types as simply bad and "not for
production".

Dynamic types, when used properly, can simplify the code assumptions and make
the system more extensible/pluggable and interoperable with frontends.
All these features are important and necessary for a successful deep
learning system.
Admittedly they might make life a bit harder (e.g. not being able to
auto-complete using your IDE), but again this is a tradeoff we need to make.

I would recommend taking another look at DLPack
https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h
as a good example of how such a balance can be achieved.
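
To illustrate the kind of balance meant here, a deliberately simplified
sketch follows. It is NOT the DLPack definition (see the header linked above
for that); TensorView, TypeCode and the helper functions are hypothetical
names. The element type is erased behind a pointer and a runtime tag, while
the shape stays statically typed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical runtime tag describing the erased element type.
enum class TypeCode : uint8_t { kInt, kFloat };

// Hypothetical, simplified tensor view; not the real DLPack struct.
struct TensorView {
  void* data;            // element type is erased
  TypeCode code;         // runtime tag for the element type
  uint8_t bits;          // bit width of one element
  int32_t ndim;          // number of dimensions
  const int64_t* shape;  // typed: dimensions are always int64
};

// Works for any element type because only the typed shape is used.
int64_t NumElements(const TensorView& t) {
  int64_t n = 1;
  for (int32_t i = 0; i < t.ndim; ++i) n *= t.shape[i];
  return n;
}

// Typed access is recovered by checking the runtime tag once.
double SumAsFloat32(const TensorView& t) {
  assert(t.code == TypeCode::kFloat && t.bits == 32);
  const float* p = static_cast<const float*>(t.data);
  const int64_t n = NumElements(t);
  double sum = 0.0;
  for (int64_t i = 0; i < n; ++i) sum += p[i];
  return sum;
}
```

A plain struct like this has no templates in its layout, which is what makes
it easy to pass across library boundaries.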

Tianqi


On Tue, Jun 11, 2019 at 1:26 PM Pedro Larroy <[email protected]>
wrote:

> Another data point. A contributor I am working with is asking for
> access to the graph and the values of the NDArray (as am I) to be
> able to reason more effectively about enhancements to the operators:
>
> https://github.com/apache/incubator-mxnet/pull/15120
>
> I think gathering in the wiki the design constraints for the different
> activities and design proposals that use the graph, along with pain
> points such as the one we are discussing, would be a constructive way to
> move forward, unless you think everything is as good as it gets right
> now, which is what I understand from your responses.
>
> Re the vector of shapes: I know it is one line of code to get it, but you
> can't get the values with a debugger from other points in the code,
> and that compounds with the many other dynamic attributes that you can
> also fetch; it's a bit like dying from a thousand cuts. In the MXNet
> codebase we are paying a price for no benefit in flexibility, as we
> always use those attributes, hence my point that they should be typed in
> the graph. Please try to understand my point and help provide
> solutions, or at least a reason why it can't be improved, instead of
> saying there's no problem. There are still three problems, in order of
> importance: debuggability, clarity of definition of data structures,
> and navigability with an IDE, which breaks with an untyped field keyed
> with a string.
>
> Pedro.
>
> On Tue, Jun 11, 2019 at 11:07 AM Pedro Larroy
> <[email protected]> wrote:
> >
> > To give a recent specific example and focus the discussion (there are
> > as many as there are attributes): the shapes in the graph are a vector
> > of Shape set as an attribute using dmlc::any, which makes it very
> > difficult to debug the shapes when you have a graph object. I would
> > have it as a typed attribute of the graph, as we always need the
> > vector of shapes and operate on it while doing shape inference.
> >
> > On Tue, Jun 11, 2019 at 10:18 AM Pedro Larroy
> > <[email protected]> wrote:
> > >
> > > Thanks for the good discussion.
> > >
> > > I actually wasn't referring particularly to our conversations on
> > > GitHub with respect to the refactors, but it's nice of you to bring
> > > them up. And it's OK to disagree on small things; hopefully we can
> > > align on the big ones.
> > >
> > > I understand that for TVM you might have different constraints with
> > > how dynamic you want to be for mutating the graph and doing quick
> > > experimentation and research but please try to understand my
> > > perspectives coming from a software engineering background and helping
> > > maintain MXNet for thousands of users and teams using it in
> > > production. Let's also consider how many issues we have open and our
> > > bandwidth to deal with additional complexity.
> > >
> > > To your pros and cons I would like to add and emphasize that currently
> > > the heavy use of dynamic attributes in the graph using dmlc::any has
> > > three very negative consequences, at least for MXNet:
> > >
> > > 1 - Makes the data structures using dmlc::any almost impossible to
> > > debug, as they are just binary.
> > > 2 - Makes the code more difficult to understand because there's no
> > > declaration in a data structure of the data fields it uses and its
> > > responsibilities. We are basically shoving all kinds of stuff using
> > > dmlc::any.
> > > 3 - You get no help from the IDE to navigate and refactor as another
> > > consequence.
> > >
> > > I would really like you to give me solutions to these problems, or at
> > > least acknowledge them and tell me why we have to make those
> > > tradeoffs, instead of just dismissing them as engineering taste.
> > >
> > > The more I work with MXNet, the more I wish that debugging, reading
> > > and refactoring the code were easier, and that those fields were
> > > declared and typed in their corresponding data structures. For MXNet
> > > I don't think this would affect anything regarding the Python
> > > bindings, since they go through the typed C API anyway.
> > >
> > > Maybe we can get some inspiration from LLVM as they have bindings for
> > > many languages to work with the AST and have very clean APIs for the
> > > compilation steps. It's also OK to have an initial dynamic codebase
> > > for research and experimentation and then "cure" them into a solid
> > > maintainable one with more types and more robust...
> > >
> > > Pedro.
> > >
> > >
> > >
> > >
> > >
> > > > On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <[email protected]> wrote:
> > > >
> > > > > A good infrastructure design has a long way to go and has a profound
> > > > > impact on the project itself. That is why we always want to rethink
> > > > > whether the interface can be done better, and think about the next
> > > > > possible infrastructure to make things better. Refactoring is
> > > > > certainly part of it.
> > > >
> > > > > There are usually two types of refactoring we refer to:
> > > > > 1) The major design change, in terms of class relations and data
> > > > >    structures (e.g. numpy support, adding compilation to new hardware)
> > > > > 2) The specific choice of API and programming style (more types or
> > > > >    a type-erased program)
> > > >
> > > > > (1) affects the long-term support of the project, introduces new
> > > > > features if necessary, and needs a lot of thought. I believe the
> > > > > general IR, compilation and numpy support belong to that category.
> > > >
> > > > > I would particularly like to talk about (2).
> > > > > Because there is no unified correct answer in software engineering,
> > > > > different developers may prefer different views on a certain problem.
> > > > > Some of them have to do with developers' taste. The change could
> > > > > favor a certain aspect of the project, but not necessarily another
> > > > > part.
> > > > > Refactoring with respect to these sometimes does require a more
> > > > > thoughtful conversation and a reasonable compromise.
> > > >
> > > > > For example, we had a recent discussion about whether to introduce
> > > > > more typing into the code base, to the extent that the base data
> > > > > structure could be templatized.
> > > > > - The pros of this approach:
> > > > >    - It introduces more typing and compile-time error messages
> > > > >      (instead of runtime checking), which could help developers find
> > > > >      problems earlier.
> > > > > - The cons of this approach:
> > > > >    - Having a template in the base data structure causes ABI problems
> > > > >      (code generated by DLL A vs. DLL B) and will have potential
> > > > >      future issues.
> > > > >    - Templates sometimes confuse developers.
> > > > >    - For serialization, it is hard to anticipate all kinds of
> > > > >      classes, and it is easier to have one class (any) that handles
> > > > >      polymorphism.
> > > > >    - Because most frontends (Python) are dynamic, it is easier to
> > > > >      interface them with a type-erased API.
> > > >
> > > > > As we can see, there are pros and cons to bringing in more typing,
> > > > > and there is no unified answer.
> > > > > One good example of a nice infrastructure design trade-off is DLPack:
> > > > > https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h
> > > > > This is a base data structure adopted by MXNet, PyTorch, Chainer,
> > > > > and many other frameworks.
> > > > > It is a type-erased data structure that erases the data type and
> > > > > memory allocator from the data structure, and it is designed to
> > > > > exchange tensors (coming from different memory allocators) across
> > > > > DLL boundaries.
> > > > > As you can see, this is a good example of a type-erased data
> > > > > structure.
> > > >
> > > > > When we have this kind of question, it is important to have a good
> > > > > conversation. Sometimes we have to make tradeoffs rather than bend
> > > > > everyone else to our will. This is what open source is about.
> > > > > I would also like to give some examples of conversations and how
> > > > > design decisions are resolved. They come from the TVM community's
> > > > > recent discussion about VM design.
> > > > > I directly paste the GitHub issue conversations here for the sake of
> > > > > clarity (note that all the conversations are also mirrored to
> > > > > dev@tvm).
> > > > > The background is that the community wants to bring in a virtual
> > > > > machine that can execute dynamic operations more effectively.
> > > >
> > > > > - The initial proposal, made by one of the committers, gave a
> > > > >   detailed design based on a stack VM:
> > > > >   https://github.com/dmlc/tvm/issues/2810
> > > > >    - As you can see, there is quite some discussion about whether
> > > > >      we want to use a different design, in this case a
> > > > >      register-based version.
> > > > >    - The conversation evolves, and while the community members
> > > > >      disagree in some cases, they also agree with each other on
> > > > >      particular tradeoffs.
> > > > > - After some discussion, the committers brought a tradeoff design
> > > > >   that tries to consolidate the needs of both sides, and this is the
> > > > >   final solution being adopted:
> > > > >   https://github.com/dmlc/tvm/issues/2915
> > > > > I would like to particularly highlight the fact that: 1) there are
> > > > > disagreements in the development process; 2) developers work
> > > > > together to understand each other's needs and then reach consensus
> > > > > on a perhaps better design.
> > > >
> > > > > There are two other particular conversations between Pedro and
> > > > > myself, which took place during his contributions.
> > > > > - https://github.com/dmlc/tvm/pull/3037 In this case, I raised a
> > > > >   concern about API consistency, and Pedro brought up a reason why
> > > > >   he thinks it is a better idea. I agreed and we merged the PR.
> > > > > - https://github.com/dmlc/tvm/pull/3108 In this other case, there
> > > > >   are technical reasons for going either way in the case of MXNet.
> > > > >   We listed pros and cons for both sides and had a constructive
> > > > >   conversation. Eventually, I decided not to merge the PR after
> > > > >   weighing all the cases.
> > > >
> > > > > I believe both are useful conversations, and while Pedro and I
> > > > > disagree sometimes, we do agree in many other cases. The most
> > > > > crucial part is having a constructive conversation.
> > > > > To summarize, I do think refactoring and making things better is
> > > > > certainly important for the project. And I do believe it is crucial
> > > > > to think about all the technical consequences and make good tradeoff
> > > > > decisions.
> > > > > Sometimes the decision may not make every developer entirely happy,
> > > > > but a good technical compromise could move the project forward and
> > > > > help the community in general.
> > > >
> > > > Tianqi
> > > >
> > > > > On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <[email protected]> wrote:
> > > >>
> > > >>
> > > >>
> > > > >> On May 31, 2019 14:13:30 CEST, Pedro Larroy <[email protected]> wrote:
> > > > >> > I think Martin does a very good job explaining why refactoring,
> > > > >> > reducing developer frustration and internal improvement is a
> > > > >> > crucial productivity multiplier which includes lower cost to
> > > > >> > ship features, fewer bugs and less time spent debugging.
> > > >>
> > > > >> There's one aspect that's special for open source projects: if a
> > > > >> project wants to survive long term, it should make it easy for
> > > > >> people to get started working on the project. In my experience,
> > > > >> refactoring and cleanup play an important role in that. So thanks
> > > > >> also for making recruiting of new contributors better.
> > > >>
> > > >> Isabel
> > > >> --
> > > >> This message was sent with K-9 from a mobile device with swipe to
> type enabled. I'm sorry for any embarrassing typos that slipped through.
>
