+1. While I personally like Slack, I don't think we should treat it as a public archive: "everything that happens (also) happens on dev@".
Tianqi

On Fri, Apr 12, 2019 at 1:19 AM Marco de Abreu <[email protected]> wrote:

> I'd prefer if we keep discussions on the dev list instead of Slack - feel
> free to open another thread.
>
> -Marco
>
> Pedro Larroy <[email protected]> wrote on Fri, Apr 12, 2019, 02:24:
>
> > I will respond in Slack, so we don't derail the original thread's
> > topic with my points.
> >
> > Looking forward to your proposal.
> >
> > On Thu, Apr 11, 2019 at 1:00 PM Junru Shao <[email protected]> wrote:
> > >
> > > I don't have any ideas about the following issues:
> > >
> > > 1) Reducing the abuse of inlined code by moving more logic to
> > > implementation files and improving scoping, which will also speed up
> > > compilation
> > > 2) Reducing the runtime of some unit tests
> > > 3) Improving MXNet startup time
> > >
> > > Will be super interested to hear about your ideas :-)
> > >
> > > On Thu, Apr 11, 2019 at 12:52 PM Junru Shao <[email protected]> wrote:
> > > >
> > > > We have a systematic solution that avoids the ABI headache. I am
> > > > struggling with some errands, and will share our proposal here as
> > > > soon as I can. This will be a very interesting topic to discuss.
> > > > Let's work hard together and make it perfect :-)
> > > >
> > > > On Thu, Apr 11, 2019 at 12:43 PM Pedro Larroy <[email protected]> wrote:
> > > >
> > > >> Thanks Marco for raising this issue. I think we can certainly make some
> > > >> improvements in modularization and build. At the same time, Tianqi's
> > > >> point of view is important to consider and on point. I see a high risk
> > > >> of overengineering in such an endeavor.
> > > >>
> > > >> I also see increased complexity, difficulty debugging, C++ ABI
> > > >> headaches, API compatibility, crashes inside a binary module, etc.,
> > > >> which I don't want to deal with as a developer or even as an MXNet
> > > >> user. Does somebody have answers to these problems?
> > > >>
> > > >> If somebody thinks they have a good solution, by all means propose a
> > > >> design in the wiki; I think we are all open. Personally, I see several
> > > >> other lower-hanging fruits which need our attention:
> > > >> * Simplifying our build logic,
> > > >> * CUDA selection in CMake,
> > > >> * Reducing the abuse of inlined code by moving more logic to
> > > >>   implementation files and improving scoping, which will also speed up
> > > >>   compilation (some units take more than 5 minutes to build and lots of
> > > >>   RAM on a top-of-the-line CPU core),
> > > >> * Reducing the runtime of some unit tests.
> > > >> And other improvements in our codebase that would bring immediate
> > > >> benefits without the risks of overengineering a plugin system. I
> > > >> also question our bandwidth for such an endeavor.
> > > >> * Improving MXNet startup time,
> > > >> * Thread safety.
> > > >>
> > > >> I would say, let's apply the KISS principle: let's make the project
> > > >> fast to build, easy to work on, well documented and easy to contribute
> > > >> to before building the next Netscape browser. Otherwise we could save
> > > >> ourselves this exercise and switch to Rust directly.
> > > >>
> > > >> Pedro.
> > > >>
> > > >> On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen <[email protected]> wrote:
> > > >> >
> > > >> > Just to clarify: I am not questioning the usefulness of the separation.
> > > >> > I just want to highlight the technical challenges here based on our
> > > >> > past experiences.
> > > >> >
> > > >> > Crossing DLL boundaries in C++ can create quite a lot of problems,
> > > >> > especially when some of the dependencies use a different version of
> > > >> > the compiler, follow static packaging, or simply because of the
> > > >> > dynamic linking differences on Windows. These problems could make this
> > > >> > direction less appealing compared to focusing effort on other things.
> > > >> >
> > > >> > Technically, as a first step, it is possible to make dependency
> > > >> > changes go through registration rather than through the global header
> > > >> > files, so that changing a certain component won't trigger a global
> > > >> > recompile in CMake. This is also a required step toward some
> > > >> > modularity.
> > > >> >
> > > >> > For plugins, solutions that use a C ABI can be used for certain plugin
> > > >> > modules.
> > > >> >
> > > >> > Some of the discussion has been tied to what the interface should look
> > > >> > like. I think we should use different threads for these and put more
> > > >> > thought into them.
> > > >> >
> > > >> > Tianqi
> > > >> >
> > > >> > On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <[email protected]> wrote:
> > > >> >
> > > >> > > I think we can make some incremental progress. My thoughts were
> > > >> > > along the lines of plugins (thinking about what happens with the VLC
> > > >> > > project). At process launch time we could gather some information
> > > >> > > about our execution environment (either through configuration, or by
> > > >> > > convention, looking at our folder structure and the libraries
> > > >> > > available). We could then later load the components we need after
> > > >> > > understanding whether we're using a CUDA backend and what operators
> > > >> > > or subgraph components we would need. Advantages would be that we
> > > >> > > would move a lot of the current conditional compile logic to
> > > >> > > runtime, and automate a lot of it. It would also make packaging
> > > >> > > binaries for targeted environments a little easier. As an example,
> > > >> > > we could compile once, then remove CUDA-focused libraries for
> > > >> > > systems that are going to run on CPUs.
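
The registration step mentioned above (components announce themselves to a registry instead of being listed in a global header) could look roughly like the following. This is only a minimal sketch under stated assumptions: `Backend`, `BackendRegistry` and `CpuBackend` are hypothetical names made up for illustration and are not existing MXNet types.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface for illustration only; not an MXNet type.
struct Backend {
  virtual ~Backend() = default;
  virtual std::string Name() const = 0;
};

// Hypothetical registry: maps a backend name to a factory function.
class BackendRegistry {
 public:
  using Factory = std::function<std::unique_ptr<Backend>()>;

  // Function-local static, so it is safe to use from static initializers.
  static BackendRegistry& Global() {
    static BackendRegistry instance;
    return instance;
  }

  // Returns false if a backend with this name is already registered.
  bool Register(const std::string& name, Factory factory) {
    assert(factory != nullptr);
    return factories_.emplace(name, std::move(factory)).second;
  }

  // Returns nullptr if no backend with this name was registered.
  std::unique_ptr<Backend> Create(const std::string& name) const {
    auto it = factories_.find(name);
    if (it == factories_.end()) return nullptr;
    return it->second();
  }

 private:
  std::map<std::string, Factory> factories_;
};

// Everything below would live in the backend's own .cc file. Adding or
// changing a backend then touches only that file, not a shared header.
namespace {

struct CpuBackend : Backend {
  std::string Name() const override { return "cpu"; }
};

// Runs before main() and registers the backend as a side effect.
const bool kCpuRegistered = BackendRegistry::Global().Register(
    "cpu", [] { return std::make_unique<CpuBackend>(); });

}  // namespace
```

The core would then call `BackendRegistry::Global().Create("cpu")` and get `nullptr` back for any backend that was not linked in, which is one way to move a compile-time `#if` into a runtime check.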
> > > >> > >
> > > >> > > On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen <[email protected]> wrote:
> > > >> > >
> > > >> > > > While I personally like the idea, this can be something that is
> > > >> > > > fairly technically challenging, and I would caution against this
> > > >> > > > idea vs. pushing for good features and just allowing runtime
> > > >> > > > configuration.
> > > >> > > >
> > > >> > > > The main problem here is due to the C++ ABI. There is no standard
> > > >> > > > C++ ABI across compilers, which means resorting to runtime DLLs
> > > >> > > > and dynamic loading brings all sorts of technical problems,
> > > >> > > > especially when multiple modules depend on the same third-party
> > > >> > > > dependency (the CUDA runtime).
> > > >> > > > There is no ready-made solution here, especially given the
> > > >> > > > explosion of backend variants and dependencies in C++.
> > > >> > > > A partial solution could be achieved through the sole use of a C
> > > >> > > > ABI. Combining this with code generation can result in some
> > > >> > > > simplifications and enable some runtime-loadable modules. TVM does
> > > >> > > > this, and perhaps MXNet could reuse some of that component for
> > > >> > > > operator libraries. Similarly, having a customizable operator
> > > >> > > > library that is loadable via a C ABI might be possible.
> > > >> > > >
> > > >> > > > So to summarize: while I really like the idea of dynamically
> > > >> > > > loadable modules, my past experience suggests that this will bring
> > > >> > > > a lot of additional engineering burden and technical debt without
> > > >> > > > significant benefit. I would suggest starting by supporting
> > > >> > > > something simple like a plugin module, before moving toward the
> > > >> > > > general direction.
> > > >> > > >
> > > >> > > > Tianqi
> > > >> > > >
> > > >> > > > On Sun, Apr 7, 2019 at 1:31 PM kellen sunderland <[email protected]> wrote:
> > > >> > > >
> > > >> > > > > Strongly support the idea of runtime-loadable components in
> > > >> > > > > MXNet. There's no reason (other than perhaps engineering effort)
> > > >> > > > > we can't have a single compilation of MXNet that finds
> > > >> > > > > dependencies and chooses execution paths intelligently (or based
> > > >> > > > > on configuration) at runtime.
> > > >> > > > >
> > > >> > > > > On Thu, Apr 4, 2019 at 12:29 PM Marco de Abreu <[email protected]> wrote:
> > > >> > > > >
> > > >> > > > > > Hello,
> > > >> > > > > >
> > > >> > > > > > I'd like to start a discussion about something that I've
> > > >> > > > > > noticed being troublesome to maintain in the current version:
> > > >> > > > > > backend choices being made at compile time.
> > > >> > > > > >
> > > >> > > > > > Right now, the different backends and accelerators (CPU, CUDA,
> > > >> > > > > > MKL, AWS Elastic Inference, (future) AMD, OpenBLAS, TVM, etc.)
> > > >> > > > > > are all scattered across the different layers of MXNet. On one
> > > >> > > > > > hand, we have compile-time flags that decide which backends
> > > >> > > > > > are being compiled into the binary, while at the same time
> > > >> > > > > > choices can be made in the frontend during runtime.
> > > >> > > > > >
> > > >> > > > > > At the moment, we have a lot of conditional build logic that
> > > >> > > > > > picks different parts. With the addition of MKLML and later
> > > >> > > > > > MKLDNN the clear separation of CPU and GPU got kind of broken
> > > >> > > > > > up.
> > > >> > > > > > While we have some places where each piece of code lives, in
> > > >> > > > > > the end we resort to some files containing a lot of
> > > >> > > > > > conditional logic for the different backends (sorry I can't
> > > >> > > > > > provide links right now since I'm on mobile). To me this
> > > >> > > > > > seems like a residue of the fast development style from the
> > > >> > > > > > early days (more preprocessor statements and less object
> > > >> > > > > > orientation) while also having organic growth with new
> > > >> > > > > > accelerators. When I see how much AMD had to hack to fit in
> > > >> > > > > > their implementation, it seems like we have to make this part
> > > >> > > > > > more developer friendly.
> > > >> > > > > >
> > > >> > > > > > At the moment, every new flavour of MXNet has to be entirely
> > > >> > > > > > recompiled. This makes it hard for users to figure out which
> > > >> > > > > > options to use, while it makes it harder for us to test since
> > > >> > > > > > the overhead to test every single combination of compile
> > > >> > > > > > parameters would be overwhelming.
> > > >> > > > > >
> > > >> > > > > > I'd propose to have a clear class-hierarchy-based structure
> > > >> > > > > > for accelerators, operators and memory management. This
> > > >> > > > > > structure can then be implemented by the different backends.
> > > >> > > > > > To reduce the compile burden, we would introduce dynamic
> > > >> > > > > > loading and split the different backends into modules. These
> > > >> > > > > > could then be developed, maintained and compiled on their own
> > > >> > > > > > and then placed in a "module" folder to be loaded at runtime.
> > > >> > > > > > Adding a new accelerator would be a matter of placing the
> > > >> > > > > > precompiled binary into the folder. The detailed
> > > >> > > > > > configuration of that backend would then be done at runtime -
> > > >> > > > > > the user shouldn't have to worry at the point of downloading
> > > >> > > > > > MXNet whether they want MKL, MKLDNN, OpenBLAS, ATLAS, TVM,
> > > >> > > > > > CUDA or whatever else there is. I have an idea how we could
> > > >> > > > > > help the user choose, but that's outside the scope of this
> > > >> > > > > > proposal.
> > > >> > > > > >
> > > >> > > > > > This would allow us to have a "core" MXNet that takes care of
> > > >> > > > > > the engine, scheduling, communication and all the other
> > > >> > > > > > crucial parts. On the other hand we could make MXNet less of
> > > >> > > > > > a monolith and have clear interfaces. This would also act as
> > > >> > > > > > a forcing function because the different parts wouldn't be
> > > >> > > > > > intermingled but have to follow the common interface.
> > > >> > > > > >
> > > >> > > > > > Of course this comes with the question what these interfaces
> > > >> > > > > > would look like. For operators, I'd like to propose getting
> > > >> > > > > > inspired by (or fully adopting) ONNX. For memory management
> > > >> > > > > > and other backend-specific things we could look at the
> > > >> > > > > > current implementations and find common ground.
> > > >> > > > > >
> > > >> > > > > > Back when I had a community-driven project, we heavily used
> > > >> > > > > > this modularity and it brought great benefits - besides the
> > > >> > > > > > fact that our core was closed source. It allowed community
> > > >> > > > > > developers to act entirely independently from other parts and
> > > >> > > > > > even allowed them to add their own logic without having to
> > > >> > > > > > touch the core. Thinking about companies that implement their
> > > >> > > > > > own backends or have specially tweaked operators without
> > > >> > > > > > wanting to disclose them, this structure would avoid them
> > > >> > > > > > having to fork the project and then spend a lot of effort
> > > >> > > > > > porting the changes to the latest source release versions.
> > > >> > > > > > Instead, they would maintain their module and we as the MXNet
> > > >> > > > > > community would only have to maintain these interfaces.
> > > >> > > > > >
> > > >> > > > > > Right now this is a lot of prose and basically a brain dump
> > > >> > > > > > of my thoughts. I'd be happy to follow up with details, but
> > > >> > > > > > first I'd be curious what the community thinks about this
> > > >> > > > > > design.
> > > >> > > > > >
> > > >> > > > > > Best regards,
> > > >> > > > > > Marco
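
To make the two positions in this thread concrete (a class hierarchy inside the core, but only a C ABI across the module boundary), here is a minimal sketch. Everything in it is an assumption made for illustration: `ModuleApi`, `kAbiVersion`, `ModuleBackend` and `example_module_get_api` are hypothetical names, not an existing MXNet or TVM interface. In a real build the entry point would live in a separate shared library and be resolved through the platform's dynamic loader; it is defined in the same file here only so the sketch stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// ---- C ABI boundary (hypothetical, for illustration only) ----
// Plain C types and function pointers only: no classes, no exceptions,
// no STL types cross this boundary.
extern "C" {
typedef struct ModuleApi {
  std::uint32_t abi_version;  // bumped on every incompatible change
  const char* name;           // e.g. "cpu"
  // Element-wise add; returns 0 on success, non-zero on failure.
  int (*add)(const float* a, const float* b, float* out, std::size_t n);
} ModuleApi;
}

constexpr std::uint32_t kAbiVersion = 1;

// ---- Core side: ordinary C++ class hierarchy ----
class Backend {
 public:
  virtual ~Backend() = default;
  virtual std::string Name() const = 0;
  virtual std::vector<float> Add(const std::vector<float>& a,
                                 const std::vector<float>& b) const = 0;
};

// Adapts a C function table to the C++ interface used inside the core.
class ModuleBackend : public Backend {
 public:
  explicit ModuleBackend(const ModuleApi* api) : api_(api) {
    if (api_ == nullptr || api_->abi_version != kAbiVersion ||
        api_->name == nullptr || api_->add == nullptr) {
      throw std::runtime_error("incompatible or incomplete module");
    }
  }

  std::string Name() const override { return api_->name; }

  std::vector<float> Add(const std::vector<float>& a,
                         const std::vector<float>& b) const override {
    if (a.size() != b.size()) throw std::invalid_argument("size mismatch");
    std::vector<float> out(a.size());
    if (api_->add(a.data(), b.data(), out.data(), out.size()) != 0) {
      throw std::runtime_error("module call failed");
    }
    return out;
  }

 private:
  const ModuleApi* api_;
};

// ---- Module side: would be compiled into its own shared library ----
extern "C" {
static int cpu_add(const float* a, const float* b, float* out,
                   std::size_t n) {
  assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
  return 0;
}

static const ModuleApi kCpuApi = {kAbiVersion, "cpu", &cpu_add};

// The single symbol the core would look up in the module.
const ModuleApi* example_module_get_api() { return &kCpuApi; }
}
```

The version check in the `ModuleBackend` constructor is the part that addresses the compatibility worries raised in the thread: a module built against a different table layout is rejected at load time instead of crashing later inside a call.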
