+1. Thanks, Marco, for sharing this! It is great to see that people agree with this feature. We have actually been planning this for a while, and we would love to share the plan as soon as possible.
On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen <[email protected]> wrote:

> Just to clarify: I am not questioning the usefulness of the separation.
> I just want to highlight the technical challenges here based on our past
> experiences.
>
> Crossing DLL boundaries in C++ can create quite a lot of problems,
> especially if some of the dependencies use a different version of the
> compiler or follow static packaging, or simply because of the differences
> in dynamic linking on Windows. These problems could make this direction
> less appealing compared to focusing effort on other things.
>
> Technically, as a first step, it is possible to make dependency changes
> not touch the global header files, by going through registration, so
> that changing a certain component won't trigger a global recompile in
> CMake. This is also a required step toward some modularity.
>
> For plugins, solutions that use the C ABI can be used for certain plugin
> modules.
>
> Some of the discussion has been tied to what the interface should look
> like. I think we should use different threads for these and put more
> thought into them.
>
> Tianqi
>
> On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <[email protected]> wrote:
>
> > I think we can make some incremental progress. My thoughts were along
> > the lines of plugins (thinking about what happens with the VLC
> > project). At process launch time we could gather some information
> > about our execution environment (either through configuration, or by
> > convention, looking at our folder structure and the libraries
> > available). We could then later load the components we need after
> > understanding if we're using a CUDA backend and what operators or
> > subgraph components we would need. Advantages would be that we would
> > move a lot of the current conditional compile logic to runtime, and
> > automate a lot of it. It would also make packaging binaries for
> > targeted environments a little easier.
> > As an example, we could compile once, then remove CUDA-focused
> > libraries for systems that are going to run on CPUs.
> >
> > On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen <[email protected]> wrote:
> >
> > > While I personally like the idea, this can be something that is
> > > fairly technically challenging, and I would caution against this
> > > idea versus pushing for good features and just allowing runtime
> > > configuration.
> > >
> > > The main problem here is due to the C++ ABI. There is no standard
> > > C++ ABI across compilers, which means resorting to runtime DLLs and
> > > dynamic loading brings all sorts of technical problems, especially
> > > when multiple modules depend on the same third-party dependency
> > > (CUDA runtime).
> > > No good go-to solution can be made here, especially given the
> > > explosion of backend variants and dependencies in C++.
> > > A partial solution could be achieved through the sole use of the C
> > > ABI. Combining this with code generation can result in some
> > > simplifications and enable some runtime-loadable modules. TVM does
> > > this, and perhaps MXNet could reuse some of those components for
> > > operator libraries. Similarly, having a customizable operator
> > > library that is loadable via the C ABI might be possible.
> > >
> > > So to summarize: while I really like the idea of dynamically
> > > loadable modules, my past experience suggests that this will bring
> > > a lot of additional engineering burden and technical debt without
> > > significant benefit. I would suggest starting by supporting
> > > something simple like a plugin module before moving toward the
> > > general direction.
> > >
> > > Tianqi
> > >
> > > On Sun, Apr 7, 2019 at 1:31 PM kellen sunderland <[email protected]> wrote:
> > >
> > > > Strongly support the idea of runtime loadable components in MXNet.
> > > > There's no reason (other than perhaps engineering effort) we can't
> > > > have a single compilation of MXNet that finds dependencies and
> > > > chooses execution paths intelligently (or based on configuration)
> > > > at runtime.
> > > >
> > > > On Thu, Apr 4, 2019 at 12:29 PM Marco de Abreu <[email protected]> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I'd like to start a discussion about something that I've noticed
> > > > > being troublesome to maintain in the current version: backend
> > > > > choices being made at compile time.
> > > > >
> > > > > Right now, the different backends and accelerators (CPU, CUDA,
> > > > > MKL, AWS Elastic Inference, (future) AMD, OpenBLAS, TVM, etc.)
> > > > > are all scattered across the different layers of MXNet. On one
> > > > > hand, we have compile-time flags that decide which backends are
> > > > > being compiled into the binary, while at the same time choices
> > > > > can be made in the frontend during runtime.
> > > > >
> > > > > At the moment, we have a lot of conditional build logic that
> > > > > picks different parts. With the addition of MKLML and later
> > > > > MKLDNN, the clear separation of CPU and GPU got kind of broken
> > > > > up. While we have some places where each code lives, in the end
> > > > > we resort to some files containing a lot of conditional logic
> > > > > for the different backends (sorry I can't provide links right
> > > > > now since I'm on mobile). To me this seems like a residue of the
> > > > > fast development style from the early days (more preprocessor
> > > > > statements and less object orientation) while also having
> > > > > organic growth with new accelerators. When I see how much AMD
> > > > > had to hack to fit in their implementation, it seems like we
> > > > > have to make this part more developer friendly.
> > > > >
> > > > > At the moment, every new flavour of MXNet has to be entirely
> > > > > recompiled. This makes it hard for users to figure out which
> > > > > options to use, while it makes it harder for us to test, since
> > > > > the overhead to test every single combination of compile
> > > > > parameters would be overwhelming.
> > > > >
> > > > > I'd propose to have a clear class-hierarchy-based structure for
> > > > > accelerators, operators and memory management. This structure
> > > > > can then be implemented by the different backends. To reduce the
> > > > > compile burden, we would introduce dynamic loading and split the
> > > > > different backends into modules. These could then be developed,
> > > > > maintained and compiled on their own and then placed in a
> > > > > "module" folder to be loaded at runtime. Adding a new
> > > > > accelerator would be a matter of placing the precompiled binary
> > > > > into the folder. The detailed configuration of that backend
> > > > > would then be done at runtime - the user shouldn't have to worry
> > > > > at the point of downloading MXNet whether they want MKL, MKLDNN,
> > > > > OpenBLAS, ATLAS, TVM, CUDA or whatever else there is. I have an
> > > > > idea how we could help the user choose, but that's outside the
> > > > > scope of this proposal.
> > > > >
> > > > > This would allow us to have a "core" MXNet that takes care of
> > > > > the engine, scheduling, communication and all the other crucial
> > > > > parts. On the other hand, we could make MXNet less of a monolith
> > > > > and have clear interfaces. This would also act as a forcing
> > > > > function because the different parts wouldn't be intermingled
> > > > > but would have to follow the common interface.
> > > > >
> > > > > Of course this comes with the question of what these interfaces
> > > > > would look like.
> > > > > For operators, I'd like to propose taking inspiration from (or
> > > > > fully adopting) ONNX. For memory management and other
> > > > > backend-specific things, we could look at the current
> > > > > implementations and find common ground.
> > > > >
> > > > > Back when I had a community-driven project, we heavily used this
> > > > > modularity and it brought great benefits - besides the fact that
> > > > > our core was closed source. It allowed community developers to
> > > > > act entirely independently from other parts and even allowed
> > > > > them to add their own logic without having to touch the core.
> > > > > Thinking about companies that implement their own backends or
> > > > > have specially tweaked operators without wanting to disclose
> > > > > them, this structure would avoid them having to fork the project
> > > > > and then spend a lot of effort porting the changes to the latest
> > > > > source release versions. Instead, they would maintain their
> > > > > module, and we as the MXNet community would only have to
> > > > > maintain these interfaces.
> > > > >
> > > > > Right now this is a lot of prose and basically a brain dump of
> > > > > my thoughts. I'd be happy to follow up with details, but first
> > > > > I'd be curious what the community thinks about this design.
> > > > >
> > > > > Best regards,
> > > > > Marco
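
To make the C ABI plugin idea from this thread a bit more concrete, here is a minimal sketch. It is only an illustration under assumptions, not MXNet code: the `BackendPlugin` struct, `BackendRegistry` class, `kAbiVersion` constant and the `cpu` / `fake_gpu` backends are all hypothetical names. In a real setup each descriptor would presumably come from a shared library found in the "module" folder and opened with `dlopen`/`dlsym` (or `LoadLibrary` on Windows); here the descriptors live in the same file so the example stays self-contained. The point it tries to show is that only plain C types and function pointers cross the plugin boundary, and that the backend is picked at runtime with a fallback.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

extern "C" {
// Hypothetical plugin descriptor: plain C types and function pointers only,
// so no C++ ABI details (name mangling, STL layout) cross the boundary.
typedef struct BackendPlugin {
  const char* name;                 // backend identifier, e.g. "cpu"
  int abi_version;                  // checked by the core before use
  int (*is_available)(void);        // runtime probe: 1 if usable, else 0
  int (*add)(const float* a, const float* b, float* out, int n);  // 0 on success
} BackendPlugin;

// A trivial CPU backend that is always available.
static int cpu_is_available(void) { return 1; }
static int cpu_add(const float* a, const float* b, float* out, int n) {
  for (int i = 0; i < n; ++i) out[i] = a[i] + b[i];
  return 0;
}

// Stand-in for an accelerator whose runtime is missing on this machine.
static int fake_gpu_is_available(void) { return 0; }
static int fake_gpu_add(const float*, const float*, float*, int) { return -1; }
}  // extern "C"

// Assumed interface version that the core understands.
constexpr int kAbiVersion = 1;

static const BackendPlugin kCpuPlugin = {"cpu", kAbiVersion, cpu_is_available,
                                         cpu_add};
static const BackendPlugin kFakeGpuPlugin = {"fake_gpu", kAbiVersion,
                                             fake_gpu_is_available,
                                             fake_gpu_add};

// Core-side registry: backends register themselves, the core never includes
// backend headers, and selection happens at runtime.
class BackendRegistry {
 public:
  // Rejects plugins built against a different interface version.
  bool add_plugin(const BackendPlugin* plugin) {
    assert(plugin != nullptr);
    if (plugin->abi_version != kAbiVersion) return false;
    plugins_.push_back(plugin);
    return true;
  }

  // Returns the preferred backend if it is registered and available,
  // otherwise the first available one, otherwise nullptr.
  const BackendPlugin* select(const char* preferred) const {
    for (const BackendPlugin* p : plugins_) {
      if (std::strcmp(p->name, preferred) == 0 && p->is_available()) return p;
    }
    for (const BackendPlugin* p : plugins_) {
      if (p->is_available()) return p;
    }
    return nullptr;
  }

 private:
  std::vector<const BackendPlugin*> plugins_;
};
```

A caller would register whatever descriptors it finds, ask for its preferred backend by name, and then go through the returned function pointers. Asking for `fake_gpu` in this sketch falls back to `cpu`, which is roughly the "compile once, drop the CUDA libraries on CPU-only systems" behaviour described above.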
