Re: [Discussion] Remove bundled llvm OpenMP

Anton Chernov Wed, 22 May 2019 05:26:10 -0700

I don't have the necessary rights to reopen this PR.

On Mon, 20 May 2019 at 08:00, Pedro Larroy <[email protected]> wrote:


> Hi Anton, Stas.
>
> Can we reopen this PR and get it merged as per the data collected by Stas?
>
> https://github.com/apache/incubator-mxnet/pull/12160
>
>
> https://cwiki.apache.org/confluence/display/MXNET/Benchmarking+MXNet+with+different+OpenMP+implementations
>
> There are multiple issues that will be fixed by solving this problem.
>
>
> Pedro
>
> On Tue, Feb 12, 2019 at 4:54 AM Anton Chernov <[email protected]> wrote:
> >
> > I would like to propose a possible alternative solution for consideration.
> >
> > If keeping llvm OpenMP as a submodule is inevitable, one could make the
> > following adjustments:
> >
> > Since compilers try to find their own OpenMP library implicitly, MXNet
> > needs to ensure that only the bundled version is found. Therefore during
> > the build and also during deployment this library has to provide symlinks
> > for each possible compiler that would link to the built artifact, i.e.
> >
> > libiomp.so -> libgomp.so -> libomp.so
> >
> > The MKLML iomp would need to be hidden and removed as well.
> >
> > On Windows it would be a different story, but as can be seen [1] bundled
> > OpenMP was not included in the Windows build anyway.
> >
> > Alternatively: always use the iomp provided by the MKLML distribution [2]
> > (with the same symlinking trick, though). This could potentially work on
> > Windows as well.
> >
> > Best
> > Anton
> >
> > [1] https://github.com/apache/incubator-mxnet/blob/8a63bdecf2d9f12d34fe5874957ae4c867eb5f5b/CMakeLists.txt#L408-L410
> > [2] https://github.com/intel/mkl-dnn/releases
> >
> > On Tue, 12 Feb 2019 at 11:22, Anton Chernov <[email protected]> wrote:
> >
> > > Recent benchmarking results have been published here [1]. Experiments
> > > compare different OpenMP implementations as well as binaries compiled
> > > with different compilers including GCC, Clang and ICC.
> > >
> > > During experimentation another issue with mixing up libraries was
> > > identified and described here [2].
> > >
> > > Best
> > > Anton
> > >
> > > [1] https://cwiki.apache.org/confluence/x/2wclBg
> > > [2] https://github.com/apache/incubator-mxnet/issues/14087#issuecomment-461734041
> > >
> > >
> > > On Sun, 9 Dec 2018 at 16:28, Anton Chernov <[email protected]> wrote:
> > >
> > >> Hi Chris,
> > >>
> > >> Following up on the issue, are all things resolved in the discussion?
> > >>
> > >> If yes, I kindly ask you to reopen this PR and remove the ‘requesting
> > >> changes’ status:
> > >> https://github.com/apache/incubator-mxnet/pull/12160
> > >>
> > >> Thank you.
> > >>
> > >>
> > >> Best
> > >> Anton
> > >>
> > >>
> > >> On Tue, 27 Nov 2018 at 17:15, Anton Chernov <[email protected]> wrote:
> > >>
> > >>> Another thing to take into consideration:
> > >>>
> > >>> All Python artefacts that are created (PyPI) are built with make and
> > >>> do not use the bundled OpenMP library.
> > >>>
> > >>> One step for the switch to CMake to happen is the approval and
> > >>> merging of the mentioned PR:
> > >>>
> > >>> https://github.com/apache/incubator-mxnet/pull/12160
> > >>>
> > >>> If there are no other objections I kindly ask Chris Olivier to remove
> > >>> his 'requesting changes' veto on it to unblock the CMake overhaul
> > >>> work.
> > >>>
> > >>> Thank you.
> > >>>
> > >>> Best
> > >>> Anton
> > >>>
> > >>> On Thu, 22 Nov 2018 at 17:11, Anton Chernov <[email protected]> wrote:
> > >>>
> > >>>>
> > >>>> Thank you for your answer, Chris.
> > >>>>
> > >>>> > The whole “mixing omp libraries” is something that occurs in
> > >>>> > production every day and certainly in everything that uses mkl.
> > >>>>
> > >>>> I'm afraid this statement is wrong. Intel MKL-DNN strictly ensures
> > >>>> that this mixture does not happen:
> > >>>>
> > >>>> "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP
> > >>>> runtime library to work. As different OpenMP runtimes may not be
> > >>>> binary compatible it's important to ensure that only one OpenMP
> > >>>> runtime is used throughout the application. Having more than one
> > >>>> OpenMP runtime initialized may lead to undefined behavior resulting
> > >>>> in incorrect results or crashes." [1]
> > >>>>
> > >>>> That is why 2 different MKLML libraries are provided:
> > >>>>
> > >>>> lib/libmklml_gnu.so   | Intel MKL small library for GNU* OpenMP runtime
> > >>>> lib/libmklml_intel.so | Intel MKL small library for Intel(R) OpenMP runtime
> > >>>>
> > >>>> > is the suggestion that libiomp be removed from mkl?
> > >>>>
> > >>>> That is certainly not my suggestion.
> > >>>>
> > >>>> > have you spoken with intel? have you consulted Intel at all?
> > >>>>
> > >>>> Yes, I have asked for comments on the issue.
> > >>>>
> > >>>> > “hard to debug random crash”. you’re seeing an assertion which is
> > >>>> > probably ...
> > >>>>
> > >>>> I'm seeing the result of undefined behaviour. And I want to put
> > >>>> emphasis on the following statement:
> > >>>>
> > >>>> Regardless of whether there is a particular reason for the assert,
> > >>>> it is a result of behaviour that should not happen. There are valid
> > >>>> ways to use llvm OpenMP in MXNet, and the current way is not one of
> > >>>> them.
> > >>>>
> > >>>> > The lack of root-causing the problem and knee-jerk solution here
> > >>>> > makes me uncomfortable.
> > >>>>
> > >>>> I hope that my efforts to highlight the problems reach you and ease
> > >>>> your discomfort.
> > >>>>
> > >>>> > if you want to see performance differences there’s an environment
> > >>>> > variable you can set in the mxnet omp tuning code that will print
> > >>>> > overhead and execution times for the current omp library.
> > >>>>
> > >>>> I don't want to see performance differences in the current OpenMP
> > >>>> library. I want to remove the current OpenMP library and use the one
> > >>>> provided by the compiler.
> > >>>>
> > >>>>
> > >>>>
> > >>>> Best
> > >>>> Anton
> > >>>>
> > >>>> [1] https://github.com/intel/mkl-dnn/blame/master/README.md#L261-L265
> > >>>>
> > >>>> On Thu, 22 Nov 2018 at 16:50, Chris Olivier <[email protected]> wrote:
> > >>>>
> > >>>>> Do you not work on CI mostly? My apologies for thinking that was
> > >>>>> some sort of team effort between you and a few others that were
> > >>>>> passionate about keeping the CI system running smoothly.
> > >>>>>
> > >>>>> You have source code, you have the line the assertion is on. If you
> > >>>>> can’t describe what’s going wrong that causes the assertion, then I
> > >>>>> don’t really have anything more to add to this conversation beyond
> > >>>>> what’s below:
> > >>>>>
> > >>>>> The whole “mixing omp libraries” is something that occurs in
> > >>>>> production every day and certainly in everything that uses mkl.  It
> > >>>>> may occasionally cause problems for some edge cases when there are
> > >>>>> super-complex linking strategies and dynamic loading.  But this is
> > >>>>> not one of those edge cases. Mostly blaming this is a red herring
> > >>>>> for other thread-related problems and people switch omp library and
> > >>>>> the timing of their code changes and they stop seeing the problem.
> > >>>>> I’ve spent my entire career doing heavily multithreaded c++
> > >>>>> development and i’ve seen that a million times. is the suggestion
> > >>>>> that libiomp be removed from mkl? have you spoken with intel? have
> > >>>>> you consulted Intel at all?
> > >>>>>
> > >>>>> and what you are seeing isn’t some “hard to debug random crash”.
> > >>>>> you’re seeing an assertion which is probably related to omp trying
> > >>>>> to create a thread pool after a fork and something was done in the
> > >>>>> mxnet code to make that sketchy to do. I’d suggest filing an issue
> > >>>>> with the llvm openmp just like you’d file with any other
> > >>>>> not-well-understood behavior in mxnet.
> > >>>>>
> > >>>>> The lack of root-causing the problem and knee-jerk solution here
> > >>>>> makes me uncomfortable.
> > >>>>>
> > >>>>> if you want to see performance differences there’s an environment
> > >>>>> variable you can set in the mxnet omp tuning code that will print
> > >>>>> overhead and execution times for the current omp library.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Thu, Nov 22, 2018 at 7:12 AM Anton Chernov <[email protected]> wrote:
> > >>>>>
> > >>>>> > Hi Chris,
> > >>>>> >
> > >>>>> > Thank you for your answer. As you may have noticed, the initial
> > >>>>> > email comes from me, Anton Chernov (@lebeg on GitHub), and thus
> > >>>>> > the proposal is not from any 'Ci' team that you've mentioned, but
> > >>>>> > from me personally.
> > >>>>> >
> > >>>>> > You are writing:
> > >>>>> >
> > >>>>> > > someone is doing something unhealthy when they fork ...
> > >>>>> >
> > >>>>> > I'm missing any context to understand what you mean.
> > >>>>> >
> > >>>>> > > we get a lot of performance gain from OMP ...
> > >>>>> >
> > >>>>> > There is no data that would prove this statement and therefore it
> > >>>>> > is a random guess.
> > >>>>> >
> > >>>>> > > in many months, no investigation has occurred as to WHY the
> > >>>>> > > assertion is failing.
> > >>>>> >
> > >>>>> > The investigation has concluded that this is happening due to
> > >>>>> > undefined behaviour which is, in my opinion, a sufficient answer
> > >>>>> > that does not require going any deeper.
> > >>>>> >
> > >>>>> > > the pr is vetoed until such a time that the actual root cause
> > >>>>> > > of the problem is known.
> > >>>>> >
> > >>>>> > And considering the statements above there is no valid reason to
> > >>>>> > veto the PR.
> > >>>>> >
> > >>>>> >
> > >>>>> > Best
> > >>>>> > Anton
> > >>>>> >
> > >>>>> > On Thu, 22 Nov 2018 at 15:38, Chris Olivier <[email protected]> wrote:
> > >>>>> >
> > >>>>> > > 3x less overhead*
> > >>>>> > >
> > >>>>> > > On Thu, Nov 22, 2018 at 6:25 AM Chris Olivier <[email protected]> wrote:
> > >>>>> > >
> > >>>>> > > > someone is doing something unhealthy when they fork, which is
> > >>>>> > > > causing an assertion in the openmp library. the same assertion
> > >>>>> > > > that would fire in mkl, which is linked to libiomp5 (exact
> > >>>>> > > > same omp library). this is new behavior and most likely due to
> > >>>>> > > > an error or suboptimal approach in the forking logic in mxnet.
> > >>>>> > > >
> > >>>>> > > > in order to circumvent the assert, the Ci team is proposing to
> > >>>>> > > > remove the library completely which is equivalent to cutting
> > >>>>> > > > off your leg to make the pain from stubbing your toe go away.
> > >>>>> > > >
> > >>>>> > > > we get a lot of performance gain from OMP. it has about 1/3
> > >>>>> > > > less overhead for entering omp regions and also supports omp
> > >>>>> > > > regions after a fork, which libgomp does not.
> > >>>>> > > >
> > >>>>> > > > in many months, no investigation has occurred as to WHY the
> > >>>>> > > > assertion is failing.
> > >>>>> > > >
> > >>>>> > > > the pr is vetoed until such a time that the actual root cause
> > >>>>> > > > of the problem is known.
> > >>>>> > > >
> > >>>>> > > >
> > >>>>> > > > thanks,
> > >>>>> > > >
> > >>>>> > > > -Chris.
> > >>>>> > > >
> > >>>>> > > >
> > >>>>> > > >
> > >>>>> > > >
> > >>>>> > > > On Thu, Nov 22, 2018 at 4:36 AM Anton Chernov <[email protected]> wrote:
> > >>>>> > > >
> > >>>>> > > >> Dear MXNet community,
> > >>>>> > > >>
> > >>>>> > > >> I would like to draw attention to an important issue that is
> > >>>>> > > >> present in the MXNet CMake build: the use of the bundled llvm
> > >>>>> > > >> OpenMP library.
> > >>>>> > > >>
> > >>>>> > > >> I have opened a PR to remove it:
> > >>>>> > > >> https://github.com/apache/incubator-mxnet/pull/12160
> > >>>>> > > >>
> > >>>>> > > >> The PR was closed, but I am firmly of the opinion that
> > >>>>> > > >> removing the library is the right thing to do.
> > >>>>> > > >>
> > >>>>> > > >> *Background*
> > >>>>> > > >> If you want to use OpenMP pragmas in your code for
> > >>>>> > > >> parallelization you would supply a special flag to the
> > >>>>> > > >> compiler:
> > >>>>> > > >>
> > >>>>> > > >> - Clang / -fopenmp
> > >>>>> > > >> https://openmp.llvm.org/
> > >>>>> > > >>
> > >>>>> > > >> - GCC / -fopenmp
> > >>>>> > > >> https://gcc.gnu.org/onlinedocs/libgomp/Enabling-OpenMP.html
> > >>>>> > > >>
> > >>>>> > > >> - Intel / [Q]openmp
> > >>>>> > > >> https://software.intel.com/en-us/node/522689#6E24682E-F411-4AE3-A04D-ECD81C7008D1
> > >>>>> > > >>
> > >>>>> > > >> - Visual Studio: /openmp (Enable OpenMP 2.0 Support)
> > >>>>> > > >> https://msdn.microsoft.com/en-us/library/tt15eb9t.aspx
> > >>>>> > > >>
> > >>>>> > > >> Each of the compilers would enable the '#pragma omp'
> > >>>>> > > >> directive during C/C++ compilation and arrange for automatic
> > >>>>> > > >> linking of the OpenMP runtime library supplied by each
> > >>>>> > > >> compiler separately.
> > >>>>> > > >>
> > >>>>> > > >> Thus, to use the advantages of an OpenMP implementation one
> > >>>>> > > >> has to compile the code with the corresponding compiler.
> > >>>>> > > >>
> > >>>>> > > >> Currently, in MXNet CMake build scripts a bundled version of
> > >>>>> > > >> llvm OpenMP is used ([1] and [2]) to replace the OpenMP
> > >>>>> > > >> library supplied by the compiler.
> > >>>>> > > >>
> > >>>>> > > >> I will quote here the README from the MKL-DNN (Intel(R) Math
> > >>>>> > > >> Kernel Library for Deep Neural Networks):
> > >>>>> > > >>
> > >>>>> > > >> "Intel MKL-DNN uses OpenMP* for parallelism and requires an
> > >>>>> > > >> OpenMP runtime library to work. As different OpenMP runtimes
> > >>>>> > > >> may not be binary compatible it's important to ensure that
> > >>>>> > > >> only one OpenMP runtime is used throughout the application.
> > >>>>> > > >> Having more than one OpenMP runtime initialized may lead to
> > >>>>> > > >> undefined behavior resulting in incorrect results or
> > >>>>> > > >> crashes." [3]
> > >>>>> > > >>
> > >>>>> > > >> And:
> > >>>>> > > >>
> > >>>>> > > >> "Using GNU compiler with -fopenmp and -liomp5 options will
> > >>>>> > > >> link the application with both Intel and GNU OpenMP runtime
> > >>>>> > > >> libraries. This will lead to undefined behavior of the
> > >>>>> > > >> application." [4]
> > >>>>> > > >>
> > >>>>> > > >> As can be seen from ldd for MXNet:
> > >>>>> > > >>
> > >>>>> > > >> $ ldd build/tests/mxnet_unit_tests | grep omp
> > >>>>> > > >>     libomp.so => /.../mxnet/build/3rdparty/openmp/runtime/src/libomp.so (0x00007f697bc55000)
> > >>>>> > > >>     libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f69660cd000)
> > >>>>> > > >>
> > >>>>> > > >> *Performance*
> > >>>>> > > >>
> > >>>>> > > >> The only performance data related to OpenMP in MXNet I was
> > >>>>> > > >> able to find is here:
> > >>>>> > > >>
> > >>>>> > > >> https://github.com/apache/incubator-mxnet/issues/9744#issuecomment-367711172
> > >>>>> > > >>
> > >>>>> > > >> Which in my understanding is testing the impact of different
> > >>>>> > > >> environment variables for the same setup (using the same
> > >>>>> > > >> bundled OpenMP library).
> > >>>>> > > >>
> > >>>>> > > >> The libraries may differ in implementation and the Thread
> > >>>>> > > >> Affinity Interface [5] may have significant impact on
> > >>>>> > > >> performance.
> > >>>>> > > >>
> > >>>>> > > >> All compilers support it:
> > >>>>> > > >>
> > >>>>> > > >> - Clang / KMP_AFFINITY
> > >>>>> > > >> https://github.com/clang-ykt/openmp/blob/master/runtime/src/kmp_affinity.cpp
> > >>>>> > > >>
> > >>>>> > > >> - GCC / GOMP_CPU_AFFINITY
> > >>>>> > > >> https://gcc.gnu.org/onlinedocs/gcc-4.7.1/libgomp/GOMP_005fCPU_005fAFFINITY.html
> > >>>>> > > >>
> > >>>>> > > >> - Intel / KMP_AFFINITY
> > >>>>> > > >> https://software.intel.com/en-us/node/522689#6E24682E-F411-4AE3-A04D-ECD81C7008D1
> > >>>>> > > >>
> > >>>>> > > >> - Visual Studio / SetThreadAffinityMask
> > >>>>> > > >> https://docs.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-setthreadaffinitymask
> > >>>>> > > >>
> > >>>>> > > >> *Issues*
> > >>>>> > > >>
> > >>>>> > > >> Failed OpenMP assertion when loading MXNet compiled with DEBUG=1
> > >>>>> > > >> https://github.com/apache/incubator-mxnet/issues/10856
> > >>>>> > > >>
> > >>>>> > > >> libomp.so dependency (need REAL fix)
> > >>>>> > > >> https://github.com/apache/incubator-mxnet/issues/11417
> > >>>>> > > >>
> > >>>>> > > >> mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with MKL
> > >>>>> > > >> https://github.com/apache/incubator-mxnet/issues/8532
> > >>>>> > > >>
> > >>>>> > > >> Performance regression when OMP_NUM_THREADS environment variable is not set
> > >>>>> > > >> https://github.com/apache/incubator-mxnet/issues/9744
> > >>>>> > > >>
> > >>>>> > > >> Poor concat CPU performance on CUDA builds
> > >>>>> > > >> https://github.com/apache/incubator-mxnet/issues/11905
> > >>>>> > > >>
> > >>>>> > > >> I would appreciate hearing your thoughts.
> > >>>>> > > >>
> > >>>>> > > >>
> > >>>>> > > >> Best
> > >>>>> > > >> Anton
> > >>>>> > > >>
> > >>>>> > > >> [1] https://github.com/apache/incubator-mxnet/blob/master/CMakeLists.txt#L400-L405
> > >>>>> > > >> [2] https://github.com/apache/incubator-mxnet/tree/master/3rdparty
> > >>>>> > > >> [3] https://github.com/intel/mkl-dnn/blame/master/README.md#L261-L265
> > >>>>> > > >> [4] https://github.com/intel/mkl-dnn/blame/master/README.md#L278-L280
> > >>>>> > > >> [5] https://software.intel.com/en-us/node/522691
> > >>>>> > > >>
> > >>>>> > > >
> > >>>>> > >
> > >>>>> >
> > >>>>>
> > >>>>
>
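
The ldd output quoted in the original message shows both libomp.so and libgomp.so.1 linked into the same binary. Below is a minimal sketch of how that check could be automated. It is not part of MXNet or its build scripts; the function name openmp_runtimes and the list of runtime base names (libomp, libgomp, libiomp5) are assumptions chosen for illustration.

```python
import re

# Base names of common OpenMP runtimes: LLVM (libomp), GNU (libgomp),
# Intel (libiomp5). This list is an assumption and may be incomplete.
OPENMP_RUNTIME = re.compile(r"^(libomp|libgomp|libiomp5)\.so(\.\d+)*$")


def openmp_runtimes(ldd_output):
    """Return the sorted OpenMP runtime library names found in `ldd` output."""
    found = set()
    for line in ldd_output.splitlines():
        parts = line.split()
        # The first field of an ldd line is the library name (soname).
        if parts and OPENMP_RUNTIME.match(parts[0]):
            found.add(parts[0])
    return sorted(found)


# The two lines reported in the thread for build/tests/mxnet_unit_tests.
SAMPLE_LDD_OUTPUT = "\n".join([
    "    libomp.so => /.../mxnet/build/3rdparty/openmp/runtime/src/libomp.so (0x00007f697bc55000)",
    "    libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f69660cd000)",
])

if __name__ == "__main__":
    runtimes = openmp_runtimes(SAMPLE_LDD_OUTPUT)
    if len(runtimes) > 1:
        print("Multiple OpenMP runtimes linked:", ", ".join(runtimes))
```

In practice the input would come from running ldd on the built binary; more than one entry in the result is the situation the MKL-DNN README warns about.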
