Please note that this is a patch release (1.3.1) to address critical bugs. For everything else, please wait for 1.4.0, which is planned shortly after 1.3.1.
> On Nov 6, 2018, at 7:17 AM, Anton Chernov <[email protected]> wrote:
>
> The following PRs have been created so far:
>
> Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13117
>
> [MXNET-953] Fix oob memory read (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13118
>
> [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13119
>
> [MXNET-922] Fix memleak in profiler (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13120
>
> Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13121
>
> update mshadow (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13122
>
> CudnnFind() usage improvements (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13123
>
> Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13124
>
>
> As stated previously, I would rather be opposed to having the following PRs in the patch release:
>
> Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> https://github.com/apache/incubator-mxnet/pull/13129
>
> sample_like operators (#13034) v1.3.x
> https://github.com/apache/incubator-mxnet/pull/13130
>
>
> Best
> Anton
>
> On Tue, Nov 6, 2018 at 16:06, Anton Chernov <[email protected]> wrote:
>
>> Hi Haibin,
>>
>> I have a few comments regarding the proposed performance improvement changes.
>>
>> CUDNN support for LSTM with projection & clipping
>> https://github.com/apache/incubator-mxnet/pull/13056
>>
>> There is no doubt that this change brings value, but I don't see it as a critical bug fix. I would rather leave it for the next major release.
>>
>> sample_like operators
>> https://github.com/apache/incubator-mxnet/pull/13034
>>
>> Even if it's related to performance, this is an addition of functionality, and I would also push this to the next major release only.
>>
>>
>> Best
>> Anton
>>
>>
>> On Tue, Nov 6, 2018 at 15:55, Anton Chernov <[email protected]> wrote:
>>
>>> Hi Patric,
>>>
>>> This change was listed in the 'PR candidates suggested for consideration for v1.3.1 patch release' section [1].
>>>
>>> You are right, I also think that this is not a critical hotfix change that should be included in the 1.3.1 patch release.
>>>
>>> Thus I'm not making any further efforts to bring it in.
>>>
>>> Best
>>> Anton
>>>
>>> [1] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>>>
>>>
>>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric <[email protected]> wrote:
>>>
>>>> Hi Anton,
>>>>
>>>> Thanks for looking into the MKL-DNN PR.
>>>>
>>>> As I understand the cwiki (https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release), these features will go into 1.4 rather than the 1.3.1 patch release.
>>>>
>>>> Feel free to correct me :)
>>>>
>>>> Thanks,
>>>>
>>>> --Patric
>>>>
>>>>> -----Original Message-----
>>>>> From: Anton Chernov [mailto:[email protected]]
>>>>> Sent: Tuesday, November 6, 2018 3:11 AM
>>>>> To: [email protected]
>>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release
>>>>>
>>>>> It seems that there is a problem porting the following changes to the v1.3.x release branch:
>>>>>
>>>>> Implement mkldnn convolution fusion and quantization
>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>
>>>>> MKL-DNN Quantization Examples and README
>>>>> https://github.com/apache/incubator-mxnet/pull/12808
>>>>>
>>>>> The base branches have diverged.
>>>>>
>>>>> I would need help from the authors of these changes to make a backport PR.
>>>>>
>>>>> @ZhennanQin, @xinyu-intel, would you be able to assist me and create the corresponding PRs?
>>>>>
>>>>> Without proper history and domain knowledge I would not be able to create them on my own in a reasonable amount of time, I'm afraid.
>>>>>
>>>>> Best regards,
>>>>> Anton
>>>>>
>>>>> On Mon, Nov 5, 2018 at 19:45, Anton Chernov <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>> As part of:
>>>>>>
>>>>>> Implement mkldnn convolution fusion and quantization
>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>
>>>>>> I propose to add the examples and documentation PR as well:
>>>>>>
>>>>>> MKL-DNN Quantization Examples and README
>>>>>> https://github.com/apache/incubator-mxnet/pull/12808
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Anton
>>>>>>
>>>>>> On Mon, Nov 5, 2018 at 19:02, Anton Chernov <[email protected]> wrote:
>>>>>>
>>>>>>> Dear MXNet community,
>>>>>>>
>>>>>>> I will be the release manager for the upcoming 1.3.1 patch release. Naveen will be co-managing the release and providing help from the committers' side.
>>>>>>>
>>>>>>> The following dates have been set:
>>>>>>>
>>>>>>> Code Freeze: 31st October 2018
>>>>>>> Release published: 13th November 2018
>>>>>>>
>>>>>>> Release notes have been drafted here [1].
>>>>>>>
>>>>>>>
>>>>>>> * Known issues
>>>>>>>
>>>>>>> Update MKL-DNN dependency
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12953
>>>>>>>
>>>>>>> This PR hasn't been merged even to master yet. It requires additional discussion and a merge.
>>>>>>>
>>>>>>> distributed kvstore bug in MXNet
>>>>>>> https://github.com/apache/incubator-mxnet/issues/12713
>>>>>>>
>>>>>>>> When distributed kvstore is used, by default gluon.Trainer doesn't work with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more specific, the trainer updates once per GPU, while the LRScheduler object is shared across GPUs and gets a wrong update count.
>>>>>>>
>>>>>>> This needs to be fixed.
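[Editorial note: the quoted LRScheduler issue can be sketched in plain Python, with no MXNet dependency. All class and function names below are illustrative stand-ins, not MXNet's actual API; the point is only the counting behaviour: a scheduler whose update count is meant to advance once per optimization step is advanced once per GPU instead, so the learning rate decays num_gpus times too fast.]

```python
class LRScheduler:
    """Halves the learning rate every `step` updates (illustrative)."""

    def __init__(self, base_lr=0.1, step=10):
        self.base_lr = base_lr
        self.step = step
        self.num_update = 0

    def __call__(self):
        self.num_update += 1
        return self.base_lr * (0.5 ** (self.num_update // self.step))


def train(num_steps, num_gpus):
    # One scheduler object shared across all GPU updates, mimicking the
    # reported gluon.Trainer behaviour with distributed kvstore.
    sched = LRScheduler()
    lr = None
    for _ in range(num_steps):
        for _gpu in range(num_gpus):  # trainer updates once per GPU
            lr = sched()
    return sched.num_update, lr


# With 1 GPU the update count matches the number of steps; with 4 GPUs
# the shared scheduler is advanced 4x too fast, so the LR decays early.
print(train(10, 1))  # (10, 0.05)
print(train(10, 4))  # (40, 0.00625)
```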
>>>>>>> [6]
>>>>>>>
>>>>>>>
>>>>>>> * Changes
>>>>>>>
>>>>>>> The following changes will be ported to the release branch, per [2]:
>>>>>>>
>>>>>>> Infer dtype in SymbolBlock import from input symbol [3]
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12412
>>>>>>>
>>>>>>> [MXNET-953] Fix oob memory read
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12631
>>>>>>>
>>>>>>> [MXNET-969] Fix buffer overflow in RNNOp
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12603
>>>>>>>
>>>>>>> [MXNET-922] Fix memleak in profiler
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12499
>>>>>>>
>>>>>>> Implement mkldnn convolution fusion and quantization (MXNet Graph Optimization and Quantization based on subgraph and MKL-DNN proposal [4])
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>>
>>>>>>> The following items (test cases) should already be part of 1.3.0:
>>>>>>>
>>>>>>> [MXNET-486] Create CPP test for concat MKLDNN operator
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11371
>>>>>>>
>>>>>>> [MXNET-489] MKLDNN Pool test
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11608
>>>>>>>
>>>>>>> [MXNET-484] MKLDNN C++ test for LRN operator
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11831
>>>>>>>
>>>>>>> [MXNET-546] Add unit test for MKLDNNSum
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11272
>>>>>>>
>>>>>>> [MXNET-498] Test MKLDNN backward operators
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11232
>>>>>>>
>>>>>>> [MXNET-500] Test cases improvement for MKLDNN on Gluon
>>>>>>> https://github.com/apache/incubator-mxnet/pull/10921
>>>>>>>
>>>>>>> Set correct update on kvstore flag in dist_device_sync mode (as part of fixing [5])
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12786
>>>>>>>
>>>>>>> upgrade mshadow version
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12692
>>>>>>> But another PR will be used instead:
>>>>>>> update mshadow
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12674
>>>>>>>
>>>>>>> CudnnFind() usage improvements
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12804
>>>>>>> A critical CUDNN fix that reduces GPU memory consumption and addresses this memory leak issue. This is an important fix to include in 1.3.1.
>>>>>>>
>>>>>>>
>>>>>>> From the discussion about gluon toolkits:
>>>>>>>
>>>>>>> disable opencv threading for forked process
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12025
>>>>>>>
>>>>>>> Fix lazy record io when used with dataloader and multi_worker > 0
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12554
>>>>>>>
>>>>>>> fix potential floating number overflow, enable float16
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12118
>>>>>>>
>>>>>>>
>>>>>>> * Resolved issues
>>>>>>>
>>>>>>> MxNet 1.2.1 - module get_outputs()
>>>>>>> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
>>>>>>>
>>>>>>> As far as I can see from the comments, the issue has been resolved; no actions need to be taken for this release. [7] is mentioned in this regard, but I don't see any action points here either.
>>>>>>>
>>>>>>>
>>>>>>> With Naveen's help, I will start porting the mentioned PRs to the v1.3.x branch.
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Anton
>>>>>>>
>>>>>>> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
>>>>>>> [2] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>>>>>> [3] https://github.com/apache/incubator-mxnet/issues/11849
>>>>>>> [4] https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
>>>>>>> [5] https://github.com/apache/incubator-mxnet/issues/12713
>>>>>>> [6] https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
>>>>>>> [7] https://github.com/apache/incubator-mxnet/pull/11005
