Hi Sheng,

Thanks for your suggestions. Personally, I would not rush a new major release, as in my opinion this breaks the pace and creates unnecessary pressure.
If the changes suggested by Haibin are really important, then I think we can consider them for the minor release, even if they are not, strictly speaking, *bugfixes*. Do you think that might be an option?

And did I understand correctly that you are suggesting:

[MXNET-1179] Enforce deterministic algorithms in convolution layers
https://github.com/apache/incubator-mxnet/pull/12992

for the 1.3.1 release?

Best
Anton

Wed, Nov 7, 2018 at 0:59, Sheng Zha <[email protected]>:

> Similar to the two PRs that Haibin suggested, 12992 introduces new
> interface for controlling determinism, which is better suited for minor
> release.
>
> I think other than lack of release manager to drive 1.4.0 release, there’s
> no reason we cannot do two releases (1.4.0 & 1.3.1) at the same time. I’m
> willing to help with the 1.4.0 release to make these new features available
> one month sooner, if there’s no other concern.
>
> -sz
>
> > On Nov 6, 2018, at 3:30 PM, Lin Yuan <[email protected]> wrote:
> >
> > Hi Anton,
> >
> > Thanks for helping the release.
> > The following PRs are needed by customers who want to use deterministic
> > CUDNN convolution algorithms:
> >
> > https://github.com/apache/incubator-mxnet/pull/12992
> > https://github.com/apache/incubator-mxnet/pull/13049
> >
> > Thanks!
> >
> > Lin
> >
> > On Tue, Nov 6, 2018 at 1:51 PM Aaron Markham <[email protected]>
> > wrote:
> >
> >> Hi Anton,
> >> I have the following suggestions for fixes to include in 1.3.1.
> >> These each have updates to files that will impact docs generation for the 1.3.x
> >> version of the website's Python API docs:
> >>
> >> https://github.com/apache/incubator-mxnet/pull/12879
> >> https://github.com/apache/incubator-mxnet/pull/12871
> >> https://github.com/apache/incubator-mxnet/pull/12856
> >>
> >> Thanks,
> >> Aaron
> >>
> >>> On Tue, Nov 6, 2018 at 1:29 PM Lai Wei <[email protected]> wrote:
> >>>
> >>> Hi Anton,
> >>>
> >>> Thanks for driving this, I would like to include the following fix in
> >>> 1.3.1:
> >>> Allow infer shape partial on foreach operator:
> >>> https://github.com/apache/incubator-mxnet/pull/12471
> >>>
> >>> Keras-MXNet needs this functionality to infer shape partially
> >>> on foreach operator. (Used in RNN operators)
> >>>
> >>> Thanks a lot!
> >>>
> >>> Best Regards
> >>> Lai Wei
> >>>
> >>> On Tue, Nov 6, 2018 at 10:44 AM Haibin Lin <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi Naveen and Anton,
> >>>>
> >>>> Thanks for pointing that out. You are right that these are not critical
> >>>> fixes. Putting them in 1.4.0 is more appropriate. PRs are closed.
> >>>>
> >>>> Best,
> >>>> Haibin
> >>>>
> >>>> On Tue, Nov 6, 2018 at 7:35 AM Naveen Swamy <[email protected]> wrote:
> >>>>
> >>>>> Please note that this is a patch release (1.3.1) to address critical bugs!
> >>>>> For everything else, please wait for 1.4.0, which is planned very shortly
> >>>>> after 1.3.1.
> >>>>>
> >>>>>> On Nov 6, 2018, at 7:17 AM, Anton Chernov <[email protected]> wrote:
> >>>>>>
> >>>>>> The following PR's have been created so far:
> >>>>>>
> >>>>>> Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13117
> >>>>>>
> >>>>>> [MXNET-953] Fix oob memory read (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13118
> >>>>>>
> >>>>>> [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13119
> >>>>>>
> >>>>>> [MXNET-922] Fix memleak in profiler (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13120
> >>>>>>
> >>>>>> Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13121
> >>>>>>
> >>>>>> update mshadow (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13122
> >>>>>>
> >>>>>> CudnnFind() usage improvements (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13123
> >>>>>>
> >>>>>> Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13124
> >>>>>>
> >>>>>> As stated previously, I would be rather opposed to having the following PR's
> >>>>>> in the patch release:
> >>>>>>
> >>>>>> Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13129
> >>>>>>
> >>>>>> sample_like operators (#13034) v1.3.x
> >>>>>> https://github.com/apache/incubator-mxnet/pull/13130
> >>>>>>
> >>>>>> Best
> >>>>>> Anton
> >>>>>>
> >>>>>> Tue, Nov 6, 2018 at 16:06, Anton Chernov <[email protected]>:
> >>>>>>
> >>>>>>> Hi Haibin,
> >>>>>>>
> >>>>>>> I have a few comments regarding the proposed performance improvement
> >>>>>>> changes.
> >>>>>>>
> >>>>>>> CUDNN support for LSTM with projection & clipping
> >>>>>>> https://github.com/apache/incubator-mxnet/pull/13056
> >>>>>>>
> >>>>>>> There is no doubt that this change brings value, but I don't see it as a
> >>>>>>> critical bug fix. I would rather leave it for the next major release.
> >>>>>>>
> >>>>>>> sample_like operators
> >>>>>>> https://github.com/apache/incubator-mxnet/pull/13034
> >>>>>>>
> >>>>>>> Even if it's related to performance, this is an addition of functionality
> >>>>>>> and I would also push this to be in the next major release only.
> >>>>>>>
> >>>>>>> Best
> >>>>>>> Anton
> >>>>>>>
> >>>>>>> Tue, Nov 6, 2018 at 15:55, Anton Chernov <[email protected]>:
> >>>>>>>
> >>>>>>>> Hi Patric,
> >>>>>>>>
> >>>>>>>> This change was listed in the 'PR candidates suggested for consideration
> >>>>>>>> for v1.3.1 patch release' section [1].
> >>>>>>>>
> >>>>>>>> You are right, I also think that this is not a critical hotfix change
> >>>>>>>> that should be included into the 1.3.1 patch release.
> >>>>>>>>
> >>>>>>>> Thus I'm not making any further efforts to bring it in.
> >>>>>>>>
> >>>>>>>> Best
> >>>>>>>> Anton
> >>>>>>>>
> >>>>>>>> [1] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> >>>>>>>>
> >>>>>>>> Tue, Nov 6, 2018 at 1:14, Zhao, Patric <[email protected]>:
> >>>>>>>>
> >>>>>>>>> Hi Anton,
> >>>>>>>>>
> >>>>>>>>> Thanks for looking into the MKL-DNN PR.
> >>>>>>>>>
> >>>>>>>>> As my understanding of cwiki (
> >>>>>>>>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> >>>>>>>>> ),
> >>>>>>>>> these features will go into 1.4 rather than patch release of 1.3.1.
> >>>>>>>>>
> >>>>>>>>> Feel free to correct me :)
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> --Patric
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Anton Chernov [mailto:[email protected]]
> >>>>>>>>>> Sent: Tuesday, November 6, 2018 3:11 AM
> >>>>>>>>>> To: [email protected]
> >>>>>>>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
> >>>>>>>>>> release
> >>>>>>>>>>
> >>>>>>>>>> It seems that there is a problem porting following changes to the v1.3.x
> >>>>>>>>>> release branch:
> >>>>>>>>>>
> >>>>>>>>>> Implement mkldnn convolution fusion and quantization
> >>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
> >>>>>>>>>>
> >>>>>>>>>> MKL-DNN Quantization Examples and README
> >>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12808
> >>>>>>>>>>
> >>>>>>>>>> The bases are different.
> >>>>>>>>>>
> >>>>>>>>>> I would need help from authors of these changes to make a backport PR.
> >>>>>>>>>>
> >>>>>>>>>> @ZhennanQin, @xinyu-intel would you be able to assist me and create the
> >>>>>>>>>> corresponding PR's?
> >>>>>>>>>>
> >>>>>>>>>> Without proper history and domain knowledge I would not be able to create
> >>>>>>>>>> them by my own in reasonable amount of time, I'm afraid.
> >>>>>>>>>>
> >>>>>>>>>> Best regards,
> >>>>>>>>>> Anton
> >>>>>>>>>>
> >>>>>>>>>> Mon, Nov 5, 2018 at 19:45, Anton Chernov <[email protected]>:
> >>>>>>>>>>
> >>>>>>>>>>> As part of:
> >>>>>>>>>>>
> >>>>>>>>>>> Implement mkldnn convolution fusion and quantization
> >>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
> >>>>>>>>>>>
> >>>>>>>>>>> I propose to add the examples and documentation PR as well:
> >>>>>>>>>>>
> >>>>>>>>>>> MKL-DNN Quantization Examples and README
> >>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12808
> >>>>>>>>>>>
> >>>>>>>>>>> Best regards,
> >>>>>>>>>>> Anton
> >>>>>>>>>>>
> >>>>>>>>>>> Mon, Nov 5, 2018 at 19:02, Anton Chernov <[email protected]>:
> >>>>>>>>>>>
> >>>>>>>>>>>> Dear MXNet community,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I will be the release manager for the upcoming 1.3.1 patch release.
> >>>>>>>>>>>> Naveen will be co-managing the release and providing help from the
> >>>>>>>>>>>> committers side.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The following dates have been set:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Code Freeze: 31st October 2018
> >>>>>>>>>>>> Release published: 13th November 2018
> >>>>>>>>>>>>
> >>>>>>>>>>>> Release notes have been drafted here [1].
> >>>>>>>>>>>>
> >>>>>>>>>>>> * Known issues
> >>>>>>>>>>>>
> >>>>>>>>>>>> Update MKL-DNN dependency
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12953
> >>>>>>>>>>>>
> >>>>>>>>>>>> This PR hasn't been merged even to master yet. Requires additional
> >>>>>>>>>>>> discussion and merge.
> >>>>>>>>>>>>
> >>>>>>>>>>>> distributed kvstore bug in MXNet
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/issues/12713
> >>>>>>>>>>>>
> >>>>>>>>>>>>> When distributed kvstore is used, by default gluon.Trainer doesn't
> >>>>>>>>>>>>> work with mx.optimizer.LRScheduler if a worker has more than 1 GPU.
> >>>>>>>>>>>>> To be more specific, the trainer updates once per GPU, the LRScheduler
> >>>>>>>>>>>>> object is shared across GPUs and get a wrong update count.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This needs to be fixed. [6]
> >>>>>>>>>>>>
> >>>>>>>>>>>> * Changes
> >>>>>>>>>>>>
> >>>>>>>>>>>> The following changes will be ported to the release branch, per [2]:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Infer dtype in SymbolBlock import from input symbol [3]
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12412
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-953] Fix oob memory read
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12631
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-969] Fix buffer overflow in RNNOp
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12603
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-922] Fix memleak in profiler
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12499
> >>>>>>>>>>>>
> >>>>>>>>>>>> Implement mkldnn convolution fusion and quantization (MXNet Graph
> >>>>>>>>>>>> Optimization and Quantization based on subgraph and MKL-DNN proposal
> >>>>>>>>>>>> [4])
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
> >>>>>>>>>>>>
> >>>>>>>>>>>> Following items (test cases) should be already part of 1.3.0:
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-486] Create CPP test for concat MKLDNN operator
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11371
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-489] MKLDNN Pool test
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11608
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-484] MKLDNN C++ test for LRN operator
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11831
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-546] Add unit test for MKLDNNSum
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11272
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-498] Test MKLDNN backward operators
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11232
> >>>>>>>>>>>>
> >>>>>>>>>>>> [MXNET-500] Test cases improvement for MKLDNN on Gluon
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/10921
> >>>>>>>>>>>>
> >>>>>>>>>>>> Set correct update on kvstore flag in dist_device_sync mode (as part
> >>>>>>>>>>>> of fixing [5])
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12786
> >>>>>>>>>>>>
> >>>>>>>>>>>> upgrade mshadow version
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12692
> >>>>>>>>>>>> But another PR will be used instead:
> >>>>>>>>>>>> update mshadow
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12674
> >>>>>>>>>>>>
> >>>>>>>>>>>> CudnnFind() usage improvements
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12804
> >>>>>>>>>>>> A critical CUDNN fix that reduces GPU memory consumption and
> >>>>>>>>>>>> addresses this memory leak issue. This is an important fix to include
> >>>>>>>>>>>> in 1.3.1
> >>>>>>>>>>>>
> >>>>>>>>>>>> From discussion about gluon toolkits:
> >>>>>>>>>>>>
> >>>>>>>>>>>> disable opencv threading for forked process
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12025
> >>>>>>>>>>>>
> >>>>>>>>>>>> Fix lazy record io when used with dataloader and multi_worker > 0
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12554
> >>>>>>>>>>>>
> >>>>>>>>>>>> fix potential floating number overflow, enable float16
> >>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12118
> >>>>>>>>>>>>
> >>>>>>>>>>>> * Resolved issues
> >>>>>>>>>>>>
> >>>>>>>>>>>> MxNet 1.2.1–module get_outputs()
> >>>>>>>>>>>> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
> >>>>>>>>>>>>
> >>>>>>>>>>>> As far as I can see from the comments the issue has been resolved, no
> >>>>>>>>>>>> actions need to be taken for this release.
> >>>>>>>>>>>> [7] is mentioned in this regard, but I don't see any action points
> >>>>>>>>>>>> here either.
> >>>>>>>>>>>>
> >>>>>>>>>>>> With Naveen's help, I will start porting the mentioned PR's to the
> >>>>>>>>>>>> 1.3.x branch.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best regards,
> >>>>>>>>>>>> Anton
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
> >>>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> >>>>>>>>>>>> [3] https://github.com/apache/incubator-mxnet/issues/11849
> >>>>>>>>>>>> [4] https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
> >>>>>>>>>>>> [5] https://github.com/apache/incubator-mxnet/issues/12713
> >>>>>>>>>>>> [6] https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
> >>>>>>>>>>>> [7] https://github.com/apache/incubator-mxnet/pull/11005
