Please note that this is a patch release (1.3.1) to address critical bugs. For everything else, please wait for 1.4.0, which is planned shortly after 1.3.1.
> On Nov 6, 2018, at 7:17 AM, Anton Chernov <[email protected]> wrote:
>
> The following PRs have been created so far:
>
> Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13117
>
> [MXNET-953] Fix oob memory read (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13118
>
> [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13119
>
> [MXNET-922] Fix memleak in profiler (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13120
>
> Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13121
>
> update mshadow (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13122
>
> CudnnFind() usage improvements (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13123
>
> Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
> https://github.com/apache/incubator-mxnet/pull/13124
>
>
> As stated previously, I would rather be opposed to having the following PRs in the patch release:
>
> Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> https://github.com/apache/incubator-mxnet/pull/13129
>
> sample_like operators (#13034) v1.3.x
> https://github.com/apache/incubator-mxnet/pull/13130
>
>
> Best
> Anton
>
> On Tue, Nov 6, 2018 at 16:06, Anton Chernov <[email protected]> wrote:
>
>> Hi Haibin,
>>
>> I have a few comments regarding the proposed performance improvement changes.
>>
>> CUDNN support for LSTM with projection & clipping
>> https://github.com/apache/incubator-mxnet/pull/13056
>>
>> There is no doubt that this change brings value, but I don't see it as a critical bug fix. I would rather leave it for the next major release.
>>
>> sample_like operators
>> https://github.com/apache/incubator-mxnet/pull/13034
>>
>> Even if it's related to performance, this is an addition of functionality, and I would also push this to the next major release only.
>>
>>
>> Best
>> Anton
>>
>>
>> On Tue, Nov 6, 2018 at 15:55, Anton Chernov <[email protected]> wrote:
>>
>>> Hi Patric,
>>>
>>> This change was listed in the 'PR candidates suggested for consideration for v1.3.1 patch release' section [1].
>>>
>>> You are right, I also think that this is not a critical hotfix change that should be included in the 1.3.1 patch release.
>>>
>>> Thus I'm not making any further efforts to bring it in.
>>>
>>> Best
>>> Anton
>>>
>>> [1] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>>>
>>>
>>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric <[email protected]> wrote:
>>>
>>>> Hi Anton,
>>>>
>>>> Thanks for looking into the MKL-DNN PR.
>>>>
>>>> As I understand the cwiki (https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release), these features will go into 1.4 rather than the 1.3.1 patch release.
>>>>
>>>> Feel free to correct me :)
>>>>
>>>> Thanks,
>>>>
>>>> --Patric
>>>>
>>>>> -----Original Message-----
>>>>> From: Anton Chernov [mailto:[email protected]]
>>>>> Sent: Tuesday, November 6, 2018 3:11 AM
>>>>> To: [email protected]
>>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release
>>>>>
>>>>> It seems that there is a problem porting the following changes to the v1.3.x release branch:
>>>>>
>>>>> Implement mkldnn convolution fusion and quantization
>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>
>>>>> MKL-DNN Quantization Examples and README
>>>>> https://github.com/apache/incubator-mxnet/pull/12808
>>>>>
>>>>> The base branches have diverged.
>>>>>
>>>>> I would need help from the authors of these changes to make a backport PR.
>>>>>
>>>>> @ZhennanQin, @xinyu-intel, would you be able to assist me and create the corresponding PRs?
>>>>>
>>>>> Without proper history and domain knowledge I would not be able to create them on my own in a reasonable amount of time, I'm afraid.
>>>>>
>>>>> Best regards,
>>>>> Anton
>>>>>
>>>>> On Mon, Nov 5, 2018 at 19:45, Anton Chernov <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>> As part of:
>>>>>>
>>>>>> Implement mkldnn convolution fusion and quantization
>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>
>>>>>> I propose to add the examples and documentation PR as well:
>>>>>>
>>>>>> MKL-DNN Quantization Examples and README
>>>>>> https://github.com/apache/incubator-mxnet/pull/12808
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Anton
>>>>>>
>>>>>> On Mon, Nov 5, 2018 at 19:02, Anton Chernov <[email protected]> wrote:
>>>>>>
>>>>>>> Dear MXNet community,
>>>>>>>
>>>>>>> I will be the release manager for the upcoming 1.3.1 patch release. Naveen will be co-managing the release and providing help from the committers' side.
>>>>>>>
>>>>>>> The following dates have been set:
>>>>>>>
>>>>>>> Code Freeze: 31st October 2018
>>>>>>> Release published: 13th November 2018
>>>>>>>
>>>>>>> Release notes have been drafted here [1].
>>>>>>>
>>>>>>>
>>>>>>> * Known issues
>>>>>>>
>>>>>>> Update MKL-DNN dependency
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12953
>>>>>>>
>>>>>>> This PR hasn't been merged even to master yet. It requires additional discussion and a merge.
>>>>>>>
>>>>>>> distributed kvstore bug in MXNet
>>>>>>> https://github.com/apache/incubator-mxnet/issues/12713
>>>>>>>
>>>>>>>> When distributed kvstore is used, by default gluon.Trainer doesn't work with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be more specific, the trainer updates once per GPU, while the LRScheduler object is shared across GPUs and gets a wrong update count.
>>>>>>>
>>>>>>> This needs to be fixed.
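[Editorial note: the quoted LRScheduler issue can be sketched in plain Python, with no MXNet dependency. All class and function names below are illustrative stand-ins, not MXNet's actual API; the point is only the counting behaviour: a scheduler whose update count is meant to advance once per optimization step is advanced once per GPU instead, so the learning rate decays num_gpus times too fast.]

```python
class LRScheduler:
    """Halves the learning rate every `step` updates (illustrative)."""

    def __init__(self, base_lr=0.1, step=10):
        self.base_lr = base_lr
        self.step = step
        self.num_update = 0

    def __call__(self):
        self.num_update += 1
        return self.base_lr * (0.5 ** (self.num_update // self.step))


def train(num_steps, num_gpus):
    # One scheduler object shared across all GPU updates, mimicking the
    # reported gluon.Trainer behaviour with distributed kvstore.
    sched = LRScheduler()
    lr = None
    for _ in range(num_steps):
        for _gpu in range(num_gpus):  # trainer updates once per GPU
            lr = sched()
    return sched.num_update, lr


# With 1 GPU the update count matches the number of steps; with 4 GPUs
# the shared scheduler is advanced 4x too fast, so the LR decays early.
print(train(10, 1))  # (10, 0.05)
print(train(10, 4))  # (40, 0.00625)
```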
>>>>>>> [6]
>>>>>>>
>>>>>>>
>>>>>>> * Changes
>>>>>>>
>>>>>>> The following changes will be ported to the release branch, per [2]:
>>>>>>>
>>>>>>> Infer dtype in SymbolBlock import from input symbol [3]
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12412
>>>>>>>
>>>>>>> [MXNET-953] Fix oob memory read
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12631
>>>>>>>
>>>>>>> [MXNET-969] Fix buffer overflow in RNNOp
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12603
>>>>>>>
>>>>>>> [MXNET-922] Fix memleak in profiler
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12499
>>>>>>>
>>>>>>> Implement mkldnn convolution fusion and quantization (MXNet Graph Optimization and Quantization based on subgraph and MKL-DNN proposal [4])
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>>
>>>>>>> The following items (test cases) should already be part of 1.3.0:
>>>>>>>
>>>>>>> [MXNET-486] Create CPP test for concat MKLDNN operator
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11371
>>>>>>>
>>>>>>> [MXNET-489] MKLDNN Pool test
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11608
>>>>>>>
>>>>>>> [MXNET-484] MKLDNN C++ test for LRN operator
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11831
>>>>>>>
>>>>>>> [MXNET-546] Add unit test for MKLDNNSum
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11272
>>>>>>>
>>>>>>> [MXNET-498] Test MKLDNN backward operators
>>>>>>> https://github.com/apache/incubator-mxnet/pull/11232
>>>>>>>
>>>>>>> [MXNET-500] Test cases improvement for MKLDNN on Gluon
>>>>>>> https://github.com/apache/incubator-mxnet/pull/10921
>>>>>>>
>>>>>>> Set correct update on kvstore flag in dist_device_sync mode (as part of fixing [5])
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12786
>>>>>>>
>>>>>>> upgrade mshadow version
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12692
>>>>>>> But another PR will be used instead:
>>>>>>> update mshadow
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12674
>>>>>>>
>>>>>>> CudnnFind() usage improvements
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12804
>>>>>>> A critical CUDNN fix that reduces GPU memory consumption and addresses this memory leak issue. This is an important fix to include in 1.3.1.
>>>>>>>
>>>>>>>
>>>>>>> From the discussion about gluon toolkits:
>>>>>>>
>>>>>>> disable opencv threading for forked process
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12025
>>>>>>>
>>>>>>> Fix lazy record io when used with dataloader and multi_worker > 0
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12554
>>>>>>>
>>>>>>> fix potential floating number overflow, enable float16
>>>>>>> https://github.com/apache/incubator-mxnet/pull/12118
>>>>>>>
>>>>>>>
>>>>>>> * Resolved issues
>>>>>>>
>>>>>>> MxNet 1.2.1 - module get_outputs()
>>>>>>> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
>>>>>>>
>>>>>>> As far as I can see from the comments, the issue has been resolved; no actions need to be taken for this release. [7] is mentioned in this regard, but I don't see any action points here either.
>>>>>>>
>>>>>>>
>>>>>>> With Naveen's help, I will start porting the mentioned PRs to the v1.3.x branch.
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Anton
>>>>>>>
>>>>>>> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
>>>>>>> [2] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>>>>>> [3] https://github.com/apache/incubator-mxnet/issues/11849
>>>>>>> [4] https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
>>>>>>> [5] https://github.com/apache/incubator-mxnet/issues/12713
>>>>>>> [6] https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
>>>>>>> [7] https://github.com/apache/incubator-mxnet/pull/11005
