Hi Tao,
The existing MXNet implementation doesn't support large tensors by default. For
now, creating an MXNet NDArray with more than 2^32 elements is only supported by
enabling a build flag. The purpose of this thread is to have the community
provide feedback on the design cwiki for *Large Tensor Support* in MXNet. The
intention is to make large tensor support a default feature in MXNet (in the
future) without any performance impact, so users do not have to build MXNet from
source.
-Rohit
On 5/18/19, 5:59 PM, "Lv, Tao A" <[email protected]> wrote:
Hi Rohit,
The existing MKL-DNN and its integration in MXNet should already support
*large tensor* which means the total number of elements (Prod(shape)) can
exceed INT_MAX. Feel free to let me know if you find any issues when using
MKL-DNN operators with large tensors.
For large dimension sizes (shape[x]), MKL-DNN is going to add support in its 1.0
release, which will be released in the middle of the year. But I'm not sure if
MXNet has a plan to support that.
Thanks,
-tao
-----Original Message-----
From: Srivastava, Rohit Kumar [mailto:[email protected]]
Sent: Sunday, May 19, 2019 7:23 AM
To: [email protected]
Subject: Re: [RFC] Support for creation of Large Tensors in MXNet
Hi Tao,
There are already a couple of operators implemented in MXNet that currently
support tensors with size over ~4.5 billion. In the meantime, core MXNet can
move ahead with providing initial support for such large tensors so MXNet
customers can start using it.
Good to hear MKLDNN will provide support for such cases. Do you have a
timeline as to when this feature will be released?
-Rohit
On 4/29/19, 7:18 PM, "Lv, Tao A" <[email protected]> wrote:
Thank you Lin! I would expect the current MKL-DNN implementation already
supports the scenario you mentioned here. This can be verified by this
issue: https://github.com/apache/incubator-mxnet/issues/13451
But as I said before, since we support flatten and reshape operators, it's
possible for users to convert a tensor with a large number of elements into a
tensor with a large dimension size, which may cause issues there.
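A minimal sketch of that case in plain Python (this is not MXNet or MKL-DNN
code, and the helper names `fits_int32` and `flatten_shape` are hypothetical):
each dimension of the 2-D shape fits in a signed 32-bit integer, but the single
dimension produced by flattening does not.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_int32(value):
    """Return True if value can be stored in a signed 32-bit integer."""
    return -2**31 <= value <= INT32_MAX


def flatten_shape(shape):
    """Return the 1-D shape obtained by flattening a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return (total,)


shape = (50_000, 50_000)      # every individual dimension is small
flat = flatten_shape(shape)   # one dimension holding all 2.5 billion elements

print(all(fits_int32(d) for d in shape))  # each original dimension fits
print(all(fits_int32(d) for d in flat))   # the flattened dimension does not
```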
To cover more cases, MKL-DNN is going to support INT64 dimension sizes
in its upcoming 1.0 major release.
-tao
-----Original Message-----
From: Lin Yuan [mailto:[email protected]]
Sent: Tuesday, April 30, 2019 12:56 AM
To: [email protected]
Subject: Re: [RFC] Support for creation of Large Tensors in MXNet
Tao,
- what's the max size of dimensionality? Which data type is used to
define dimensionality (ndims)?
We assume the max size of dimensionality is relatively small. Hence the
`int` data type is used to define ndim.
- what's the max size of each dimension? Which data type is used to
define dimension size (shape[x])?
Currently, we assume the max size of each dimension is not going to
exceed 2^31 in real applications. Hence the data type is `int32_t`.
- what's the max size of total elements? Which data type is used to
define element size (Prod(shape))?
We assume the total number of elements in a tensor can be larger than
2^32 in some applications, such as Deep Graph Library. We use the data type
`int64_t` to represent the total element size. Currently, due to a performance
regression in some operators (such as transpose), we use a compiler flag to
set this data type to `int32_t` by default. Once we have ways to mitigate the
performance regression, we will set the default data type to `int64_t`, which
is part of the effort in this project that Rohit proposed.
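To make the three quantities concrete, here is a rough sketch in plain Python
(this is not MXNet code; `describe_shape` and `wrap_int32` are hypothetical
helpers, and the shape is assumed to be non-empty). It reports ndim, the
largest shape[x], and Prod(shape) for a shape, and shows what the element count
would turn into if it were truncated to a signed 32-bit integer.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def wrap_int32(value):
    """Simulate truncating value to a signed 32-bit integer (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value


def describe_shape(shape):
    """Return ndim, the largest dimension, and the total element count."""
    num_elements = 1
    for dim in shape:
        num_elements *= dim
    return {
        "ndim": len(shape),            # dimensionality
        "max_dim": max(shape),         # largest shape[x]
        "num_elements": num_elements,  # Prod(shape)
    }


info = describe_shape((70_000, 70_000))
print(info["ndim"])                      # small, fits in `int`
print(info["max_dim"] <= INT32_MAX)      # each dimension fits in 32 bits
print(info["num_elements"] > INT32_MAX)  # the element count needs 64 bits
print(wrap_int32(info["num_elements"]))  # truncated 32-bit value is wrong
```

Here every dimension is far below 2^31, yet the element count of 4.9 billion
only survives in a 64-bit type.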
What is the plan in MKLDNN to support large tensors? We may want to
coordinate the progress, since many operators are using the MKLDNN
implementation on CPU now.
Many Thanks,
Lin
On Sun, Apr 28, 2019 at 7:52 PM Lv, Tao A <[email protected]> wrote:
> Thank you for bringing this topic to dev, Rohit.
>
> Regarding large tensor, can you articulate:
> - what's the max size of dimensionality? Which data type is used to
> define dimensionality (ndims)?
> - what's the max size of each dimension? Which data type is used to
> define dimension size (shape[x])?
> - what's the max size of total elements? Which data type is used to
> define element size (Prod(shape))?
>
> For me, any of these three can be *large*.
>
> -----Original Message-----
> From: Srivastava, Rohit Kumar
> [mailto:[email protected]]
> Sent: Saturday, April 27, 2019 7:33 AM
> To: [email protected]
> Subject: [RFC] Support for creation of Large Tensors in MXNet
>
> Dear Community,
>
> Currently MXNet supports creation of Tensors containing up to 2^32
> elements. However, there are cases where tensors of size over 5 billion
> are required.
>
> We plan to support creation of large tensors on MXNet. A design
> proposal is ready for review:
> https://cwiki.apache.org/confluence/display/MXNET/Large+Tensor+Support
>
> We will appreciate any help and feedback from the community.
>
> Thank you!
>
> Rohit
>