tqchen commented on code in PR #169: URL: https://github.com/apache/tvm-ffi/pull/169#discussion_r2442406276
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+Both ``ffi::Tensor`` and ``ffi::TensorView`` are designed to represent tensors in TVM FFI eco-system. The main difference is whether it is an owning tensor pointer.

Review Comment:
   represent tensors from ML frameworks that interact with the TVM FFI ABI.
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+Ideally, we expect the kernel function to have ``void`` as return type. However, if it is necessary to return the ``ffi::Tensor`` anyway, please pay attention to convert the output ``ffi::Tensor`` to original tensor type at Python side, like ``torch.from_dlpack``.

Review Comment:
   The FFI automatically converts the returned tensor to ``torch.Tensor`` when the input contains torch arguments.
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+Tensor is the most important input for a kernel libaray. In PyTorch C++ extensions, kernel library usually takes ``at::Tensor`` as tensor input. In TVM FFI, we introduce two types of tensor, ``ffi::Tensor`` and ``ffi::TensorView``.

Review Comment:
   In TVM FFI, we support two types of tensor constructs, ``ffi::Tensor`` and ``ffi::TensorView``, that can be used to represent a tensor from machine learning frameworks.
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+TVM FFI can automatically convert the input tensor at Python side, e.g. ``torch.Tensor``, to both ``ffi::Tensor`` or ``ffi::TensorView`` at C++ side, depends on the C++ function arguments. However, for more flexibility and better compatibility, we **recommand** to use ``ffi::TensorView`` in practice. Since some frameworks, like JAX, cannot provide strong referenced tensor, as ``ffi::Tensor`` expected.

Review Comment:
   XLA Buffer is a weak reference; we can "fake" a strong reference by `pass_owning_tensor=True` now, but the tensor cannot be retained or returned.
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+TVM FFI maintains the stream context per device type and index. Use ``TVMFFIEnvGetStream`` to get the current stream on device:

Review Comment:
   When calling from frameworks like PyTorch, `TVMFFIEnvGetStream` will return the stream under the current torch stream context.
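The per-device stream context described in the comment above can be pictured with a small self-contained model. This is only an illustration of the mechanism, not the actual tvm-ffi implementation; the names `EnvSetStream`/`EnvGetStream` and the `std::map`-backed table are hypothetical stand-ins for the real C ABI.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Illustrative model only: a per-thread stream registry keyed by
// (device_type, device_index), mimicking how an environment stream
// context such as the one behind TVMFFIEnvGetStream could be kept.
using StreamHandle = void*;

inline std::map<std::pair<int32_t, int32_t>, StreamHandle>& StreamTable() {
  thread_local std::map<std::pair<int32_t, int32_t>, StreamHandle> table;
  return table;
}

// Framework side: install the current stream for a device and return the
// previous handle so a stream guard can restore it on scope exit.
inline StreamHandle EnvSetStream(int32_t device_type, int32_t device_index,
                                 StreamHandle stream) {
  StreamHandle& slot = StreamTable()[{device_type, device_index}];
  StreamHandle prev = slot;
  slot = stream;
  return prev;
}

// Kernel side: query the current stream; nullptr means the default stream.
inline StreamHandle EnvGetStream(int32_t device_type, int32_t device_index) {
  auto it = StreamTable().find({device_type, device_index});
  return it == StreamTable().end() ? nullptr : it->second;
}
```

In this model, a torch stream guard would call `EnvSetStream` on entry and restore the returned previous handle on exit, which is how the "current torch stream context" mentioned in the review comment becomes visible to the kernel.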
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+``ffi::TensorView`` is a light weight non-owning tnesor pointer, pointeing to a TVM FFI tensor or external tensor object. TVM FFI does not retain its reference. So users are responsible for ensuring the lifetime of tensor object to which the ``ffi::TensorView`` points.

Review Comment:
   non-owning view of an existing tensor. It is backed by the DLTensor structure in DLPack.
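The "backed by DLTensor" point above can be illustrated with a stripped-down, self-contained sketch. The real `DLTensor` comes from `dlpack.h` (with additional `device`, `dtype`, and `byte_offset` fields) and the real `ffi::TensorView` is richer; `MiniDLTensor`/`MiniTensorView` below are simplified stand-ins for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the DLTensor struct from dlpack.h,
// trimmed to the fields this sketch needs.
struct MiniDLTensor {
  void* data;
  int32_t ndim;
  int64_t* shape;
  int64_t* strides;  // may be nullptr for a compact row-major layout
};

// Illustrative non-owning view: it copies the DLTensor descriptor but
// never takes ownership of the underlying buffer, mirroring the
// ffi::TensorView contract that the caller keeps the tensor alive.
class MiniTensorView {
 public:
  explicit MiniTensorView(const MiniDLTensor& t) : tensor_(t) {}
  int32_t dim() const { return tensor_.ndim; }
  int64_t size(int32_t i) const { return tensor_.shape[i]; }
  int64_t numel() const {
    int64_t n = 1;
    for (int32_t i = 0; i < tensor_.ndim; ++i) n *= tensor_.shape[i];
    return n;
  }
  void* data_ptr() const { return tensor_.data; }

 private:
  MiniDLTensor tensor_;  // descriptor only; no strong reference is held
};

// Demo helper: a static 2x3 buffer viewed without ownership transfer.
inline MiniTensorView MakeDemoView() {
  static float data[6] = {};
  static int64_t shape[2] = {2, 3};
  static MiniDLTensor t{data, 2, shape, nullptr};
  return MiniTensorView(t);
}
```

Because the view stores only the descriptor, destroying it leaves the buffer untouched, which is exactly why the guide warns that users must ensure the lifetime of the tensor a `ffi::TensorView` points to.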
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+TVM FFI can automatically convert the input tensor at Python side, e.g. ``torch.Tensor``, to both ``ffi::Tensor`` or ``ffi::TensorView`` at C++ side, depends on the C++ function arguments. However, for more flexibility and better compatibility, we **recommand** to use ``ffi::TensorView`` in practice. Since some frameworks, like JAX, cannot provide strong referenced tensor, as ``ffi::Tensor`` expected.

Review Comment:
   we **recommend** to use ``ffi::TensorView`` when possible; that helps us support more cases, including cases where only a view but not a strong reference is passed. It is also more lightweight.

########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+This guide serves as a quick start for composing a kernel from scratch, or migrating a kernel from externel frameworks. It covers the core concepts in TVM FFI, such as tensor, stream.

Review Comment:
   quick start for shipping Python-version- and framework-agnostic kernel libraries with TVM FFI
########## docs/guides/kernel_library_guide.rst: ##########
@@ -0,0 +1,146 @@
+:FromEnvAlloc:
+  For the case of using external tensor allocator, like``at::empty`` in PyTorch C++ extensions, ``FromEnvAlloc`` is the better choice. Besides of the basic attributes like shape, data type and device, it requires a thread-local environmental allocator ``TVMFFIEnvTensorAlloc``. ``TVMFFIEnvTensorAlloc`` gets the global tensor allocator in the current context. The context can be switched based on the arguments of the kernel.

Review Comment:
   I agree, let us move `FromEnvAlloc` first. We encourage `FromEnvAlloc` when possible, since it makes use of the allocator from the framework tensor allocator. Give an example code.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
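As a starting point for the example code requested in the `FromEnvAlloc` review comment, here is a minimal self-contained model of the environment-allocator idea: the host framework installs an allocator for the current thread, and library code allocates through whatever is installed. The names `SetEnvAllocator` and `EnvAllocate` are hypothetical, not the actual tvm-ffi C API (`TVMFFIEnvTensorAlloc` in the guide).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <utility>

// Illustrative model of an environment allocator: one allocator
// callback per thread, defaulting to plain malloc.
using EnvAllocator = std::function<void*(std::size_t nbytes)>;

inline EnvAllocator& CurrentAllocator() {
  thread_local EnvAllocator alloc = [](std::size_t n) { return std::malloc(n); };
  return alloc;
}

// Framework side: swap in its own allocator (e.g. one backed by the
// torch caching allocator), returning the previous one for restoration.
inline EnvAllocator SetEnvAllocator(EnvAllocator a) {
  EnvAllocator prev = CurrentAllocator();
  CurrentAllocator() = std::move(a);
  return prev;
}

// Library side: allocate through whatever the environment provides,
// the way FromEnvAlloc defers to the installed tensor allocator.
inline void* EnvAllocate(std::size_t nbytes) { return CurrentAllocator()(nbytes); }
```

The context-switching behavior the guide mentions falls out naturally: whichever allocator the framework installed for the current thread before the kernel call is the one the kernel's allocations go through.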
