gemini-code-assist[bot] commented on code in PR #372: URL: https://github.com/apache/tvm-ffi/pull/372#discussion_r2655785173
########## docs/concepts/tensor.rst: ########## @@ -0,0 +1,486 @@ +.. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +Tensor and DLPack +================= + +At runtime, TVM-FFI often needs to accept tensors from many sources: + +* Frameworks (e.g. PyTorch, JAX) via :py:meth:`array_api.array.__dlpack__`; +* C/C++ callers passing :c:struct:`DLTensor* <DLTensor>`; +* Tensors allocated by a library but managed by TVM-FFI itself. + +TVM-FFI standardizes on **DLPack as the lingua franca**: tensors are +built on top of DLPack structs with additional C++ convenience methods +and minimal extensions for ownership management. + +.. tip:: + + Prefer :cpp:class:`tvm::ffi::TensorView` or :cpp:class:`tvm::ffi::Tensor` in C++ code; + they provide safer and more convenient abstractions over raw DLPack structs. + + +This tutorial is organized as follows: + +* **Tensor Classes**: introduces what tensor types are provided, and which one you should use. +* **Conversion between TVMFFIAny**: how tensors flow across ABI boundaries. +* **Tensor APIs**: the most important tensor APIs you will use, including allocation and stream handling. + +Glossary +-------- + +DLPack + A cross-library tensor interchange standard defined in the small C header ``dlpack.h``. 
+ It defines pure C data structures for describing n-dimensional arrays and their memory layout, + including :c:struct:`DLTensor`, :c:struct:`DLManagedTensorVersioned`, :c:struct:`DLDataType`, + :c:struct:`DLDevice`, and related types. + +View (non-owning) + A "header" that describes a tensor but does not own its memory. When a consumer + receives a view, it must respect that the producer owns the underlying storage and controls its + lifetime. The view is valid only while the producer guarantees it remains valid. + +Managed object (owning) + An object that includes lifetime management, using reference counting or a cleanup callback + mechanism. This establishes a contract between producer and consumer about when the consumer's ownership ends. + +.. note:: + + As a loose analogy, think of **view** vs. **managed** as similar to + ``T*`` (raw pointer) vs. ``std::shared_ptr<T>`` (reference-counted pointer) in C++. + +Tensor Classes +-------------- + +This section defines each tensor type you will encounter in the TVM-FFI C++ API and explains the +*intended* usage. Exact C layout details are covered later in :ref:`layout-and-conversion`. + +.. tip:: + + On the Python side, only :py:class:`tvm_ffi.Tensor` exists. It strictly follows DLPack semantics for interop and can be converted to PyTorch via :py:func:`torch.from_dlpack`. + + +DLPack Tensors +~~~~~~~~~~~~~~ + +DLPack tensors come in two main flavors: + +*Non-owning* object, :c:struct:`DLTensor` + The tensor descriptor is a **view** of the underlying data. + It describes the device the tensor lives on, its shape, dtype, and data pointer. It does not own the underlying data. + +*Owning* object, :c:struct:`DLManagedTensorVersioned`, or its legacy counterpart :c:struct:`DLManagedTensor` + It is a **managed** variant that wraps a :c:struct:`DLTensor` descriptor with additional fields. 
+ Notably, it includes a ``deleter`` callback that releases ownership when the consumer is done with the tensor, + and an opaque ``manager_ctx`` handle used by the producer to store additional context. + +TVM-FFI Tensors +~~~~~~~~~~~~~~~ + +Similarly, TVM-FFI defines two main tensor types in C++: + +*Non-owning* object, :cpp:class:`tvm::ffi::TensorView` + A thin C++ wrapper around :c:struct:`DLTensor` for inspecting metadata and accessing the data pointer. + It is designed for **kernel authors** to inspect metadata and access the underlying data pointer during a call, + without taking ownership of the tensor's memory. Being a **view** also means you must ensure the backing tensor remains valid while you use it. + +*Owning* object, :cpp:class:`tvm::ffi::TensorObj` and :cpp:class:`tvm::ffi::Tensor` + :cpp:class:`Tensor <tvm::ffi::Tensor>`, similar to ``std::shared_ptr<TensorObj>``, is the managed class to hold heap-allocated + :cpp:class:`TensorObj <tvm::ffi::TensorObj>`. Once the reference count drops to zero, the cleanup logic deallocates the descriptor + and releases ownership of the underlying data buffer. + + +.. note:: + + - For handwritten C++, always use TVM-FFI tensors over DLPack's raw C tensors. + + - For compiler development, DLPack's raw C tensors are recommended because C is easier to target from codegen. + +The owning :cpp:class:`Tensor <tvm::ffi::Tensor>` is the recommended interface for passing around managed tensors. +Use owning tensors when you need one or more of the following: + +* return a tensor from a function across ABI, which will be converted to :cpp:class:`tvm::ffi::Any`; +* allocate an output tensor as the producer, and hand it to a kernel consumer; +* store a tensor in a long-lived object. + +.. 
admonition:: :cpp:class:`TensorObj <tvm::ffi::TensorObj>` vs :cpp:class:`Tensor <tvm::ffi::Tensor>` + :class: hint + + :cpp:class:`Tensor <tvm::ffi::Tensor>` is an intrusive pointer of a heap-allocated :cpp:class:`TensorObj <tvm::ffi::TensorObj>`. + As an analogy to ``std::shared_ptr``, think of + + .. code-block:: cpp + + using Tensor = std::shared_ptr<TensorObj>; + + You can convert between the two types: + + - :cpp:func:`Tensor::get() <tvm::ffi::Tensor::get>` converts it to :cpp:class:`TensorObj* <tvm::ffi::TensorObj>`. + - :cpp:func:`GetRef\<Tensor\> <tvm::ffi::GetRef>` converts a :cpp:class:`TensorObj* <tvm::ffi::TensorObj>` back to :cpp:class:`Tensor <tvm::ffi::Tensor>`. + +.. _layout-and-conversion: + +Tensor Layouts +~~~~~~~~~~~~~~ + +:ref:`Figure 1 <fig:layout-tensor>` summarizes the layout relationships among DLPack tensors and TVM-FFI tensors. +All tensor classes are POD-like; :cpp:class:`tvm::ffi::TensorObj` is also a standard TVM-FFI object, typically +heap-allocated and reference-counted. + +.. figure:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/tvm-ffi/tensor-layout.png + :alt: Layout of DLPack Tensors and TVM-FFI Tensors + :align: center + :name: fig:layout-tensor + + Figure 1. Layout specification of DLPack tensors and TVM-FFI tensors. All the tensor types share :c:struct:`DLTensor` as the common descriptor, while carrying different metadata and ownership semantics. + +As demonstrated in the figure, all tensor classes share :c:struct:`DLTensor` as the common descriptor. +In particular, + +- :c:struct:`DLTensor` and :cpp:class:`TensorView <tvm::ffi::TensorView>` share the exact same memory layout. +- :c:struct:`DLManagedTensorVersioned` and :cpp:class:`TensorObj <tvm::ffi::TensorObj>` both have a deleter + callback to manage the lifetime of the underlying data buffer, while :c:struct:`DLTensor` and :cpp:class:`TensorView <tvm::ffi::TensorView>` do not. 
+- Compared with :cpp:class:`TensorView <tvm::ffi::TensorView>`, :cpp:class:`TensorObj <tvm::ffi::TensorObj>` + has an extra TVM-FFI object header, making it reference-countable via the standard managed reference :cpp:class:`Tensor <tvm::ffi::Tensor>`. + +What Tensor is not +~~~~~~~~~~~~~~~~~~ + +TVM-FFI is not a tensor library. While it presents a unified representation for tensors, +it does not provide any of the following: + +* kernels, such as vector addition, matrix multiplication; +* host-device copy or synchronization primitives; +* advanced indexing or slicing; +* automatic differentiation or computational graph support. + +Conversion between :cpp:class:`TVMFFIAny` +----------------------------------------- + +At the stable C ABI boundary, TVM-FFI passes values using an "Any-like" carrier, often referred +to as :cpp:class:`Any <tvm::ffi::Any>` (owning) or :cpp:class:`AnyView <tvm::ffi::AnyView>` (non-owning). +These are 128-bit tagged unions derived from :cpp:class:`TVMFFIAny` that contain: + +* a :cpp:member:`type_index <TVMFFIAny::type_index>` that indicates the type of the payload, and +* a union payload that may contain: + + * A1. Primitive values, such as integers, floats, enums, raw pointers, or + * A2. TVM-FFI object handles, which are reference-counted pointers. + +Specifically for tensors stored in :cpp:class:`Any <tvm::ffi::Any>` or :cpp:class:`AnyView <tvm::ffi::AnyView>`, +there are two possible representations: + +* Non-owning views as A1 (primitive values), i.e. :c:struct:`DLTensor* <DLTensor>` whose type index is :cpp:enumerator:`TVMFFITypeIndex::kTVMFFIDLTensorPtr`. +* Owning objects as A2 (TVM-FFI tensor object handles), i.e., :cpp:class:`TensorObj* <tvm::ffi::TensorObj>` whose type index is :cpp:enumerator:`TVMFFITypeIndex::kTVMFFITensor`. 
+ +Therefore, when you see a tensor in :cpp:class:`Any <tvm::ffi::Any>` or :cpp:class:`AnyView <tvm::ffi::AnyView>`, +first check its :cpp:member:`type_index <TVMFFIAny::type_index>` to determine whether it is a raw pointer or an object handle +before converting it to the desired tensor type. + +.. important:: + + As a rule of thumb, an owning object can be converted to a non-owning view, but not vice versa. + +To Non-Owning Tensor +~~~~~~~~~~~~~~~~~~~~ + +This converts an owning :cpp:class:`Any <tvm::ffi::Any>` or non-owning :cpp:class:`AnyView <tvm::ffi::AnyView>` into a non-owning tensor. +Two type indices can be converted to a non-owning tensor view: + +- :cpp:enumerator:`TVMFFITypeIndex::kTVMFFIDLTensorPtr`: the payload is a raw pointer :c:struct:`DLTensor* <DLTensor>`. +- :cpp:enumerator:`TVMFFITypeIndex::kTVMFFITensor`: the payload is a TVM-FFI tensor object handle, from which you can extract the underlying :c:struct:`DLTensor` according to the layout defined in :ref:`Figure 1 <fig:layout-tensor>`. + +The snippets below are plain C (C99-compatible) and assume the TVM-FFI C ABI definitions from +``tvm/ffi/c_api.h`` are available. + +.. code-block:: cpp + + // Converts Any/AnyView to DLTensor* + int AnyToDLTensorView(const TVMFFIAny* value, DLTensor** out) { + if (value->type_index == kTVMFFIDLTensorPtr) { + *out = (DLTensor*)value->v_ptr; + return SUCCESS; + } + if (value->type_index == kTVMFFITensor) { + // See Figure 1 for layout of tvm::ffi::TensorObj + TVMFFIObject* obj = value->v_obj; + *out = (DLTensor*)((char*)obj + sizeof(TVMFFIObject)); + return SUCCESS; + } + return FAILURE; + } + +:cpp:class:`TensorView <tvm::ffi::TensorView>` can be constructed directly from the returned :c:struct:`DLTensor* <DLTensor>`. + +To Owning Tensor +~~~~~~~~~~~~~~~~ + +This converts an owning :cpp:class:`Any <tvm::ffi::Any>` or non-owning :cpp:class:`AnyView <tvm::ffi::AnyView>` into an owning :cpp:class:`TensorObj <tvm::ffi::TensorObj>`. 
Only type index :cpp:enumerator:`TVMFFITypeIndex::kTVMFFITensor` can be converted to an owning tensor because it contains a TVM-FFI tensor object handle. The conversion involves incrementing the reference count to take ownership. + +.. code-block:: cpp + + // Converts Any/AnyView to TensorObj* + int AnyToOwnedTensor(const TVMFFIAny* value, TVMFFIObjectHandle* out) { + if (value->type_index == kTVMFFITensor) { + *out = (TVMFFIObjectHandle)value->v_obj; + return SUCCESS; + } + return FAILURE; + } + +The caller can obtain shared ownership by calling :cpp:func:`TVMFFIObjectIncRef` on the returned handle, +and later release it with :cpp:func:`TVMFFIObjectDecRef`. + +From Owning Tensor +~~~~~~~~~~~~~~~~~~ + +This converts an owning :cpp:class:`TensorObj <tvm::ffi::TensorObj>` to an owning :cpp:class:`Any <tvm::ffi::Any>` or non-owning :cpp:class:`AnyView <tvm::ffi::AnyView>`. It sets the type index to :cpp:enumerator:`TVMFFITypeIndex::kTVMFFITensor` and stores the tensor object handle in the payload. + +.. code-block:: cpp + + // Converts TensorObj* to AnyView + int TensorToAnyView(TVMFFIObjectHandle tensor, TVMFFIAny* out_any_view) { + out_any_view->type_index = kTVMFFITensor; + out_any_view->zero_padding = 0; + out_any_view->v_obj = (TVMFFIObject*)tensor; + return SUCCESS; + } + + // Converts TensorObj* to Any + int TensorToAny(TVMFFIObjectHandle tensor, TVMFFIAny* out_any) { + TVMFFIAny any_view; + int ret = TensorToAnyView(tensor, &any_view); + if (ret != SUCCESS) { + return ret; + } + TVMFFIObjectIncRef(tensor); + *out_any = any_view; + return SUCCESS; + } + +The C API :cpp:func:`TVMFFIObjectIncRef` obtains shared ownership of the tensor into `out_any`. Later, release it with +:cpp:func:`TVMFFIObjectDecRef` on its :cpp:member:`TVMFFIAny::v_obj` field. + +From Non-Owning Tensor +~~~~~~~~~~~~~~~~~~~~~~ + +This converts a non-owning :cpp:class:`TensorView <tvm::ffi::TensorView>` to non-owning :cpp:class:`AnyView <tvm::ffi::AnyView>`. 
+It sets the type index to :cpp:enumerator:`TVMFFITypeIndex::kTVMFFIDLTensorPtr` and stores a raw pointer to :c:struct:`DLTensor* <DLTensor>` in the payload.
+
+.. warning::
+
+   Non-owning :c:struct:`DLTensor` or :cpp:class:`TensorView <tvm::ffi::TensorView>` can be converted to non-owning :cpp:class:`AnyView <tvm::ffi::AnyView>`, but cannot be converted to owning :cpp:class:`Any <tvm::ffi::Any>`.
+
+.. code-block:: cpp
+
+   // Converts DLTensor* to AnyView
+   int DLTensorToAnyView(DLTensor* tensor, TVMFFIAny* out) {
+     if (!tensor || !out) {
+       return -1;
+     }
+     out->type_index = kTVMFFIDLTensorPtr;
+     out->zero_padding = 0;
+     out->v_ptr = tensor;
+     return 0;
+   }

Review Comment:
The return values `0` and `-1` in this code snippet are inconsistent with other example functions in this document, which use `SUCCESS` and `FAILURE`. For consistency, it would be better to use `SUCCESS` and `FAILURE` here as well.

```suggestion
   int DLTensorToAnyView(DLTensor* tensor, TVMFFIAny* out) {
     if (!tensor || !out) {
       return FAILURE;
     }
     out->type_index = kTVMFFIDLTensorPtr;
     out->zero_padding = 0;
     out->v_ptr = tensor;
     return SUCCESS;
   }
```
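
For anyone who wants to try the suggested form in isolation, here is a minimal, self-contained sketch. It makes several assumptions that are not taken from the PR: `SUCCESS` and `FAILURE` are defined as `0` and `-1` (the quoted snippets use the names without showing definitions), and `DLTensor`, `TVMFFIAny`, and `kTVMFFIDLTensorPtr` are simplified, hypothetical stand-ins rather than the real definitions from `dlpack.h` and `tvm/ffi/c_api.h`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values: the quoted snippets use these names without defining them. */
#define SUCCESS 0
#define FAILURE (-1)

/* Hypothetical stand-ins so the sketch compiles on its own.
 * Real code would use the definitions from dlpack.h and tvm/ffi/c_api.h. */
typedef struct {
  void* data; /* only field needed for this illustration */
} DLTensor;

enum { kTVMFFIDLTensorPtr = 7 }; /* placeholder value for illustration only */

typedef struct {
  int32_t type_index;    /* tag describing the payload */
  uint32_t zero_padding; /* cleared when the payload is a raw pointer */
  void* v_ptr;           /* simplified: the real struct stores this in a union */
} TVMFFIAny;

/* Converts DLTensor* to a non-owning AnyView, reporting errors by name. */
int DLTensorToAnyView(DLTensor* tensor, TVMFFIAny* out) {
  if (!tensor || !out) {
    return FAILURE;
  }
  out->type_index = kTVMFFIDLTensorPtr;
  out->zero_padding = 0;
  out->v_ptr = tensor; /* no ownership is taken; the caller keeps the tensor alive */
  return SUCCESS;
}
```

With named return codes, callers can write `if (DLTensorToAnyView(&t, &view) != SUCCESS)` in the same style as the other examples in the document.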
