Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver integrated in the DRM accel subsystem.
The new docs introduce QDA as a DRM/accel-based implementation of Hexagon
DSP offload that is intended as a modern alternative to the legacy FastRPC
driver in drivers/misc. The text describes the driver motivation,
high-level architecture and interaction with IOMMU context banks,
GEM-based buffer management and the RPMsg transport.

The user-space facing section documents the main QDA IOCTLs used to
establish DSP sessions, manage GEM buffer objects and invoke remote
procedures using the FastRPC protocol, along with a typical lifecycle
example for applications.

Finally, the driver is wired into the Compute Accelerators documentation
index under Documentation/accel, and a brief debugging section shows how
to enable dynamic debug for the QDA implementation.

Signed-off-by: Ekansh Gupta <[email protected]>
---
 Documentation/accel/index.rst     |   1 +
 Documentation/accel/qda/index.rst |  14 +++++
 Documentation/accel/qda/qda.rst   | 129 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 144 insertions(+)

diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
index cbc7d4c3876a..5901ea7f784c 100644
--- a/Documentation/accel/index.rst
+++ b/Documentation/accel/index.rst
@@ -10,4 +10,5 @@ Compute Accelerators
    introduction
    amdxdna/index
    qaic/index
+   qda/index
    rocket/index
diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
new file mode 100644
index 000000000000..bce188f21117
--- /dev/null
+++ b/Documentation/accel/qda/index.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==============================
+ accel/qda Qualcomm DSP Driver
+==============================
+
+The **accel/qda** driver provides support for Qualcomm Hexagon DSPs (Digital
+Signal Processors) within the DRM accelerator framework. It serves as a modern
+replacement for the legacy FastRPC driver, offering improved resource management
+and standard subsystem integration.
+
+.. toctree::
+
+   qda
diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
new file mode 100644
index 000000000000..742159841b95
--- /dev/null
+++ b/Documentation/accel/qda/qda.rst
@@ -0,0 +1,129 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==================================
+Qualcomm Hexagon DSP (QDA) Driver
+==================================
+
+Introduction
+============
+
+The **QDA** (Qualcomm DSP Accelerator) driver is a new DRM-based
+accelerator driver for Qualcomm's Hexagon DSPs. It provides a standardized
+interface for user-space applications to offload computational tasks ranging
+from audio processing and sensor offload to computer vision and AI
+inference to the Hexagon DSPs found on Qualcomm SoCs.
+
+This driver is designed to align with the Linux kernel's modern **Compute
+Accelerators** subsystem (`drivers/accel/`), providing a robust and modular
+alternative to the legacy FastRPC driver in `drivers/misc/`, offering
+improved resource management and better integration with standard kernel
+subsystems.
+
+Motivation
+==========
+
+The existing FastRPC implementation in the kernel utilizes a custom character
+device and lacks integration with modern kernel memory management frameworks.
+The QDA driver addresses these limitations by:
+
+1. **Adopting the DRM accel Framework**: Leveraging standard uAPIs for device
+   management, job submission, and synchronization.
+2. **Utilizing GEM for Memory**: Providing proper buffer object management,
+   including DMA-BUF import/export capabilities.
+3. **Improving Isolation**: Using IOMMU context banks to enforce memory
+   isolation between different DSP user sessions.
+
+Key Features
+============
+
+* **Standard Accelerator Interface**: Exposes a standard character device
+  node (e.g., `/dev/accel/accel0`) via the DRM subsystem.
+* **Unified Offload Support**: Supports all DSP domains (ADSP, CDSP, SDSP,
+  GDSP) via a single driver architecture.
+* **FastRPC Protocol**: Implements the reliable Remote Procedure Call
+  (FastRPC) protocol for communication between the application processor
+  and DSP.
+* **DMA-BUF Interop**: Seamless sharing of memory buffers between the DSP
+  and other multimedia subsystems (GPU, Camera, Video) via standard DMA-BUFs.
+* **Modular Design**: Clean separation between the core DRM logic, the memory
+  manager, and the RPMsg-based transport layer.
+
+Architecture
+============
+
+The QDA driver is composed of several modular components:
+
+1. **Core Driver (`qda_drv`)**: Manages device registration, file operations,
+   and bridges the driver with the DRM accelerator subsystem.
+2. **Memory Manager (`qda_memory_manager`)**: A flexible memory management
+   layer that handles IOMMU context banks. It supports pluggable backends
+   (such as DMA-coherent) to adapt to different SoC memory architectures.
+3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
+
+   * **`qda_gem`**: Core GEM object management, including allocation, mmap
+     operations, and buffer lifecycle management.
+   * **`qda_prime`**: PRIME import functionality for DMA-BUF interoperability,
+     enabling seamless buffer sharing with other kernel subsystems.
+
+4. **Transport Layer (`qda_rpmsg`)**: Abstraction over the RPMsg framework
+   to handle low-level message passing with the DSP firmware.
+5. **Compute Bus (`qda_compute_bus`)**: A custom virtual bus used to
+   enumerate and manage the specific compute context banks defined in the
+   device tree.
+6. **FastRPC Core (`qda_fastrpc`)**: Implements the protocol logic for
+   marshalling arguments and handling remote invocations.
+
+User-Space API
+==============
+
+The driver exposes a set of DRM-compliant IOCTLs. Note that these are designed
+to be familiar to existing FastRPC users while adhering to DRM standards.
+
+* `DRM_IOCTL_QDA_QUERY`: Query DSP type (e.g., "cdsp", "adsp")
+  and capabilities.
+* `DRM_IOCTL_QDA_INIT_ATTACH`: Attach a user session to the DSP's protection
+  domain.
+* `DRM_IOCTL_QDA_INIT_CREATE`: Initialize a new process context on the DSP.
+* `DRM_IOCTL_QDA_INVOKE`: Submit a remote method invocation (the primary
+  execution unit).
+* `DRM_IOCTL_QDA_GEM_CREATE`: Allocate a GEM buffer object for DSP usage.
+* `DRM_IOCTL_QDA_GEM_MMAP_OFFSET`: Retrieve mmap offsets for memory mapping.
+* `DRM_IOCTL_QDA_MAP` / `DRM_IOCTL_QDA_MUNMAP`: Map or unmap buffers into the
+  DSP's virtual address space.
+
+Usage Example
+=============
+
+A typical lifecycle for a user-space application:
+
+1. **Discovery**: Open `/dev/accel/accel*` and check
+   `DRM_IOCTL_QDA_QUERY` to find the desired DSP (e.g., CDSP for
+   compute workloads).
+2. **Initialization**: Call `DRM_IOCTL_QDA_INIT_ATTACH` and
+   `DRM_IOCTL_QDA_INIT_CREATE` to establish a session.
+3. **Memory**: Allocate buffers via `DRM_IOCTL_QDA_GEM_CREATE` or import
+   DMA-BUFs (PRIME fd) from other drivers using `DRM_IOCTL_PRIME_FD_TO_HANDLE`.
+4. **Execution**: Use `DRM_IOCTL_QDA_INVOKE` to pass arguments and execute
+   functions on the DSP.
+5. **Cleanup**: Close file descriptors to automatically release resources and
+   detach the session.
+
+Internal Implementation
+=======================
+
+Memory Management
+-----------------
+The driver's memory manager creates virtual "IOMMU devices" that map to
+hardware context banks. This allows the driver to manage multiple isolated
+address spaces. The implementation currently uses a **DMA-coherent backend**
+to ensure data consistency between the CPU and DSP without manual cache
+maintenance in most cases.
+
+Debugging
+=========
+The driver includes extensive dynamic debug support. Enable it via the
+kernel's dynamic debug control:
+
+.. code-block:: bash
+
+   echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control
-- 
2.34.1
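
The Discovery step of the usage example in the patch above could be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from the patch: the helper name `count_openable_nodes` is hypothetical, and because the patch text does not show the argument layout of `DRM_IOCTL_QDA_QUERY`, the sketch stops at opening each matching node and only marks where the query would be issued.

```c
#include <assert.h>
#include <fcntl.h>
#include <glob.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): expand a glob pattern such
 * as "/dev/accel/accel*" and try to open every match read/write.
 *
 * Returns the number of nodes that could be opened, 0 if nothing matched,
 * or -1 if glob() failed for any other reason.
 */
int count_openable_nodes(const char *pattern)
{
    glob_t matches;
    int opened = 0;
    int rc;

    assert(pattern != NULL);

    rc = glob(pattern, 0, NULL, &matches);
    if (rc != 0) {
        globfree(&matches);
        return rc == GLOB_NOMATCH ? 0 : -1;
    }

    for (size_t i = 0; i < matches.gl_pathc; i++) {
        int fd = open(matches.gl_pathv[i], O_RDWR);

        if (fd < 0) {
            /* Node exists but is not accessible; report and move on. */
            perror(matches.gl_pathv[i]);
            continue;
        }

        /*
         * A real client would issue DRM_IOCTL_QDA_QUERY on fd here to
         * learn which DSP this node represents. The request structure is
         * not shown in the patch, so it is deliberately left out.
         */
        printf("opened %s\n", matches.gl_pathv[i]);
        opened++;
        close(fd);
    }

    globfree(&matches);
    return opened;
}
```

Calling `count_openable_nodes("/dev/accel/accel*")` returns how many accel nodes the calling process could open. A real client would keep the descriptor of the node whose query result names the desired DSP instead of closing it.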
