Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
integrated in the DRM accel subsystem.

The new docs introduce QDA as a DRM/accel-based implementation of
Hexagon DSP offload that is intended as a modern alternative to the
legacy FastRPC driver in drivers/misc. The text describes the driver
motivation, high-level architecture and interaction with IOMMU context
banks, GEM-based buffer management and the RPMsg transport.

The user-space facing section documents the main QDA IOCTLs used to
establish DSP sessions, manage GEM buffer objects and invoke remote
procedures using the FastRPC protocol, along with a typical lifecycle
example for applications.

Finally, the driver is wired into the Compute Accelerators
documentation index under Documentation/accel, and a brief debugging
section shows how to enable dynamic debug for the QDA implementation.

Signed-off-by: Ekansh Gupta <[email protected]>
---
 Documentation/accel/index.rst     |   1 +
 Documentation/accel/qda/index.rst |  14 +++++
 Documentation/accel/qda/qda.rst   | 129 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 144 insertions(+)

diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
index cbc7d4c3876a..5901ea7f784c 100644
--- a/Documentation/accel/index.rst
+++ b/Documentation/accel/index.rst
@@ -10,4 +10,5 @@ Compute Accelerators
    introduction
    amdxdna/index
    qaic/index
+   qda/index
    rocket/index
diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
new file mode 100644
index 000000000000..bce188f21117
--- /dev/null
+++ b/Documentation/accel/qda/index.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==============================
+ accel/qda Qualcomm DSP Driver
+==============================
+
+The **accel/qda** driver provides support for Qualcomm Hexagon DSPs (Digital
+Signal Processors) within the DRM accelerator framework. It serves as a modern
+replacement for the legacy FastRPC driver, offering improved resource management
+and standard subsystem integration.
+
+.. toctree::
+
+   qda
diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
new file mode 100644
index 000000000000..742159841b95
--- /dev/null
+++ b/Documentation/accel/qda/qda.rst
@@ -0,0 +1,129 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==================================
+Qualcomm Hexagon DSP (QDA) Driver
+==================================
+
+Introduction
+============
+
+The **QDA** (Qualcomm DSP Accelerator) driver is a new DRM-based
+accelerator driver for Qualcomm's Hexagon DSPs. It provides a standardized
+interface for user-space applications to offload computational tasks, ranging
+from audio processing and sensor offload to computer vision and AI
+inference, to the Hexagon DSPs found on Qualcomm SoCs.
+
+This driver is designed to align with the Linux kernel's modern **Compute
+Accelerators** subsystem (``drivers/accel/``). It provides a modular
+alternative to the legacy FastRPC driver in ``drivers/misc/``, with
+improved resource management and better integration with standard kernel
+subsystems.
+
+Motivation
+==========
+
+The existing FastRPC implementation in the kernel utilizes a custom character
+device and lacks integration with modern kernel memory management frameworks.
+The QDA driver addresses these limitations by:
+
+1.  **Adopting the DRM accel Framework**: Leveraging standard uAPIs for device
+    management, job submission, and synchronization.
+2.  **Utilizing GEM for Memory**: Providing proper buffer object management,
+    including DMA-BUF import/export capabilities.
+3.  **Improving Isolation**: Using IOMMU context banks to enforce memory
+    isolation between different DSP user sessions.
+
+Key Features
+============
+
+*   **Standard Accelerator Interface**: Exposes a standard character device
+    node (e.g., ``/dev/accel/accel0``) via the DRM subsystem.
+*   **Unified Offload Support**: Supports all DSP domains (ADSP, CDSP, SDSP,
+    GDSP) via a single driver architecture.
+*   **FastRPC Protocol**: Implements the FastRPC remote procedure call
+    protocol for communication between the application processor
+    and DSP.
+*   **DMA-BUF Interop**: Seamless sharing of memory buffers between the DSP
+    and other multimedia subsystems (GPU, Camera, Video) via standard DMA-BUFs.
+*   **Modular Design**: Clean separation between the core DRM logic, the memory
+    manager, and the RPMsg-based transport layer.
+
+Architecture
+============
+
+The QDA driver is composed of several modular components:
+
+1.  **Core Driver (`qda_drv`)**: Manages device registration, file operations,
+    and bridges the driver with the DRM accelerator subsystem.
+2.  **Memory Manager (`qda_memory_manager`)**: A flexible memory management
+    layer that handles IOMMU context banks. It supports pluggable backends
+    (such as DMA-coherent) to adapt to different SoC memory architectures.
+3.  **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
+
+    * **`qda_gem`**: Core GEM object management, including allocation, mmap
+      operations, and buffer lifecycle management.
+    * **`qda_prime`**: PRIME import functionality for DMA-BUF interoperability,
+      enabling seamless buffer sharing with other kernel subsystems.
+
+4.  **Transport Layer (`qda_rpmsg`)**: Abstraction over the RPMsg framework
+    to handle low-level message passing with the DSP firmware.
+5.  **Compute Bus (`qda_compute_bus`)**: A custom virtual bus used to
+    enumerate and manage the specific compute context banks defined in the
+    device tree.
+6.  **FastRPC Core (`qda_fastrpc`)**: Implements the protocol logic for
+    marshalling arguments and handling remote invocations.
+
+User-Space API
+==============
+
+The driver exposes a set of DRM-compliant IOCTLs. Note that these are designed
+to be familiar to existing FastRPC users while adhering to DRM standards.
+
+*   ``DRM_IOCTL_QDA_QUERY``: Query DSP type (e.g., "cdsp", "adsp")
+    and capabilities.
+*   ``DRM_IOCTL_QDA_INIT_ATTACH``: Attach a user session to the DSP's protection
+    domain.
+*   ``DRM_IOCTL_QDA_INIT_CREATE``: Initialize a new process context on the DSP.
+*   ``DRM_IOCTL_QDA_INVOKE``: Submit a remote method invocation (the primary
+    execution unit).
+*   ``DRM_IOCTL_QDA_GEM_CREATE``: Allocate a GEM buffer object for DSP usage.
+*   ``DRM_IOCTL_QDA_GEM_MMAP_OFFSET``: Retrieve mmap offsets for memory mapping.
+*   ``DRM_IOCTL_QDA_MAP`` / ``DRM_IOCTL_QDA_MUNMAP``: Map or unmap buffers into the
+    DSP's virtual address space.
+
+Usage Example
+=============
+
+A typical lifecycle for a user-space application:
+
+1.  **Discovery**: Open ``/dev/accel/accel*`` and check
+    ``DRM_IOCTL_QDA_QUERY`` to find the desired DSP (e.g., CDSP for
+    compute workloads).
+2.  **Initialization**: Call ``DRM_IOCTL_QDA_INIT_ATTACH`` and
+    ``DRM_IOCTL_QDA_INIT_CREATE`` to establish a session.
+3.  **Memory**: Allocate buffers via ``DRM_IOCTL_QDA_GEM_CREATE`` or import
+    DMA-BUFs (PRIME fd) from other drivers using ``DRM_IOCTL_PRIME_FD_TO_HANDLE``.
+4.  **Execution**: Use ``DRM_IOCTL_QDA_INVOKE`` to pass arguments and execute
+    functions on the DSP.
+5.  **Cleanup**: Close file descriptors to automatically release resources and
+    detach the session.
+
+Internal Implementation
+=======================
+
+Memory Management
+-----------------
+The driver's memory manager creates virtual "IOMMU devices" that map to
+hardware context banks. This allows the driver to manage multiple isolated
+address spaces. The implementation currently uses a **DMA-coherent backend**
+to ensure data consistency between the CPU and DSP without manual cache
+maintenance in most cases.
+
+Debugging
+=========
+The driver includes extensive dynamic debug support. Enable it via the
+kernel's dynamic debug control:
+
+.. code-block:: bash
+
+    echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control

-- 
2.34.1
