Private bug reported:

Compute Express Link (CXL) enables shared access to memory and
accelerators across multiple hosts, particularly in disaggregated and
composable infrastructure environments. As multiple initiators (hosts,
CPUs, accelerators) contend for shared CXL resources (memory devices,
switches, and fabrics), ensuring predictable performance becomes
critical.

Quality of Service (QoS) in CXL fabrics provides mechanisms to manage
bandwidth allocation, latency prioritization, and fairness across
competing workloads. This includes traffic classification,
prioritization, throttling, and congestion management across CXL.io,
CXL.cache, and CXL.mem protocols.
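
As an illustration of the throttling and congestion-management idea above,
here is a minimal sketch of a host-side feedback loop. It assumes the device
reports one of four load levels, modeled on the DevLoad indication described
by the CXL QoS telemetry mechanism (light load, optimal load, moderate
overload, severe overload). The names DevLoad, next_rate and run, and all
step sizes, are hypothetical and chosen for illustration only. This is not
kernel code and not an algorithm taken from the specification.

```python
from enum import IntEnum


class DevLoad(IntEnum):
    """Device load levels (assumed encoding, modeled on CXL QoS telemetry)."""
    LIGHT = 0
    OPTIMAL = 1
    MODERATE_OVERLOAD = 2
    SEVERE_OVERLOAD = 3


def next_rate(rate, load, floor=0.05, ceiling=1.0):
    """Return the new request rate (fraction of peak) after one load sample.

    The step sizes below are illustrative assumptions, not spec values.
    """
    if load == DevLoad.LIGHT:
        rate += 0.10   # headroom available: ramp up additively
    elif load == DevLoad.MODERATE_OVERLOAD:
        rate *= 0.90   # back off gently
    elif load == DevLoad.SEVERE_OVERLOAD:
        rate *= 0.50   # back off hard
    # OPTIMAL: hold the current rate unchanged
    return max(floor, min(ceiling, rate))


def run(samples, start=1.0):
    """Apply next_rate to a sequence of load samples and return the trace."""
    rate = start
    trace = []
    for load in samples:
        rate = next_rate(rate, load)
        trace.append(round(rate, 3))
    return trace
```

Additive increase with multiplicative decrease is used here only because it
is a familiar congestion-control pattern; a real policy would be tuned per
device and per traffic class.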

With the evolution toward multi-level switching and fabric-based
architectures (especially in CXL 3.0+), QoS becomes essential to
guarantee service levels for latency-sensitive and high-priority
workloads while preventing noisy neighbor issues.

In the Linux kernel, current CXL support focuses on device enumeration
and memory management, with limited exposure of QoS controls. Enabling
QoS-aware scheduling, monitoring, and control mechanisms at the OS level
is necessary to fully leverage CXL fabrics in multi-tenant and
high-performance environments.

Feature Request:
Capabilities requested at the OS level:
  Enable QoS capability discovery for CXL devices and switches.
  Support configuration of bandwidth allocation and priority levels across
    CXL traffic classes.
  Expose QoS controls via sysfs/debugfs or standardized user-space APIs.
  Integrate QoS policies with memory management and NUMA balancing.
  Support latency-sensitive workload prioritization (e.g., AI/ML, real-time
    applications).
  Enable congestion detection and dynamic throttling mechanisms.
  Provide telemetry and monitoring for QoS metrics (latency, bandwidth,
    contention).
  Support QoS enforcement across multi-level CXL switch fabrics.
  Enable QoS support in virtualized environments (KVM/QEMU, containers).
  Integrate with orchestration frameworks for workload-aware QoS provisioning.
  Provide validation and benchmarking tools for QoS effectiveness.
  Document QoS configuration, tuning, and best practices for CXL deployments.
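
For the discovery and sysfs items above, the following is a minimal sketch of
what user-space enumeration could look like. It assumes the kernel exposes
read-only qos_class attributes under /sys/bus/cxl/devices (for example
memX/ram/qos_class, memX/pmem/qos_class and decoderX.Y/qos_class, as listed
in the kernel's sysfs-bus-cxl ABI document). Whether they are present depends
on the kernel version and platform. The function name read_qos_classes is
hypothetical.

```python
from pathlib import Path


def read_qos_classes(sysfs_root="/sys/bus/cxl/devices"):
    """Map each qos_class attribute (path relative to root) to its cookies.

    Cookies are kept as strings because they are platform-specific
    identifiers, not quantities to compute with.
    """
    root = Path(sysfs_root)
    result = {}
    if not root.is_dir():
        return result  # no CXL bus visible on this system
    # Assumed layout: attributes sit one or two levels below each device.
    candidates = sorted(root.glob("*/qos_class")) + sorted(root.glob("*/*/qos_class"))
    for attr in candidates:
        try:
            text = attr.read_text().strip()
        except OSError:
            continue  # unreadable attribute (permissions, device removed)
        cookies = [tok.strip() for tok in text.split(",") if tok.strip()]
        result[str(attr.relative_to(root))] = cookies
    return result


if __name__ == "__main__":
    for path, cookies in read_qos_classes().items():
        print(f"{path}: {', '.join(cookies) or '(none)'}")
```

Matching a memory device's cookies against those of a root decoder is one
way a tool could check whether a device falls in the performance class the
platform advertises for a given CXL window.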

Business Justification:
  Ensures predictable performance in shared CXL environments.
  Prevents resource contention and noisy neighbor issues in multi-tenant
    systems.
  Enables SLA-driven infrastructure for cloud and hyperscale deployments.
  Improves efficiency and utilization of shared memory and accelerator
    resources.
  Supports latency-sensitive workloads requiring deterministic behavior.
  Aligns with next-generation data center requirements for composable
    infrastructure.

References:
  CXL 2.0 / 3.0 Specifications (QoS and Fabric Management) 
  Linux Kernel CXL Subsystem Documentation 
  Data Center QoS and Resource Management Whitepapers 
  Industry Research on Memory Disaggregation and Performance Isolation

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Information type changed from Public to Private

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2146662

Title:
  Request for CXL QoS Enablement and Enhancements

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2146662/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
