Public bug reported:

[Impact]
rocblas is the AMD ROCm Basic Linear Algebra Subprograms library, providing
GPU-accelerated BLAS routines. This update from 7.1.0 to 7.2.4 is part of the
coordinated ROCm stack update in Ubuntu Resolute.

Key improvements:
- Level 2 optimizations for tpmv and sbmv functions (performance)
- New rocblas_syrk_ex API enabling mixed-precision symmetric rank-k updates
  (bf16/f16 input with f32 accumulation, f32 input with f64 accumulation).
  This is required by downstream consumers (MIOpen, hipBLAS) that use
  mixed-precision GEMM-like operations for AI/ML training workloads on AMD GPUs.
- Memory allocation behavior change: the default allocation strategy is now
  standard hipMalloc instead of stream-order allocation (hipMallocAsync). The
  previous default could cause issues with certain HIP runtime versions and
  multi-stream workloads. Users who relied on stream-order allocation can
  restore it by setting ROCBLAS_STREAM_ORDER_ALLOC=1.

ABI analysis (abipkgdiff): 0 removed functions, 0 changed functions, 8 added
symbols (syrk_ex templates + device_allocator). The only removed symbols are
453 __hip_cuid_* variables, which are auto-generated compile-unit identifiers
from the LLVM/HIP toolchain and not part of any public or stable ABI.

Reverse dependencies of librocblas5 in resolute:
- ROCm stack internal: librocsolver0, libmiopen1, libhipsolver1, libhipblas3,
  librocblas5-bench, librocblas5-tests, librocwmma-tests-validate,
  librocsolver0-bench, librocsolver0-tests, libmiopen1-tests
- External consumers: libtorch-rocm-2.9, libggml0-backend-hip

All reverse dependencies are either part of the same coordinated ROCm update
or are PyTorch/GGML backends that link against the stable C API (which has no
removals). The memory allocation default change uses standard hipMalloc, which
is the more conservative/compatible path; no external consumer would have
depended on the stream-order behavior, as it was internal to the handle.

[Test Plan]
1. Build rocblas 7.2.4 in the resolute PPA and verify it produces librocblas5.
2. Run the rocblas-test suite (librocblas5-tests) on a system with a supported
   AMD GPU to verify GEMM, TRSM, SYRK correctness.
3. Verify all reverse dependencies (rocsolver, miopen, hipsolver, hipblas)
   build successfully against the updated librocblas5.
4. Confirm ABI compatibility via abipkgdiff (already done: no function
   removals or changes).

[Where problems could occur]
- Memory allocation default change: applications that created/destroyed
  rocblas handles at high frequency may see different allocation timing
  characteristics. Mitigation: set ROCBLAS_STREAM_ORDER_ALLOC=1 to restore
  previous behavior. Risk is low since the new default (hipMalloc) is the
  more traditional/conservative approach.
- Tensile codegen updated (4.44.0 to 4.45.0): kernel selection YAML tuning
  files changed for gfx942, gfx1103, and strixhalo targets. Could
  theoretically select different kernels for certain problem sizes, though
  correctness tests cover this.
- New syrk_ex kernels: new code path, but additive only and gated behind
  explicit API calls, so it cannot affect existing workloads.

Full abigail report: https://pastebin.ubuntu.com/p/fcxCsZBPNn/

** Affects: rocblas (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: rocblas (Ubuntu Resolute)
     Importance: Undecided
         Status: New

** Also affects: rocblas (Ubuntu Resolute)
   Importance: Undecided
       Status: New

** Description changed:

  [Impact]
  rocblas is the AMD ROCm Basic Linear Algebra Subprograms library, providing
  GPU-accelerated BLAS routines. This update from 7.1.0 to 7.2.4 is part of the
  coordinated ROCm stack update in Ubuntu Resolute.

  Key improvements:
  - Level 2 optimizations for tpmv and sbmv functions (performance)
  - New rocblas_syrk_ex API enabling mixed-precision symmetric rank-k updates
-   (bf16/f16 input with f32 accumulation, f32 input with f64 accumulation).
-   This is required by downstream consumers (MIOpen, hipBLAS) that use
-   mixed-precision GEMM-like operations for AI/ML training workloads on AMD GPUs.
+   (bf16/f16 input with f32 accumulation, f32 input with f64 accumulation).
+   This is required by downstream consumers (MIOpen, hipBLAS) that use
+   mixed-precision GEMM-like operations for AI/ML training workloads on AMD GPUs.
  - Memory allocation behavior change: the default allocation strategy is now
-   standard hipMalloc instead of stream-order allocation (hipMallocAsync). The
-   previous default could cause issues with certain HIP runtime versions and
-   multi-stream workloads. Users who relied on stream-order allocation can
-   restore it by setting ROCBLAS_STREAM_ORDER_ALLOC=1.
+   standard hipMalloc instead of stream-order allocation (hipMallocAsync). The
+   previous default could cause issues with certain HIP runtime versions and
+   multi-stream workloads. Users who relied on stream-order allocation can
+   restore it by setting ROCBLAS_STREAM_ORDER_ALLOC=1.

  ABI analysis (abipkgdiff): 0 removed functions, 0 changed functions, 8 added
  symbols (syrk_ex templates + device_allocator). The only removed symbols are
  453 __hip_cuid_* variables, which are auto-generated compile-unit identifiers
  from the LLVM/HIP toolchain and not part of any public or stable ABI.

  Reverse dependencies of librocblas5 in resolute:
  - ROCm stack internal: librocsolver0, libmiopen1, libhipsolver1, libhipblas3,
-   librocblas5-bench, librocblas5-tests, librocwmma-tests-validate,
-   librocsolver0-bench, librocsolver0-tests, libmiopen1-tests
+   librocblas5-bench, librocblas5-tests, librocwmma-tests-validate,
+   librocsolver0-bench, librocsolver0-tests, libmiopen1-tests
  - External consumers: libtorch-rocm-2.9, libggml0-backend-hip

  All reverse dependencies are either part of the same coordinated ROCm update
  or are PyTorch/GGML backends that link against the stable C API (which has no
  removals). The memory allocation default change uses standard hipMalloc, which
  is the more conservative/compatible path; no external consumer would have
  depended on the stream-order behavior, as it was internal to the handle.

  [Test Plan]
  1. Build rocblas 7.2.4 in the resolute PPA and verify it produces librocblas5.
  2. Run the rocblas-test suite (librocblas5-tests) on a system with a supported
-    AMD GPU to verify GEMM, TRSM, SYRK correctness.
+    AMD GPU to verify GEMM, TRSM, SYRK correctness.
  3. Verify all reverse dependencies (rocsolver, miopen, hipsolver, hipblas)
-    build successfully against the updated librocblas5.
+    build successfully against the updated librocblas5.
  4. Confirm ABI compatibility via abipkgdiff (already done: no function
-    removals or changes).
+    removals or changes).

  [Where problems could occur]
  - Memory allocation default change: applications that created/destroyed
-   rocblas handles at high frequency may see different allocation timing
-   characteristics. Mitigation: set ROCBLAS_STREAM_ORDER_ALLOC=1 to restore
-   previous behavior. Risk is low since the new default (hipMalloc) is the
-   more traditional/conservative approach.
+   rocblas handles at high frequency may see different allocation timing
+   characteristics. Mitigation: set ROCBLAS_STREAM_ORDER_ALLOC=1 to restore
+   previous behavior. Risk is low since the new default (hipMalloc) is the
+   more traditional/conservative approach.
  - Tensile codegen updated (4.44.0 to 4.45.0): kernel selection YAML tuning
-   files changed for gfx942, gfx1103, and strixhalo targets. Could
-   theoretically select different kernels for certain problem sizes, though
-   correctness tests cover this.
+   files changed for gfx942, gfx1103, and strixhalo targets. Could
+   theoretically select different kernels for certain problem sizes, though
+   correctness tests cover this.
  - New syrk_ex kernels: new code path, but additive only and gated behind
-   explicit API calls, so it cannot affect existing workloads.
+   explicit API calls, so it cannot affect existing workloads.
+
+ Full abigail report: https://pastebin.ubuntu.com/p/fcxCsZBPNn/

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2155118

Title:
  [SRU] Update rocblas to 7.2.4 in resolute

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rocblas/+bug/2155118/+subscriptions

--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
