================
@@ -669,6 +679,7 @@ define amdgpu_kernel void @global_volatile_store_1(
 ; GFX12-WGP-NEXT: s_wait_kmcnt 0x0
 ; GFX12-WGP-NEXT: s_wait_storecnt 0x0
 ; GFX12-WGP-NEXT: global_store_b32 v0, v1, s[0:1] scope:SCOPE_SYS
+; GFX12-WGP-NEXT: s_wait_loadcnt 0x3f
----------------
Pierre-vh wrote:

The waitcnts aren't optimized out at O0 because we want to see them in the memory legalizer tests. However, we're mostly interested in the zero waitcnts, not the ~0 ones (such as the `s_wait_loadcnt 0x3f` added here).

We could still optimize out the ~0 ones; I don't think there is a downside to that.

https://github.com/llvm/llvm-project/pull/147257
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits