================
@@ -669,6 +679,7 @@ define amdgpu_kernel void @global_volatile_store_1(
 ; GFX12-WGP-NEXT: s_wait_kmcnt 0x0
 ; GFX12-WGP-NEXT: s_wait_storecnt 0x0
 ; GFX12-WGP-NEXT: global_store_b32 v0, v1, s[0:1] scope:SCOPE_SYS
+; GFX12-WGP-NEXT: s_wait_loadcnt 0x3f
----------------
ssahasra wrote:

Yes, I did consider that as an option. But there is a hypothetical corner case: the memory legalizer might deliberately compute a wait count so large that it gets clamped to the maximum encodable value, which is, strictly speaking, not the same as ~0. If that is not an issue, it will significantly reduce the diff for tests that don't stop after the legalizer.

https://github.com/llvm/llvm-project/pull/147257
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
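
To make the corner case in the comment above concrete, here is a minimal sketch. It is not LLVM code: `kFieldMax`, `kNoWait`, and `encodeWait` are hypothetical names, and the 6-bit field width is an assumption based on the `0x3f` operand in the diff. It only shows that a deliberately large count and a `~0` "no wait" sentinel become indistinguishable once both are clamped.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical constants, for illustration only.
constexpr uint32_t kFieldMax = 0x3f; // assumed maximum encodable count
constexpr uint32_t kNoWait = ~0u;    // sentinel meaning "no wait required"

// Clamp a requested count to what the assumed field can hold.
constexpr uint32_t encodeWait(uint32_t requested) {
  return std::min(requested, kFieldMax);
}

// Before encoding, the two requests are different values...
static_assert(kNoWait != 1000u, "distinct before clamping");
// ...but after clamping they produce the same operand.
static_assert(encodeWait(kNoWait) == encodeWait(1000u),
              "indistinguishable after clamping");
```

Under these assumptions, a later pass that sees only the encoded operand cannot tell whether the maximum value was a clamped real count or the sentinel.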