================
@@ -669,6 +679,7 @@ define amdgpu_kernel void @global_volatile_store_1(
 ; GFX12-WGP-NEXT:    s_wait_kmcnt 0x0
 ; GFX12-WGP-NEXT:    s_wait_storecnt 0x0
 ; GFX12-WGP-NEXT:    global_store_b32 v0, v1, s[0:1] scope:SCOPE_SYS
+; GFX12-WGP-NEXT:    s_wait_loadcnt 0x3f
----------------
ssahasra wrote:

Not directly related to this discussion, but this line does exist:
```
      // Merge consecutive waitcnt of the same type by erasing multiples.
      if (WaitcntInstr || (!Wait.hasWaitExceptStoreCnt() && TrySimplify)) {
```
It is meant to preserve S_WAITCNT_soft even when no actual wait is
required. @jayfoad, you introduced `TrySimplify`; do you think it is okay to
relax its uses?

```
      // Proposed change: the `|| (...)` clause is new.
      if (TrySimplify || (Opcode != II.getOpcode() &&
                          OldWait.hasValuesSetToMax()))
        ScoreBrackets.simplifyWaitcnt(OldWait);
```
Here, `hasValuesSetToMax()` is a hypothetical function that checks, for each
counter separately, whether its encoded field has all bits set to 1, rather
than just checking for a ~0 in the data structure.
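
To make the idea concrete, here is a rough standalone sketch of what such a check could look like. Everything in it is an assumption for illustration: `CounterField`, `maxEncodable()` and the explicit per-counter bit widths are made-up stand-ins, not the actual `AMDGPU::Waitcnt` layout, and treating ~0 as "counter not encoded" is only one possible reading. The 6-bit width in the usage note below is inferred from the `0x3f` operand in the test output above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one counter of a decoded wait instruction.
struct CounterField {
  uint32_t Value;   // decoded count; ~0u means "not encoded / no wait"
  unsigned NumBits; // width of this counter's field in the encoding
};

// Largest value representable in an encoded field of the given width.
inline uint32_t maxEncodable(unsigned NumBits) {
  assert(NumBits > 0 && NumBits < 32 && "unexpected field width");
  return (1u << NumBits) - 1u;
}

// Returns true if at least one counter is encoded and every encoded counter
// holds the all-ones value for its own field width. A bare ~0u sentinel is
// skipped, so it never counts as "set to max" by itself.
inline bool hasValuesSetToMax(const std::vector<CounterField> &Counters) {
  bool AnyEncoded = false;
  for (const CounterField &C : Counters) {
    if (C.Value == ~0u)
      continue; // sentinel in the data structure, not an encoded value
    if (C.Value != maxEncodable(C.NumBits))
      return false;
    AnyEncoded = true;
  }
  return AnyEncoded;
}
```

With a single 6-bit counter, a decoded value of `0x3f` would satisfy the check, while `0x0` or a lone ~0 sentinel would not.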

https://github.com/llvm/llvm-project/pull/147257
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
