================
@@ -1708,6 +1710,19 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
     }
 
     ++Iter;
+    if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+      auto Builder =
+          BuildMI(Block, Iter, DebugLoc(), TII->get(AMDGPU::S_WAITCNT))
+              .addImm(0);
+      if (IsGFX10Plus) {
----------------
jayfoad wrote:
I guess this works, but it seems a bit wasteful to insert both waits after 
every memory access: S_WAITCNT after stores and S_WAITCNT_VSCNT after loads. 
Does anyone care?

Stepping back a bit, I think you can probably implement this by calling 
generateWaitcnt instead of building the instructions yourself.
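
To illustrate the "wasteful" point, here is a rough standalone sketch. It is 
not LLVM code: ToyInst and waitsNeededAfter are made-up names, and the model 
is deliberately simplified. It assumes loads are covered by the counters that 
S_WAITCNT waits on, that stores on GFX10+ are covered by the separate store 
counter, and that older targets use S_WAITCNT for both. Real hardware has more 
cases (LDS, scalar memory, atomics that return a value), so treat this as a 
picture of the idea only.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the two instruction properties that matter here.
// This is NOT llvm::MachineInstr; it exists only for this sketch.
struct ToyInst {
  bool MayLoad;
  bool MayStore;
};

// Returns the names of the wait instructions this simplified model would
// place after Inst. Assumption: on GFX10+ loads and stores are tracked by
// different counters, so each kind of access needs only one kind of wait.
std::vector<std::string> waitsNeededAfter(const ToyInst &Inst,
                                          bool IsGFX10Plus) {
  std::vector<std::string> Waits;
  if (!Inst.MayLoad && !Inst.MayStore)
    return Waits; // Not a memory access: nothing to wait for.

  if (!IsGFX10Plus) {
    // Assumption: before GFX10 there is no separate store counter.
    Waits.push_back("S_WAITCNT");
    return Waits;
  }

  if (Inst.MayLoad)
    Waits.push_back("S_WAITCNT");
  if (Inst.MayStore)
    Waits.push_back("S_WAITCNT_VSCNT");
  return Waits;
}
```

Under these assumptions a pure store on GFX10+ needs only S_WAITCNT_VSCNT and 
a pure load needs only S_WAITCNT, whereas the patch as written emits both for 
either. Going through generateWaitcnt would presumably let the pass make that 
choice for you.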

https://github.com/llvm/llvm-project/pull/68932
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits