sameerds added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:198
+    const auto *STI = TM.getMCSubtargetInfo();
+    return llvm::AMDGPU::getHostcallImplicitArgPosition(STI);
+  }
----------------
arsenm wrote:
> The ABI should not be a property of the subtarget, and the global subtarget 
> shouldn't be used
I don't understand what the objection is. Existing functions that check the ABI 
version clearly do so by accessing the subtarget. I am merely following 
existing practice.

I have now rearranged the code a bit. Maybe this works? To be honest, I am not 
very familiar with how ABI information is tracked. I will heartily welcome any 
advice on how to retrieve the ABI version and then determine the location of 
the hostcall implicit arg.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:521
+
+      while (!WorkList.empty()) {
+        auto UseInfo = WorkList.back();
----------------
arsenm wrote:
> Can you use checkForAllUses instead of creating your own worklist?
checkForAllUses does not track sufficient state to catch every load that 
happens to overlap with the hostcall pointer. The only pattern likely to happen 
in real life is "GEP to offset 24, typecast to i64*, load 8 bytes". But unless 
we can guarantee that this is the only pattern, we need something robust.

@jdoerfert pointed me in the correct direction to make full use of existing 
machinery, but that is now dependent on D119249


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:581-584
+      if (ArgUsedToRetrieveHostcallPtr(I)) {
+        return false;
+      }
+      return true;
----------------
jdoerfert wrote:
> I'd suggest to switch the return value of ArgUsed... so it matches this 
> functions (and other callbacks).
Agreed. In fact switching the return value allowed me to merge the two 
callbacks into one. I am still keeping the positive name on the outermost 
function "funcRetrievesHostcallPtr" because it matches the sense near the 
callsite.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:566
+      return false;
+    };
+
----------------
jdoerfert wrote:
> jdoerfert wrote:
> > sameerds wrote:
> > > jdoerfert wrote:
> > > > You can use AAPointerInfo for the call site return IRPosition. It will 
> > > > (through the iterations) gather all accesses and put them into "bins" 
> > > > based on offset and size. It deals with uses in calls, etc. and if 
> > > > there is stuff missing it is better to add it in one place so we 
> > > > benefit throughout. 
> > > I am not following what you have in mind. "implicitarg_ptr" is a pointer 
> > > returned by an intrinsic that reads an ABI-defined register. I need to 
> > > check that for a given call-graph, a particular range of bytes relative 
> > > to that base pointer are never accessed. The above DFS on the uses 
> > > conservatively assumes that such a load exists unless it can conclusively 
> > > trace every use of the base pointer. This may include the pointer being 
> > > passed to an extern function or being stored into a different memory 
> > > location (although we don't expect ABI registers being capture this way). 
> > > I am not seeing how to construct this around AAPointerInfo. As far as I 
> > > can see, the public interface only talks about uses that are recognized 
> > > as loads and stores.
> > Not actually tested, replaces the function body. Depends on D119249.
> > ```
> > const auto PointerInfoAA = A.getAAFor<AAPointerInfo>(*this, 
> > IRPosition::callback_returned(cast<CallBase>(Ptr)), DepClassTy::Required);
> > if (!PointerInfoAA.getState().isValidState())
> >   return true; // Abort (which is weird as false is abort in the other CB).
> > AAPointerInfo::OffsetAndSize OAS(*Position, /* probably look pointer width 
> > up in DL */ 8);
> > return !forallInterferingAccesses(OAS, [](const AAPointerInfo::Access &Acc, 
> > bool IsExact) {
> >    return Acc.getRemoteInst()->isDroppable(); });
> > ```
> You don't actually need the state check.
> And as I said, this will take care of following pointers passed into callees 
> or through memory to other places, all while ignoring dead code, etc.
I see now. forallInterferingAccesses() does check for valid state on entry, 
which is sufficient to take care of all the opaque uses like a call to an 
extern function or a complicated phi or a capturing store. Thanks a ton ... 
this has been very educational!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119216/new/

https://reviews.llvm.org/D119216

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to