sameerds added inline comments.
================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:198 + const auto *STI = TM.getMCSubtargetInfo(); + return llvm::AMDGPU::getHostcallImplicitArgPosition(STI); + } ---------------- arsenm wrote: > The ABI should not be a property of the subtarget, and the global subtarget > shouldn't be used I don't understand what the objection is. Existing functions that check the ABI version clearly do so by accessing the subtarget. I am merely following existing practice. I have now rearranged the code a bit. Maybe this works? To be honest, I am not very familiar with how ABI information is tracked. I will heartily welcome any advice on how to retrieve the ABI version and then determine the location of the hostcall implicit arg. ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:521 + + while (!WorkList.empty()) { + auto UseInfo = WorkList.back(); ---------------- arsenm wrote: > Can you use checkForAllUses instead of creating your own worklist? checkForAllUses does not track sufficient state to catch every load that happens to overlap with the hostcall pointer. The only pattern likely to happen in real life is "GEP to offset 24, typecast to i64*, load 8 bytes". But unless we can guarantee that this is the only pattern, we need something robust. @jdoerfert pointed me in the correct direction to make full use of existing machinery, but that is now dependent on D119249 ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:581-584 + if (ArgUsedToRetrieveHostcallPtr(I)) { + return false; + } + return true; ---------------- jdoerfert wrote: > I'd suggest to switch the return value of ArgUsed... so it matches this > functions (and other callbacks). Agreed. In fact switching the return value allowed me to merge the two callbacks into one. I am still keeping the positive name on the outermost function "funcRetrievesHostcallPtr" because it matches the sense near the callsite. ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:566 + return false; + }; + ---------------- jdoerfert wrote: > jdoerfert wrote: > > sameerds wrote: > > > jdoerfert wrote: > > > > You can use AAPointerInfo for the call site return IRPosition. It will > > > > (through the iterations) gather all accesses and put them into "bins" > > > > based on offset and size. It deals with uses in calls, etc. and if > > > > there is stuff missing it is better to add it in one place so we > > > > benefit throughout. > > > I am not following what you have in mind. "implicitarg_ptr" is a pointer > > > returned by an intrinsic that reads an ABI-defined register. I need to > > > check that for a given call-graph, a particular range of bytes relative > > > to that base pointer are never accessed. The above DFS on the uses > > > conservatively assumes that such a load exists unless it can conclusively > > > trace every use of the base pointer. This may include the pointer being > > > passed to an extern function or being stored into a different memory > > > location (although we don't expect ABI registers being capture this way). > > > I am not seeing how to construct this around AAPointerInfo. As far as I > > > can see, the public interface only talks about uses that are recognized > > > as loads and stores. > > Not actually tested, replaces the function body. Depends on D119249. > > ``` > > const auto PointerInfoAA = A.getAAFor<AAPointerInfo>(*this, > > IRPosition::callback_returned(cast<CallBase>(Ptr)), DepClassTy::Required); > > if (!PointerInfoAA.getState().isValidState()) > > return true; // Abort (which is weird as false is abort in the other CB). > > AAPointerInfo::OffsetAndSize OAS(*Position, /* probably look pointer width > > up in DL */ 8); > > return !forallInterferingAccesses(OAS, [](const AAPointerInfo::Access &Acc, > > bool IsExact) { > > return Acc.getRemoteInst()->isDroppable(); }); > > ``` > You don't actually need the state check. > And as I said, this will take care of following pointers passed into callees > or through memory to other places, all while ignoring dead code, etc. I see now. forallInterferingAccesses() does check for valid state on entry, which is sufficient to take care of all the opaque uses like a call to an extern function or a complicated phi or a capturing store. Thanks a ton ... this has been very educational! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D119216/new/ https://reviews.llvm.org/D119216 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits