================
@@ -15136,6 +15136,41 @@ SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
       return Folded;
   }
 
+  // Transform (ptradd a, b) -> (or disjoint a, b) if it is equivalent and if
+  // that transformation can't block an offset folding at any use of the ptradd.
+  // This should be done late, after legalization, so that it doesn't block
+  // other ptradd combines that could enable more offset folding.
+  bool HasIntermediateAssertAlign =
+      N0->getOpcode() == ISD::AssertAlign && N0->getOperand(0)->isAnyAdd();
+  // This is a hack to work around an ordering problem for DAGs like this:
+  //   (ptradd (AssertAlign (ptradd p, c1), k), c2)
+  // If the outer ptradd is handled first by the DAGCombiner, it can be
+  // transformed into a disjoint or. Then, when the generic AssertAlign combine
+  // pushes the AssertAlign through the inner ptradd, it's too late for the
+  // ptradd reassociation to trigger.
+  if (!DCI.isBeforeLegalizeOps() && !HasIntermediateAssertAlign &&
+      DAG.haveNoCommonBitsSet(N0, N1)) {
+    bool TransformCanBreakAddrMode = any_of(N->users(), [&](SDNode *User) {
+      if (auto *LoadStore = dyn_cast<MemSDNode>(User);
+          LoadStore && LoadStore->getBasePtr().getNode() == N) {
+        unsigned AS = LoadStore->getAddressSpace();
+        // Currently, we only really need ptradds to fold offsets into flat
+        // memory instructions.
+        if (AS != AMDGPUAS::FLAT_ADDRESS)
+          return false;
+        TargetLoweringBase::AddrMode AM;
+        AM.HasBaseReg = true;
+        EVT VT = LoadStore->getMemoryVT();
+        Type *AccessTy = VT.getTypeForEVT(*DAG.getContext());
+        return isLegalAddressingMode(DAG.getDataLayout(), AM, AccessTy, AS);
+      }
+      return false;
----------------
ritter-x2a wrote:

I'm not sure that every backend that might want to use ptradd nodes would also
want to transform them into ORs. However, there is probably not much point in
designing for hypothetical backends, so I moved the transform into the generic
combines for now, behind the `canTransformPtrArithOutOfBounds()` check (I also
fixed it to actually check the intended addressing mode).
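
For illustration, here is a minimal sketch of what that check can look like with
the offset actually filled in. This is not the code that landed: the helper name
is made up, and how it is wired into the generic combiner (behind
`canTransformPtrArithOutOfBounds()` and the no-common-bits check from the snippet
above) is only assumed. The point is that `AM.BaseOffs` gets populated, so the
target is asked about "base register + constant offset" rather than just
"base register":

```cpp
// Sketch only, not the code that actually landed in the generic combiner.
// Assumes the usual DAGCombiner context (DAG, TLI, llvm namespace) and that
// the caller already checked TLI.canTransformPtrArithOutOfBounds() and
// DAG.haveNoCommonBitsSet(...) as described above.
static bool ptrAddToOrCanBreakAddrMode(SDNode *N, SelectionDAG &DAG,
                                       const TargetLowering &TLI) {
  SDValue Offset = N->getOperand(1);
  return any_of(N->users(), [&](SDNode *User) {
    auto *MemNode = dyn_cast<MemSDNode>(User);
    if (!MemNode || MemNode->getBasePtr().getNode() != N)
      return false;
    TargetLoweringBase::AddrMode AM;
    AM.HasBaseReg = true;
    // The fix mentioned above: also fill in the constant offset, so the
    // target is asked about "base register + constant offset" rather than
    // just "base register".
    if (auto *COff = dyn_cast<ConstantSDNode>(Offset))
      AM.BaseOffs = COff->getSExtValue();
    Type *AccessTy = MemNode->getMemoryVT().getTypeForEVT(*DAG.getContext());
    return TLI.isLegalAddressingMode(DAG.getDataLayout(), AM, AccessTy,
                                     MemNode->getAddressSpace());
  });
}
```

If this is false for every memory user, turning the ptradd into a disjoint OR
can't block any offset folding at its uses.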

Dropping the target-specific `AS != AMDGPUAS::FLAT_ADDRESS` check affects the 
generated code in two lit tests 
([identical-subrange-spill-infloop.ll](https://github.com/llvm/llvm-project/pull/146076/files#diff-517b7174ca71caeed2dd13ec440ee963e4db61f01911ff1cbc86ab0e60f16721)
 and 
[store-weird-sizes.ll](https://github.com/llvm/llvm-project/pull/146076/files#diff-32010dfaf8188291719044adb5c6e927b17fe3e3657e0f27ebe2e2a10a020889)).
But looking more into it, I find that
- the new code for `identical-subrange-spill-infloop.ll` could be argued to be 
an improvement (offsets for SMRD instructions are folded where they weren't 
before) and
- the problem for `store-weird-sizes.ll` seems to be that 
`SITargetLowering::isLegalAddressingMode` is overly optimistic when asked whether 
"register + constant offset" is a valid addressing mode for AS4 on 
architectures predating the `global_*` instructions, so it should be fixed there 
instead (a rough sketch of the intended answer follows this list).
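
To make that intent concrete, here is a purely hypothetical sketch. It is not the
actual `SITargetLowering::isLegalAddressingMode` code; the helper and its
parameters are invented for illustration, and the exact condition (which
subtarget feature to key on, which accesses are affected) still needs to be
worked out:

```cpp
// Hypothetical illustration only, not the actual SITargetLowering code.
// The gist: for AS4, stop reporting "base register + constant offset" as
// legal on subtargets that predate the global_* instructions.
static bool isLegalConstAddrSpaceAddrMode(const TargetLoweringBase::AddrMode &AM,
                                          bool HasGlobalInsts) {
  if (!AM.HasBaseReg || AM.Scale != 0 || AM.BaseGV)
    return false;
  // A plain base register is fine either way; a folded constant offset is
  // only fine once global_* instructions are available. (Sketch of the
  // intent; the real check belongs in SITargetLowering::isLegalAddressingMode.)
  return AM.BaseOffs == 0 || HasGlobalInsts;
}
```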

We already have a [generic 
combine](https://github.com/llvm/llvm-project/blob/629126ed44bd3ce5b6f476459c805be4e4e0c2ca/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L15173-L15196)
 that pushes the AssertAlign through the (ptr)add.
The handling here was only necessary because that combine used to be applied 
after the outer PTRADD had already been visited and combined into an OR. 
However, that no longer seems to happen in the tests now that the ptradd-to-OR 
transform is a generic combine, so I dropped this handling as well.
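
For readers who don't click through, a loose sketch of the kind of transform that
generic combine performs (simplified and renamed here, not the actual DAGCombiner
code):

```cpp
// Loose sketch, not the actual DAGCombiner code: push the AssertAlign onto
// the base pointer when the constant offset is itself a multiple of the
// asserted alignment, i.e.
//   (AssertAlign (ptradd x, c), k) -> (ptradd (AssertAlign x, k), c)
static SDValue pushAssertAlignThroughPtrAdd(SelectionDAG &DAG, SDNode *N) {
  auto *AA = cast<AssertAlignSDNode>(N);
  SDValue Add = AA->getOperand(0);
  if (!Add->isAnyAdd()) // ISD::ADD or ISD::PTRADD
    return SDValue();
  auto *C = dyn_cast<ConstantSDNode>(Add.getOperand(1));
  if (!C || C->getAPIntValue().countr_zero() < Log2(AA->getAlign()))
    return SDValue(); // Offset not sufficiently aligned; keep the node.
  SDLoc DL(N);
  SDValue AlignedBase =
      DAG.getAssertAlign(DL, Add.getOperand(0), AA->getAlign());
  return DAG.getNode(Add.getOpcode(), DL, Add.getValueType(), AlignedBase,
                     Add.getOperand(1));
}
```

The ordering hack in the snippet at the top existed only to keep this push from
arriving too late.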

https://github.com/llvm/llvm-project/pull/146075