[llvm-branch-commits] [llvm] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter (PR #110705)
atrosinenko wrote: Updated the commit message: removed the paragraph about dropping one `mov` instruction from the non-trapping variant of the check. My initial idea was to make the non-hint `xpac(i|d)` operate on the tested register itself (just like `xpaclri` does), but that was dropped from the final version of the patch to avoid unnecessary changes to the tests.

https://github.com/llvm/llvm-project/pull/110705

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Custom expand flat cmpxchg which may access private (PR #109410)
@@ -43,7 +43,7 @@ define i64 @test_flat_atomicrmw_sub_0_i64_agent(ptr %ptr) {
 ; ALL: [[ATOMICRMW_PRIVATE]]:
 ; ALL-NEXT:    [[TMP1:%.*]] = addrspacecast ptr [[PTR]] to ptr addrspace(5)
 ; ALL-NEXT:    [[LOADED_PRIVATE:%.*]] = load i64, ptr addrspace(5) [[TMP1]], align 8
-; ALL-NEXT:    [[NEW:%.*]] = sub i64 [[LOADED_PRIVATE]], 0
+; ALL-NEXT:    [[NEW:%.*]] = add i64 [[LOADED_PRIVATE]], 0

Pierre-vh wrote: Why does this transform happen more often now?

https://github.com/llvm/llvm-project/pull/109410
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
dlav-sc wrote: @topperc @kito-cheng FYI

https://github.com/llvm/llvm-project/pull/110810
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/110815 This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. >From 56474dac206d8592229cb56e1f12b543ec97 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 2 Oct 2024 11:20:23 +0400 Subject: [PATCH] DAG: Preserve more flags when expanding gep This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. --- .../SelectionDAG/SelectionDAGBuilder.cpp | 41 +++ .../CodeGen/AMDGPU/gep-flags-stack-offsets.ll | 6 +-- .../pointer-add-unknown-offset-debug-info.ll | 2 +- 3 files changed, 36 insertions(+), 13 deletions(-) diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 25213f587116d5..6838c0b530a363 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4386,6 +4386,17 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). 
+ if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( @@ -4393,27 +4404,41 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). + if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in an unsigned sense (add + // nuw). 
+ if (NW.hasNoUnsignedWrap()) +AddFlags.setNoUnsignedWrap(true); + + N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } } diff --git a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll index 782894976c711c..a39afa6f609c7e 100644 --- a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll +++ b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll @@ -118,8 +118,7 @@ define void @gep_inbounds_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ; GFX8-NEXT:s_waitcnt vmcnt(0) ; GFX8-NEXT:s_setpc_b64 s[30:31] ; @@ -145,8 +144,7 @@ define void @gep_nusw_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, v
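The index-scaling step that both hunks annotate can be sketched outside of LLVM. The following is a hypothetical Python model (names and structure are mine, not LLVM's API) of how `visitGetElementPtr` turns `index * element_size` into a shift when the element size is a power of two:

```python
def scale_index(idx: int, elem_size: int, bits: int = 32) -> int:
    """Model of the index scaling in visitGetElementPtr: multiply the
    index by the element size, using a shift for power-of-two sizes."""
    mask = (1 << bits) - 1
    if elem_size & (elem_size - 1) == 0:  # power of two: use a shift (shl)
        amt = elem_size.bit_length() - 1  # corresponds to logBase2()
        return (idx << amt) & mask
    return (idx * elem_size) & mask       # general case: a multiply (mul)

# A `gep i32, ptr %p, i32 %idx` scales %idx by 4, i.e. shl by 2.
assert scale_index(5, 4) == 20
assert scale_index(3, 12) == 36
```

The patch's point is that when the GEP carries `nusw`/`nuw`, this shift or multiply is known not to wrap the pointer index type, so the same flags can be attached to the resulting SHL/MUL/ADD nodes.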
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. Learn more at https://graphite.dev/docs/merge-pull-requests

* **#110815** (this PR)
* **#110814**
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking at https://stacking.dev/

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] [RISCV] fix RISCVPushPopOptimizer pass (PR #110813)
dlav-sc wrote: @topperc @michaelmaitland FYI

https://github.com/llvm/llvm-project/pull/110813
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: Matt Arsenault (arsenm) Changes This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. --- Full diff: https://github.com/llvm/llvm-project/pull/110815.diff 3 Files Affected: - (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+33-8) - (modified) llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll (+2-4) - (modified) llvm/test/DebugInfo/Sparc/pointer-add-unknown-offset-debug-info.ll (+1-1) ``diff diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 25213f587116d5..6838c0b530a363 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4386,6 +4386,17 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). + if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( @@ -4393,27 +4404,41 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. 
if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). + if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in an unsigned sense (add + // nuw). 
+ if (NW.hasNoUnsignedWrap()) +AddFlags.setNoUnsignedWrap(true); + + N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } } diff --git a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll index 782894976c711c..a39afa6f609c7e 100644 --- a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll +++ b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll @@ -118,8 +118,7 @@ define void @gep_inbounds_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ; GFX8-NEXT:s_waitcnt vmcnt(0) ; GFX8-NEXT:s_setpc_b64 s[30:31] ; @@ -145,8 +144,7 @@ define void @gep_nusw_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ;
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
https://github.com/arsenm ready_for_review

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
       IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);

nikic wrote: Adjust tests to have explicit nuw flag rather than only inbounds?

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
       IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);

arsenm wrote: That's already tested here: https://github.com/llvm/llvm-project/blob/56474dac206d8592229cb56e1f12b543ec97/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll#L134

But it's still not enough. computeKnownBits still can't prove the sign bit is zero during selection with all flags on both GEPs:

```
define void @gep_all_flags(i32 %idx, i32 %val) {
  %alloca = alloca [32 x i32], align 4, addrspace(5)
  %gep0 = getelementptr inbounds nuw [32 x i32], ptr addrspace(5) %alloca, i32 0, i32 %idx
  %gep1 = getelementptr inbounds nuw i8, ptr addrspace(5) %gep0, i32 16
  store volatile i32 %val, ptr addrspace(5) %gep1, align 4
  ret void
}
```

```
Optimized legalized selection DAG: %bb.0 'gep_all_flags:'
SelectionDAG has 14 nodes:
  t0: ch,glue = EntryToken
  t4: i32,ch = CopyFromReg # D:1 t0, Register:i32 %8
  t2: i32,ch = CopyFromReg # D:1 t0, Register:i32 %7
  t7: i32 = shl nuw nsw # D:1 t2, Constant:i32<2>
  t8: i32 = add nuw # D:1 FrameIndex:i32<0>, t7
  t10: i32 = add nuw # D:1 t8, Constant:i32<16>
  t13: ch = store<(volatile store (s32) into %ir.gep1, addrspace 5)> # D:1 t0, t4, t10, undef:i32
  t14: ch = RET_GLUE t13
```

https://github.com/llvm/llvm-project/pull/110815
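A deliberately simplified sketch of why the flags alone do not settle the question (this is a toy model of a known-bits rule, not LLVM's actual `computeKnownBits`): with no information about the frame-index operand, the add contributes no known leading zero bits, so the sign bit stays unknown.

```python
def add_known_leading_zeros(lz_x: int, lz_y: int) -> int:
    """Toy known-bits rule for an add: if both operands have at least n
    known leading zero bits, the sum has at least n - 1 (a carry may
    propagate one position further)."""
    return max(min(lz_x, lz_y) - 1, 0)

# FrameIndex:i32<0> has no known bits (0 known leading zeros), so
# `add nuw FrameIndex, t7` proves nothing about the result's sign bit.
assert add_known_leading_zeros(0, 27) == 0
# Even two well-bounded operands lose one bit to the possible carry.
assert add_known_leading_zeros(5, 5) == 4
```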
[llvm-branch-commits] [llvm] AMDGPU: Custom expand flat cmpxchg which may access private (PR #109410)
@@ -43,7 +43,7 @@ define i64 @test_flat_atomicrmw_sub_0_i64_agent(ptr %ptr) {
 ; ALL: [[ATOMICRMW_PRIVATE]]:
 ; ALL-NEXT:    [[TMP1:%.*]] = addrspacecast ptr [[PTR]] to ptr addrspace(5)
 ; ALL-NEXT:    [[LOADED_PRIVATE:%.*]] = load i64, ptr addrspace(5) [[TMP1]], align 8
-; ALL-NEXT:    [[NEW:%.*]] = sub i64 [[LOADED_PRIVATE]], 0
+; ALL-NEXT:    [[NEW:%.*]] = add i64 [[LOADED_PRIVATE]], 0

arsenm wrote: Because it would require more work to avoid doing it, but there's not much reason to. All of the 64-bit cases now go to expand. emitExpandAtomicRMW isn't bothering to restrict this to the specific cases where it's needed.

https://github.com/llvm/llvm-project/pull/109410
[llvm-branch-commits] [llvm] [RISCV] fix RISCVPushPopOptimizer pass (PR #110813)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v

Author: None (dlav-sc)

Changes

The RISCVPushPopOptimizer pass did not account for CFI instructions being inserted between cm.pop and ret, so it could not apply the optimization in those cases. This patch fixes that and allows the pass to remove the CFI instructions and combine cm.pop and ret into a single instruction: cm.popret or cm.popretz.

---
Patch is 33.43 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110813.diff

7 Files Affected:
- (modified) llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp (+22-4)
- (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+8-24)
- (modified) llvm/test/CodeGen/RISCV/cm_mvas_mvsa.ll (+4-8)
- (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+44-180)
- (modified) llvm/test/CodeGen/RISCV/zcmp-additional-stack.ll (+1-6)
- (modified) llvm/test/CodeGen/RISCV/zcmp-cm-popretz.mir (+4-22)
- (modified) llvm/test/CodeGen/RISCV/zcmp-with-float.ll (+2-10)

```diff
diff --git a/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp b/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
index 098e5bb5328bb3..2b1d6c25891a1f 100644
--- a/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
+++ b/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
@@ -45,10 +45,25 @@ char RISCVPushPopOpt::ID = 0;
 INITIALIZE_PASS(RISCVPushPopOpt, "riscv-push-pop-opt", RISCV_PUSH_POP_OPT_NAME,
                 false, false)
 
+template <typename IterT>
+static IterT nextNoDebugNoCFIInst(const IterT It, const IterT End) {
+  return std::find_if_not(std::next(It), End, [](const auto &Inst) {
+    return Inst.isDebugInstr() || Inst.isCFIInstruction() ||
+           Inst.isPseudoProbe();
+  });
+}
+
+static void eraseCFIInst(const MachineBasicBlock::iterator Begin,
+                         const MachineBasicBlock::iterator End) {
+  for (auto &Inst : llvm::make_early_inc_range(llvm::make_range(Begin, End)))
+    if (Inst.isCFIInstruction())
+      Inst.eraseFromParent();
+}
+
 // Check if POP instruction was inserted into the MBB and return iterator to it.
 static MachineBasicBlock::iterator containsPop(MachineBasicBlock &MBB) {
   for (MachineBasicBlock::iterator MBBI = MBB.begin(); MBBI != MBB.end();
-       MBBI = next_nodbg(MBBI, MBB.end()))
+       MBBI = nextNoDebugNoCFIInst(MBBI, MBB.end()))
     if (MBBI->getOpcode() == RISCV::CM_POP)
       return MBBI;
 
@@ -76,6 +91,9 @@ bool RISCVPushPopOpt::usePopRet(MachineBasicBlock::iterator &MBBI,
   for (unsigned i = FirstNonDeclaredOp; i < MBBI->getNumOperands(); ++i)
     PopRetBuilder.add(MBBI->getOperand(i));
 
+  // Remove CFI instructions, they are not needed for cm.popret and cm.popretz
+  eraseCFIInst(MBBI, NextI);
+
   MBBI->eraseFromParent();
   NextI->eraseFromParent();
   return true;
@@ -92,8 +110,8 @@ bool RISCVPushPopOpt::adjustRetVal(MachineBasicBlock::iterator &MBBI) {
   // Since POP instruction is in Epilogue no normal instructions will follow
   // after it. Therefore search only previous ones to find the return value.
   for (MachineBasicBlock::reverse_iterator I =
-           next_nodbg(MBBI.getReverse(), RE);
-       I != RE; I = next_nodbg(I, RE)) {
+           nextNoDebugNoCFIInst(MBBI.getReverse(), RE);
+       I != RE; I = nextNoDebugNoCFIInst(I, RE)) {
     MachineInstr &MI = *I;
     if (auto OperandPair = TII->isCopyInstrImpl(MI)) {
       Register DestReg = OperandPair->Destination->getReg();
@@ -138,7 +156,7 @@ bool RISCVPushPopOpt::runOnMachineFunction(MachineFunction &Fn) {
   bool Modified = false;
   for (auto &MBB : Fn) {
     MachineBasicBlock::iterator MBBI = containsPop(MBB);
-    MachineBasicBlock::iterator NextI = next_nodbg(MBBI, MBB.end());
+    MachineBasicBlock::iterator NextI = nextNoDebugNoCFIInst(MBBI, MBB.end());
     if (MBBI != MBB.end() && NextI != MBB.end() &&
         NextI->getOpcode() == RISCV::PseudoRET)
       Modified |= usePopRet(MBBI, NextI, adjustRetVal(MBBI));
diff --git a/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll b/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll
index 2a26602de9e1e7..528e52bad8ef1a 100644
--- a/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll
+++ b/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll
@@ -432,8 +432,7 @@ define void @callee() nounwind {
 ; RV32IZCMP-NEXT:    sw a0, %lo(var+4)(t0)
 ; RV32IZCMP-NEXT:    lw a0, 28(sp) # 4-byte Folded Reload
 ; RV32IZCMP-NEXT:    sw a0, %lo(var)(t0)
-; RV32IZCMP-NEXT:    cm.pop {ra, s0-s11}, 96
-; RV32IZCMP-NEXT:    ret
+; RV32IZCMP-NEXT:    cm.popret {ra, s0-s11}, 96
 ;
 ; RV32IZCMP-WITH-FP-LABEL: callee:
 ; RV32IZCMP-WITH-FP:       # %bb.0:
@@ -942,8 +941,7 @@ define void @callee() nounwind {
 ; RV64IZCMP-NEXT:    sw a0, %lo(var+4)(t0)
 ; RV64IZCMP-NEXT:    ld a0, 40(sp) # 8-byte Folded Reload
 ; RV64IZCMP-NEXT:    sw a0, %lo(var)(t0)
-; RV64IZCMP-NEXT:    cm.pop {ra, s0-s11}, 160
-; RV64IZCMP-NEXT:    ret
+; RV64IZCMP-NEXT:    cm.popret {ra, s0-s11}, 160
 ;
 ; RV64IZCMP-WITH-FP-LABEL: callee:
 ; RV64IZCMP-WITH-FP:       # %bb.0:
@@ -1613,8 +1611,7 @@ define
```
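The core of the fix is the iteration helper that skips debug, CFI, and pseudo-probe instructions when walking the epilogue. A minimal Python model of that scan (instruction kinds as plain strings, purely illustrative):

```python
def next_no_debug_no_cfi(insts, i):
    """Return the index of the first instruction after position i that is
    not a debug, CFI, or pseudo-probe instruction (len(insts) if none)."""
    j = i + 1
    while j < len(insts) and insts[j] in ("debug", "cfi", "pseudo_probe"):
        j += 1
    return j

# With CFI restores between cm.pop and ret, the scan still pairs them up,
# which is what lets the pass fold them into cm.popret.
epilogue = ["cm.pop", "cfi", "cfi", "ret"]
pop = epilogue.index("cm.pop")
assert epilogue[next_no_debug_no_cfi(epilogue, pop)] == "ret"
```

Once the pair is found, the patch erases the intervening CFI instructions (they describe an epilogue state that no longer exists after folding) and emits cm.popret or cm.popretz in their place.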
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
      IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in an unsigned sense (add
+      // nuw).

nikic wrote: The more relevant wording here is:

> The successive addition of the current address, truncated to the pointer index type and interpreted as an unsigned number, and each offset, also interpreted as an unsigned number, does not wrap the pointer index type (add nuw).

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
      IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);

nikic wrote: This one looks incorrect. The add below is adding to the pointer (N), not just accumulating offsets, so you can't use nsw here. (The nuw below is correct.)

https://github.com/llvm/llvm-project/pull/110815
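nikic's objection can be checked with concrete numbers. Here is a hand-made 8-bit example (not taken from the patch) showing that an offset which is individually fine can still signed-wrap once the base address is added in:

```python
BITS = 8

def signed(v: int) -> int:
    """Interpret v as a two's-complement signed value of width BITS."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v >= (1 << (BITS - 1)) else v

base, offset = 0x70, 0x20   # base 112, offset 32: both positive and in range
total = signed(base + offset)

# The signed addition wraps: 112 + 32 = 144, which is -112 in 8-bit two's
# complement, so tagging the base+offset add as nsw would be unsound.
assert signed(base) > 0 and signed(offset) > 0
assert total < 0
```

That is why the final ADD of base and offset may legitimately carry nuw under the quoted LangRef wording, but not nsw.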
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/110815 >From 56474dac206d8592229cb56e1f12b543ec97 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 2 Oct 2024 11:20:23 +0400 Subject: [PATCH 1/2] DAG: Preserve more flags when expanding gep This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. --- .../SelectionDAG/SelectionDAGBuilder.cpp | 41 +++ .../CodeGen/AMDGPU/gep-flags-stack-offsets.ll | 6 +-- .../pointer-add-unknown-offset-debug-info.ll | 2 +- 3 files changed, 36 insertions(+), 13 deletions(-) diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 25213f587116d5..6838c0b530a363 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4386,6 +4386,17 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). 
+ if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( @@ -4393,27 +4404,41 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). + if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in an unsigned sense (add + // nuw). 
+ if (NW.hasNoUnsignedWrap()) +AddFlags.setNoUnsignedWrap(true); + + N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } } diff --git a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll index 782894976c711c..a39afa6f609c7e 100644 --- a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll +++ b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll @@ -118,8 +118,7 @@ define void @gep_inbounds_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ; GFX8-NEXT:s_waitcnt vmcnt(0) ; GFX8-NEXT:s_setpc_b64 s[30:31] ; @@ -145,8 +144,7 @@ define void @gep_nusw_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offe
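For readers skimming the diff above, the flag transfer can be modeled in isolation. This is a hedged sketch, not LLVM code: the struct names only loosely mirror `GEPNoWrapFlags` and `SDNodeFlags`, and the mapping (gep `nusw` implies `nsw` on the scale and offset-add nodes, gep `nuw` implies `nuw`) is the rule stated in the patch's own comments.

```cpp
#include <cassert>

// Loose stand-ins for llvm::GEPNoWrapFlags and llvm::SDNodeFlags, used only
// to model the flag transfer performed in visitGetElementPtr.
struct GepWrapFlags {
  bool NoUnsignedSignedWrap = false; // gep "nusw"
  bool NoUnsignedWrap = false;       // gep "nuw"
};

struct NodeFlags {
  bool NoSignedWrap = false;   // nsw on the resulting MUL/SHL/ADD node
  bool NoUnsignedWrap = false; // nuw on the resulting MUL/SHL/ADD node
};

// The patch applies the same derivation twice: once for the index-scaling
// node (mul/shl by the element size) and once for the offset addition.
NodeFlags deriveNodeFlags(GepWrapFlags NW) {
  NodeFlags F;
  if (NW.NoUnsignedSignedWrap)
    F.NoSignedWrap = true; // scaling/adding cannot wrap in a signed sense
  if (NW.NoUnsignedWrap)
    F.NoUnsignedWrap = true; // ...nor in an unsigned sense
  return F;
}
```

With both flags set on the gep, the scaled index and the running offset addition carry `nsw nuw`, which is what lets the AMDGPU backend fold the constant 16 into the `buffer_store_dword ... offset:16` form in the updated tests.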
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). + if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( ISD::VSCALE, dl, VScaleTy, DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). 
+ if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); arsenm wrote: Loses the test benefit though https://github.com/llvm/llvm-project/pull/110815 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Custom expand flat cmpxchg which may access private (PR #109410)
https://github.com/Pierre-vh approved this pull request. https://github.com/llvm/llvm-project/pull/109410
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
https://github.com/arsenm milestoned https://github.com/llvm/llvm-project/pull/110827
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/110827 This was using the default address space instead of the correct one. Fixes #56055 Keep old method around for ABI compatibility on the release branch. (cherry picked from commit 81ba95cefe1b5a12f0a7d8e6a383bcce9e95b785) >From 53bc67c4a690ffdf7445d3d52af03d434f9fd52b Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Mon, 30 Sep 2024 13:43:53 +0400 Subject: [PATCH] FastISel: Fix incorrectly using getPointerTy (#110465) This was using the default address space instead of the correct one. Fixes #56055 Keep old method around for ABI compatibility on the release branch. (cherry picked from commit 81ba95cefe1b5a12f0a7d8e6a383bcce9e95b785) --- llvm/include/llvm/CodeGen/FastISel.h | 7 +- llvm/lib/CodeGen/SelectionDAG/FastISel.cpp | 8 +-- llvm/lib/Target/X86/X86FastISel.cpp| 4 +- llvm/test/CodeGen/X86/issue56055.ll| 81 ++ 4 files changed, 94 insertions(+), 6 deletions(-) create mode 100644 llvm/test/CodeGen/X86/issue56055.ll diff --git a/llvm/include/llvm/CodeGen/FastISel.h b/llvm/include/llvm/CodeGen/FastISel.h index 3cbc35400181dd..f91bd692accad8 100644 --- a/llvm/include/llvm/CodeGen/FastISel.h +++ b/llvm/include/llvm/CodeGen/FastISel.h @@ -275,7 +275,12 @@ class FastISel { /// This is a wrapper around getRegForValue that also takes care of /// truncating or sign-extending the given getelementptr index value. - Register getRegForGEPIndex(const Value *Idx); + Register getRegForGEPIndex(MVT PtrVT, const Value *Idx); + + /// Retained for ABI compatibility in release branch. + Register getRegForGEPIndex(const Value *Idx) { +return getRegForGEPIndex(TLI.getPointerTy(DL), Idx); + } /// We're checking to see if we can fold \p LI into \p FoldInst. 
Note /// that we could have a sequence where multiple LLVM IR instructions are diff --git a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp index ef9f7833551905..246acc7f405837 100644 --- a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp @@ -380,14 +380,13 @@ void FastISel::updateValueMap(const Value *I, Register Reg, unsigned NumRegs) { } } -Register FastISel::getRegForGEPIndex(const Value *Idx) { +Register FastISel::getRegForGEPIndex(MVT PtrVT, const Value *Idx) { Register IdxN = getRegForValue(Idx); if (!IdxN) // Unhandled operand. Halt "fast" selection and bail. return Register(); // If the index is smaller or larger than intptr_t, truncate or extend it. - MVT PtrVT = TLI.getPointerTy(DL); EVT IdxVT = EVT::getEVT(Idx->getType(), /*HandleUnknown=*/false); if (IdxVT.bitsLT(PtrVT)) { IdxN = fastEmit_r(IdxVT.getSimpleVT(), PtrVT, ISD::SIGN_EXTEND, IdxN); @@ -543,7 +542,8 @@ bool FastISel::selectGetElementPtr(const User *I) { uint64_t TotalOffs = 0; // FIXME: What's a good SWAG number for MaxOffs? uint64_t MaxOffs = 2048; - MVT VT = TLI.getPointerTy(DL); + MVT VT = TLI.getValueType(DL, I->getType()).getSimpleVT(); + for (gep_type_iterator GTI = gep_type_begin(I), E = gep_type_end(I); GTI != E; ++GTI) { const Value *Idx = GTI.getOperand(); @@ -584,7 +584,7 @@ bool FastISel::selectGetElementPtr(const User *I) { // N = N + Idx * ElementSize; uint64_t ElementSize = GTI.getSequentialElementStride(DL); - Register IdxN = getRegForGEPIndex(Idx); + Register IdxN = getRegForGEPIndex(VT, Idx); if (!IdxN) // Unhandled operand. Halt "fast" selection and bail. 
return false; diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp index 2eae155956368f..5d594bd54fbfc4 100644 --- a/llvm/lib/Target/X86/X86FastISel.cpp +++ b/llvm/lib/Target/X86/X86FastISel.cpp @@ -902,6 +902,8 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { uint64_t Disp = (int32_t)AM.Disp; unsigned IndexReg = AM.IndexReg; unsigned Scale = AM.Scale; +MVT PtrVT = TLI.getValueType(DL, U->getType()).getSimpleVT(); + gep_type_iterator GTI = gep_type_begin(U); // Iterate through the indices, folding what we can. Constants can be // folded, and one dynamic index can be handled, if the scale is supported. @@ -937,7 +939,7 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { (S == 1 || S == 2 || S == 4 || S == 8)) { // Scaled-index addressing. Scale = S; - IndexReg = getRegForGEPIndex(Op); + IndexReg = getRegForGEPIndex(PtrVT, Op); if (IndexReg == 0) return false; break; diff --git a/llvm/test/CodeGen/X86/issue56055.ll b/llvm/test/CodeGen/X86/issue56055.ll new file mode 100644 index 00..27eaf13e3b00be --- /dev/null +++ b/llvm/test/CodeGen/X86/issue56055.ll @@ -0,0 +1,81 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS:
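The x86 test's datalayout declares `p270` as a 32-bit address space on an otherwise 64-bit target, which is exactly the case the old `TLI.getPointerTy(DL)` call got wrong: index arithmetic was performed at the default 64-bit pointer width instead of the pointer's own width. A hedged model of the difference follows — plain C++, no LLVM types; the mask stands in for reducing the computation to the pointer-index width:

```cpp
#include <cassert>
#include <cstdint>

// Compute base + idx * elemSize with all arithmetic reduced modulo
// 2^PtrBits, mimicking GEP address arithmetic in a PtrBits-wide
// address space.
uint64_t gepAddress(uint64_t Base, int64_t Idx, uint64_t ElemSize,
                    unsigned PtrBits) {
  uint64_t Mask = PtrBits >= 64 ? ~0ull : (1ull << PtrBits) - 1;
  return (Base + static_cast<uint64_t>(Idx) * ElemSize) & Mask;
}
```

An index whose significant bits extend past the pointer width wraps in the narrow address space but not at 64 bits, so doing the arithmetic at the wrong width yields a different address — the class of miscompile the cherry-pick fixes.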
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/110827
[llvm-branch-commits] [llvm] release/19.x: FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110490)
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/110490
[llvm-branch-commits] [llvm] release/19.x: FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110490)
arsenm wrote: New version in #110827 https://github.com/llvm/llvm-project/pull/110490
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: Matt Arsenault (arsenm) Changes This was using the default address space instead of the correct one. Fixes #56055 Keep old method around for ABI compatibility on the release branch. (cherry picked from commit 81ba95cefe1b5a12f0a7d8e6a383bcce9e95b785) --- Full diff: https://github.com/llvm/llvm-project/pull/110827.diff 4 Files Affected: - (modified) llvm/include/llvm/CodeGen/FastISel.h (+6-1) - (modified) llvm/lib/CodeGen/SelectionDAG/FastISel.cpp (+4-4) - (modified) llvm/lib/Target/X86/X86FastISel.cpp (+3-1) - (added) llvm/test/CodeGen/X86/issue56055.ll (+81) ``diff diff --git a/llvm/include/llvm/CodeGen/FastISel.h b/llvm/include/llvm/CodeGen/FastISel.h index 3cbc35400181dd..f91bd692accad8 100644 --- a/llvm/include/llvm/CodeGen/FastISel.h +++ b/llvm/include/llvm/CodeGen/FastISel.h @@ -275,7 +275,12 @@ class FastISel { /// This is a wrapper around getRegForValue that also takes care of /// truncating or sign-extending the given getelementptr index value. - Register getRegForGEPIndex(const Value *Idx); + Register getRegForGEPIndex(MVT PtrVT, const Value *Idx); + + /// Retained for ABI compatibility in release branch. + Register getRegForGEPIndex(const Value *Idx) { +return getRegForGEPIndex(TLI.getPointerTy(DL), Idx); + } /// We're checking to see if we can fold \p LI into \p FoldInst. Note /// that we could have a sequence where multiple LLVM IR instructions are diff --git a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp index ef9f7833551905..246acc7f405837 100644 --- a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp @@ -380,14 +380,13 @@ void FastISel::updateValueMap(const Value *I, Register Reg, unsigned NumRegs) { } } -Register FastISel::getRegForGEPIndex(const Value *Idx) { +Register FastISel::getRegForGEPIndex(MVT PtrVT, const Value *Idx) { Register IdxN = getRegForValue(Idx); if (!IdxN) // Unhandled operand. 
Halt "fast" selection and bail. return Register(); // If the index is smaller or larger than intptr_t, truncate or extend it. - MVT PtrVT = TLI.getPointerTy(DL); EVT IdxVT = EVT::getEVT(Idx->getType(), /*HandleUnknown=*/false); if (IdxVT.bitsLT(PtrVT)) { IdxN = fastEmit_r(IdxVT.getSimpleVT(), PtrVT, ISD::SIGN_EXTEND, IdxN); @@ -543,7 +542,8 @@ bool FastISel::selectGetElementPtr(const User *I) { uint64_t TotalOffs = 0; // FIXME: What's a good SWAG number for MaxOffs? uint64_t MaxOffs = 2048; - MVT VT = TLI.getPointerTy(DL); + MVT VT = TLI.getValueType(DL, I->getType()).getSimpleVT(); + for (gep_type_iterator GTI = gep_type_begin(I), E = gep_type_end(I); GTI != E; ++GTI) { const Value *Idx = GTI.getOperand(); @@ -584,7 +584,7 @@ bool FastISel::selectGetElementPtr(const User *I) { // N = N + Idx * ElementSize; uint64_t ElementSize = GTI.getSequentialElementStride(DL); - Register IdxN = getRegForGEPIndex(Idx); + Register IdxN = getRegForGEPIndex(VT, Idx); if (!IdxN) // Unhandled operand. Halt "fast" selection and bail. return false; diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp index 2eae155956368f..5d594bd54fbfc4 100644 --- a/llvm/lib/Target/X86/X86FastISel.cpp +++ b/llvm/lib/Target/X86/X86FastISel.cpp @@ -902,6 +902,8 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { uint64_t Disp = (int32_t)AM.Disp; unsigned IndexReg = AM.IndexReg; unsigned Scale = AM.Scale; +MVT PtrVT = TLI.getValueType(DL, U->getType()).getSimpleVT(); + gep_type_iterator GTI = gep_type_begin(U); // Iterate through the indices, folding what we can. Constants can be // folded, and one dynamic index can be handled, if the scale is supported. @@ -937,7 +939,7 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { (S == 1 || S == 2 || S == 4 || S == 8)) { // Scaled-index addressing. 
Scale = S; - IndexReg = getRegForGEPIndex(Op); + IndexReg = getRegForGEPIndex(PtrVT, Op); if (IndexReg == 0) return false; break; diff --git a/llvm/test/CodeGen/X86/issue56055.ll b/llvm/test/CodeGen/X86/issue56055.ll new file mode 100644 index 00..27eaf13e3b00be --- /dev/null +++ b/llvm/test/CodeGen/X86/issue56055.ll @@ -0,0 +1,81 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -fast-isel < %s | FileCheck -check-prefixes=CHECK,FASTISEL %s +; RUN: llc < %s | FileCheck -check-prefixes=CHECK,SDAG %s + +target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-windows-msvc" + +define void @issue56055(ptr addrspace(270) %ptr, ptr %out) { +; CHECK-LABEL: issue56055: +; CHECK: # %bb.0: +; CH
[llvm-branch-commits] [llvm] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter (PR #110705)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/110705 >From 089cc13bbd2cac76a2d3fc0b2f72b0bccda5b188 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 23 Sep 2024 19:51:55 +0300 Subject: [PATCH] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter Move the emission of the checks performed on the authenticated LR value during tail calls to AArch64AsmPrinter class, so that different checker sequences can be reused by pseudo instructions expanded there. This adds one more option to AuthCheckMethod enumeration, the generic XPAC variant which is not restricted to checking the LR register. --- llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp | 143 +++--- llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | 13 ++ llvm/lib/Target/AArch64/AArch64InstrInfo.td | 2 + .../lib/Target/AArch64/AArch64PointerAuth.cpp | 182 +- llvm/lib/Target/AArch64/AArch64PointerAuth.h | 40 ++-- llvm/lib/Target/AArch64/AArch64Subtarget.cpp | 2 - llvm/lib/Target/AArch64/AArch64Subtarget.h| 23 --- llvm/test/CodeGen/AArch64/ptrauth-ret-trap.ll | 36 ++-- .../AArch64/sign-return-address-tailcall.ll | 54 +++--- 9 files changed, 192 insertions(+), 303 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp index 6d2dd0ecbccf31..50502477706ccf 100644 --- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp +++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp @@ -153,6 +153,7 @@ class AArch64AsmPrinter : public AsmPrinter { void emitPtrauthCheckAuthenticatedValue(Register TestedReg, Register ScratchReg, AArch64PACKey::ID Key, + AArch64PAuth::AuthCheckMethod Method, bool ShouldTrap, const MCSymbol *OnFailure); @@ -1731,7 +1732,8 @@ unsigned AArch64AsmPrinter::emitPtrauthDiscriminator(uint16_t Disc, /// of proceeding to the next instruction (only if ShouldTrap is false). 
void AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue( Register TestedReg, Register ScratchReg, AArch64PACKey::ID Key, -bool ShouldTrap, const MCSymbol *OnFailure) { +AArch64PAuth::AuthCheckMethod Method, bool ShouldTrap, +const MCSymbol *OnFailure) { // Insert a sequence to check if authentication of TestedReg succeeded, // such as: // @@ -1757,38 +1759,70 @@ void AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue( //Lsuccess: // ... // - // This sequence is expensive, but we need more information to be able to - // do better. - // - // We can't TBZ the poison bit because EnhancedPAC2 XORs the PAC bits - // on failure. - // We can't TST the PAC bits because we don't always know how the address - // space is setup for the target environment (and the bottom PAC bit is - // based on that). - // Either way, we also don't always know whether TBI is enabled or not for - // the specific target environment. + // See the documentation on AuthCheckMethod enumeration constants for + // the specific code sequences that can be used to perform the check. 
+ using AArch64PAuth::AuthCheckMethod; - unsigned XPACOpc = getXPACOpcodeForKey(Key); + if (Method == AuthCheckMethod::None) +return; + if (Method == AuthCheckMethod::DummyLoad) { +EmitToStreamer(MCInstBuilder(AArch64::LDRWui) + .addReg(getWRegFromXReg(ScratchReg)) + .addReg(TestedReg) + .addImm(0)); +assert(ShouldTrap && !OnFailure && "DummyLoad always traps on error"); +return; + } MCSymbol *SuccessSym = createTempSymbol("auth_success_"); + if (Method == AuthCheckMethod::XPAC || Method == AuthCheckMethod::XPACHint) { +// mov Xscratch, Xtested +emitMovXReg(ScratchReg, TestedReg); - // mov Xscratch, Xtested - emitMovXReg(ScratchReg, TestedReg); - - // xpac(i|d) Xscratch - EmitToStreamer(MCInstBuilder(XPACOpc).addReg(ScratchReg).addReg(ScratchReg)); +if (Method == AuthCheckMethod::XPAC) { + // xpac(i|d) Xscratch + unsigned XPACOpc = getXPACOpcodeForKey(Key); + EmitToStreamer( + MCInstBuilder(XPACOpc).addReg(ScratchReg).addReg(ScratchReg)); +} else { + // xpaclri + + // Note that this method applies XPAC to TestedReg instead of ScratchReg. + assert(TestedReg == AArch64::LR && + "XPACHint mode is only compatible with checking the LR register"); + assert((Key == AArch64PACKey::IA || Key == AArch64PACKey::IB) && + "XPACHint mode is only compatible with I-keys"); + EmitToStreamer(MCInstBuilder(AArch64::XPACLRI)); +} - // cmp Xtested, Xscratch - EmitToStreamer(MCInstBuilder(AArch64::SUBSXrs) - .addReg(AArch64::XZR) - .addReg(TestedReg) - .addR
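The XPAC-based sequences being moved here can be summarized with a small model: strip the authentication code from a copy of the tested value and compare it against the original — if an `aut*` failure left an error pattern in the pointer's upper bits, the comparison fails. The mask below is an assumption purely for illustration; the real PAC bit positions depend on the VA size and TBI configuration, which is precisely why the patch comment says the bits cannot simply be tested with `TBZ`/`TST`:

```cpp
#include <cassert>
#include <cstdint>

// Assumed PAC bit mask (bits 48..54), illustration only; the actual mask
// depends on the target's virtual-address size and TBI settings.
constexpr uint64_t kPacMask = 0x007f000000000000ull;

// Model of xpac(i|d)/xpaclri: clear the PAC bits of the pointer.
uint64_t xpac(uint64_t Ptr) { return Ptr & ~kPacMask; }

// Model of the check sequence:
//   mov  Xscratch, Xtested
//   xpac Xscratch
//   cmp  Xtested, Xscratch
//   b.eq Lsuccess
bool lrCheckPasses(uint64_t TestedLR) { return xpac(TestedLR) == TestedLR; }
```

A successfully authenticated pointer has no PAC residue, so stripping is a no-op and the comparison succeeds; a corrupted pointer differs from its stripped copy and falls through to the `brk`/trap path.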
[llvm-branch-commits] [llvm] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter (PR #110705)
https://github.com/atrosinenko edited https://github.com/llvm/llvm-project/pull/110705
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
llvmbot wrote: @llvm/pr-subscribers-debuginfo Author: None (dlav-sc) Changes This patch adds CFI instructions in a function epilogue, that allows lldb to obtain a valid backtrace at the end of functions. --- Patch is 1004.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110810.diff 296 Files Affected: - (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+72) - (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.h (+2) - (modified) llvm/test/CodeGen/RISCV/GlobalISel/stacksave-stackrestore.ll (+10) - (modified) llvm/test/CodeGen/RISCV/GlobalISel/vararg.ll (+36) - (modified) llvm/test/CodeGen/RISCV/addrspacecast.ll (+4) - (modified) llvm/test/CodeGen/RISCV/atomicrmw-cond-sub-clamp.ll (+85) - (modified) llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll (+85) - (modified) llvm/test/CodeGen/RISCV/branch-relaxation.ll (+144) - (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+88-8) - (modified) llvm/test/CodeGen/RISCV/calling-conv-ilp32e.ll (+271) - (modified) llvm/test/CodeGen/RISCV/cm_mvas_mvsa.ll (+8-4) - (modified) llvm/test/CodeGen/RISCV/double-intrinsics.ll (+8) - (modified) llvm/test/CodeGen/RISCV/double-round-conv.ll (+50) - (modified) llvm/test/CodeGen/RISCV/early-clobber-tied-def-subreg-liveness.ll (+2) - (modified) llvm/test/CodeGen/RISCV/eh-dwarf-cfa.ll (+4) - (modified) llvm/test/CodeGen/RISCV/exception-pointer-register.ll (+8) - (modified) llvm/test/CodeGen/RISCV/float-intrinsics.ll (+8) - (modified) llvm/test/CodeGen/RISCV/float-round-conv.ll (+40) - (modified) llvm/test/CodeGen/RISCV/fpclamptosat.ll (+176) - (modified) llvm/test/CodeGen/RISCV/frame-info.ll (+32) - (modified) llvm/test/CodeGen/RISCV/half-convert-strict.ll (+10) - (modified) llvm/test/CodeGen/RISCV/half-intrinsics.ll (+20) - (modified) llvm/test/CodeGen/RISCV/half-round-conv.ll (+80) - (modified) llvm/test/CodeGen/RISCV/hwasan-check-memaccess.ll (+2) - (modified) llvm/test/CodeGen/RISCV/intrinsic-cttz-elts-vscale.ll (+3) - 
(modified) llvm/test/CodeGen/RISCV/kcfi-mir.ll (+2) - (modified) llvm/test/CodeGen/RISCV/large-stack.ll (+15) - (modified) llvm/test/CodeGen/RISCV/live-sp.mir (+2-1) - (modified) llvm/test/CodeGen/RISCV/llvm.exp10.ll (+120) - (modified) llvm/test/CodeGen/RISCV/local-stack-slot-allocation.ll (+14) - (modified) llvm/test/CodeGen/RISCV/lpad.ll (+8) - (modified) llvm/test/CodeGen/RISCV/miss-sp-restore-eh.ll (+5) - (modified) llvm/test/CodeGen/RISCV/nontemporal.ll (+60) - (modified) llvm/test/CodeGen/RISCV/overflow-intrinsics.ll (+23) - (modified) llvm/test/CodeGen/RISCV/pr58025.ll (+1) - (modified) llvm/test/CodeGen/RISCV/pr58286.ll (+4) - (modified) llvm/test/CodeGen/RISCV/pr63365.ll (+1) - (modified) llvm/test/CodeGen/RISCV/pr69586.ll (+29) - (modified) llvm/test/CodeGen/RISCV/pr88365.ll (+3) - (modified) llvm/test/CodeGen/RISCV/prolog-epilogue.ll (+80) - (modified) llvm/test/CodeGen/RISCV/push-pop-opt-crash.ll (+27-25) - (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+344-44) - (modified) llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rv64-patchpoint.ll (+3) - (modified) llvm/test/CodeGen/RISCV/rv64-statepoint-call-lowering.ll (+21) - (modified) llvm/test/CodeGen/RISCV/rvv-cfi-info.ll (+18) - (modified) llvm/test/CodeGen/RISCV/rvv/abs-vp.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/access-fixed-objects-by-rvv.ll (+3) - (modified) llvm/test/CodeGen/RISCV/rvv/addi-scalable-offset.mir (+4) - (modified) llvm/test/CodeGen/RISCV/rvv/alloca-load-store-scalable-array.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/alloca-load-store-scalable-struct.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/alloca-load-store-vector-tuple.ll (+6) - (modified) llvm/test/CodeGen/RISCV/rvv/binop-splats.ll (+5) - (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-sdnode.ll (+5) - (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll (+20) - (modified) llvm/test/CodeGen/RISCV/rvv/bswap-sdnode.ll (+5) - (modified) 
llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll (+21) - (modified) llvm/test/CodeGen/RISCV/rvv/callee-saved-regs.ll (+4) - (modified) llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll (+40) - (modified) llvm/test/CodeGen/RISCV/rvv/calling-conv.ll (+16) - (modified) llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll (+10) - (modified) llvm/test/CodeGen/RISCV/rvv/compressstore.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/ctpop-vp.ll (+6) - (modified) llvm/test/CodeGen/RISCV/rvv/cttz-vp.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/emergency-slot.mir (+14) - (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-fp.ll (+16) - (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-int-rv32.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-int-rv64.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vector-i8-index-cornercase.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-binop-splats.ll (+4) - (modified)
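The motivation stated in the patch description — a valid backtrace *inside* the epilogue — comes down to keeping the CFA rule accurate once `sp` starts moving. A hedged toy model (plain C++, no DWARF machinery): an unwinder computes CFA = sp + offset, so the epilogue's stack deallocation must be paired with a `.cfi_def_cfa_offset 0`, or the recovered CFA is off by the frame size at any point after the `addi sp, sp, framesize`:

```cpp
#include <cassert>
#include <cstdint>

// Toy unwinder state: the CFA is defined as sp + CfaOffset.
struct UnwindState {
  uint64_t Sp;
  uint64_t CfaOffset;
  uint64_t cfa() const { return Sp + CfaOffset; }
};

// Epilogue without epilogue CFI: sp moves but the CFA rule goes stale.
UnwindState epilogueNoCfi(UnwindState S, uint64_t FrameSize) {
  S.Sp += FrameSize; // addi sp, sp, FrameSize
  return S;
}

// Epilogue with the CFI this patch emits: the deallocation is paired with
// ".cfi_def_cfa_offset 0", keeping the unwinder's CFA rule in sync.
UnwindState epilogueWithCfi(UnwindState S, uint64_t FrameSize) {
  S.Sp += FrameSize; // addi sp, sp, FrameSize
  S.CfaOffset = 0;   // .cfi_def_cfa_offset 0
  return S;
}
```

The same reasoning applies to the callee-saved register restores (`.cfi_restore`): once a register has been reloaded from its slot, the unwinder must be told to use the register rather than the now-stale stack slot.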
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command:

```bash
git-clang-format --diff ac123f934ecb8cc840f6ad33739a03c64ac351ca e6f1c940894489859e75b944978d42fcdffdec8e --extensions cpp,h -- llvm/lib/Target/RISCV/RISCVFrameLowering.cpp llvm/lib/Target/RISCV/RISCVFrameLowering.h llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
```

View the diff from clang-format here.

```diff
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index 8254133d0e..66a693277a 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -896,7 +896,8 @@ void RISCVFrameLowering::emitEpilogue(MachineFunction &MF,
   // therefor it is unecessary to place any CFI instructions after it. Just
   // deallocate stack if needed and return.
   if (StackSize != 0)
-    deallocateStack(MF, MBB, MBBI, DL, StackSize, RVFI->getLibCallStackSize());
+    deallocateStack(MF, MBB, MBBI, DL, StackSize,
+                    RVFI->getLibCallStackSize());
 
   // Emit epilogue for shadow call stack.
   emitSCSEpilogue(MF, MBB, MBBI, DL);
```

https://github.com/llvm/llvm-project/pull/110810 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
@@ -62,6 +69,8 @@ define i32 @callee_float_in_regs(i32 %a, float %b) {
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    mv a0, a1
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    call __fixsfsi
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    add a0, s0, a0
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore ra
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore s0
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    tail __riscv_restore_1

dlav-sc wrote: Addressed. Yeah, I didn't take libcalls into account at all :) Anyway, I made a fix (5cf8f319fbc5304c21e05d5ec3d2aff1713bd071) and updated the tests (e6f1c940894489859e75b944978d42fcdffdec8e). Could you take a look, please? However, the tests don't look the way you expected, because it is in fact unnecessary to emit CFI instructions after `tail __riscv_restore_1`, which is considered a terminator. Therefore, I just removed the `.cfi_restore` directives and placed the `.cfi_def_cfa_offset` instructions where they should be. https://github.com/llvm/llvm-project/pull/110810 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
keith wrote: Is the process for this that you will merge it at some point? Mainly wondering if I need to keep following it to make sure it makes it for the next one https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] bdbe9f3 - Revert "[MLIR][XeGPU] Add sg_map attribute to support Work Item level semanti…"
Author: Chao Chen Date: 2024-10-02T10:44:00-05:00 New Revision: bdbe9f3fcc6e201eb2ec0340a80e6fd0fea8265a URL: https://github.com/llvm/llvm-project/commit/bdbe9f3fcc6e201eb2ec0340a80e6fd0fea8265a DIFF: https://github.com/llvm/llvm-project/commit/bdbe9f3fcc6e201eb2ec0340a80e6fd0fea8265a.diff LOG: Revert "[MLIR][XeGPU] Add sg_map attribute to support Work Item level semantiā¦" This reverts commit 3ca5d8082a0c6bd9520544ce3bca11bf3e02a5fa. Added: Modified: mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td mlir/lib/Dialect/XeGPU/IR/XeGPUDialect.cpp mlir/test/Dialect/XeGPU/XeGPUOps.mlir Removed: diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td index 2aaa7fd4221ab1..26eec0d4f2082a 100644 --- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td +++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td @@ -142,36 +142,4 @@ def XeGPU_FenceScopeAttr: let assemblyFormat = "$value"; } -def XeGPU_SGMapAttr : XeGPUAttr<"SGMap", "sg_map"> { - let summary = [{ -Describes the mapping between work item (WI) and the 2D tensor specified by the tensor descriptor. - }]; - let description = [{ -To distribute the XeGPU operation to work items, the tensor_desc must be specified with the sg_map -attribute at the tensor description creation time. -Within the `sg_map`, `wi_layout` specifies the layout of work items, -describing the mapping of work items to the tensor. -wi_layout[0] x wi_layout[1] must be equal to the total number of work items within a subgroup. -`wi_data` specifies the minimum number of data elements assigned to each work item for a single distribution. - -E.g., #xegpu.sg_map -In this example, the subgroup has 16 work items in wi_layout=[1, 16], -each accessing 1 element as specified by wi_data=[1, 1]. 
- -`wi_data[0] * wi_data[1]` can be greater than 1, meaning that each work item operates on multiple elements, -which is eventually lowered to "SIMT-flavor" vector, like SPIR-V vector or llvm vector, or packed to a storage data type. -The multiple elements indicated by `wi_data` can only be from one dimension and must be contiguous in the memory along either dimension. - }]; - let parameters = (ins -ArrayRefParameter<"uint32_t">:$wi_layout, -ArrayRefParameter<"uint32_t">:$wi_data); - - let builders = [ -AttrBuilder<(ins)> - ]; - - let hasCustomAssemblyFormat = 1; - let genVerifyDecl = 1; -} - #endif // MLIR_DIALECT_XEGPU_IR_XEGPUATTRS_TD diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td index 9f1b17721f2d56..0ce1211664b5ba 100644 --- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td +++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td @@ -63,7 +63,7 @@ def XeGPU_TensorDesc: XeGPUTypeDef<"TensorDesc", "tensor_desc", element-type ::= float-type | integer-type | index-type dim-list := (static-dim-list `x`)? static-dim-list ::= decimal-literal `x` decimal-literal -attr-list = (, memory_space = value)? (, arr_len = value)? (, boundary_check = value)? (, scattered = value)? (, sg_map `<` wi_layout = value, wi_data = value `>`)? +attr-list = (, memory_space = value)? (, arr_len = value)? (, boundary_check = value)? (, scattered = value)? ``` Examples: @@ -77,16 +77,12 @@ def XeGPU_TensorDesc: XeGPUTypeDef<"TensorDesc", "tensor_desc", // A TensorDesc with 8x16 f32 elements for a memory region in shared memory space. 
xegpu.tensor_desc<8x16xf32, #xegpu.tdesc_attr> - -// A TensorDesc with a sg_map -xegpu.tensor_desc<8x16xf32, #xegpu.sg_map> ``` }]; let parameters = (ins ArrayRefParameter<"int64_t">: $shape, "mlir::Type": $elementType, -OptionalParameter<"mlir::Attribute">: $encoding, -OptionalParameter<"mlir::Attribute">: $sg_map); +OptionalParameter<"mlir::Attribute">: $encoding); let builders = [ TypeBuilderWithInferredContext<(ins @@ -94,14 +90,12 @@ def XeGPU_TensorDesc: XeGPUTypeDef<"TensorDesc", "tensor_desc", "mlir::Type": $elementType, CArg<"int", "1">: $array_length, CArg<"bool", "true">: $boundary_check, - CArg<"xegpu::MemorySpace", "xegpu::MemorySpace::Global">:$memory_space, - CArg<"mlir::Attribute", "mlir::Attribute()">:$sg_map)>, + CArg<"xegpu::MemorySpace", "xegpu::MemorySpace::Global">:$memory_space)>, TypeBuilderWithInferredContext<(ins "llvm::ArrayRef": $shape, "mlir::Type": $elementType, CArg<"int", "1">: $chunk_size, - CArg<"xegpu::MemorySpace", "xegpu::MemorySpace::Global">:$memory_space, - CArg<"mlir::Attribute", "mlir::Attribute()">:$sg_map)> + CArg<"xegpu::MemorySpace", "x
[llvm-branch-commits] [libcxx] release/19.x: [libc++] Fix name of is_always_lock_free test which was never being run (#106077) (PR #110838)
llvmbot wrote: @llvm/pr-subscribers-libcxx Author: None (llvmbot) Changes Backport b45661953e6974782b0ccada6f0784db04bc693f Requested by: @ldionne --- Full diff: https://github.com/llvm/llvm-project/pull/110838.diff 1 Files Affected: - (renamed) libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp (+10-3)

```diff
diff --git a/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp b/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp
similarity index 94%
rename from libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp
rename to libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp
index 2dc7f5c7654193..723e7b36f50319 100644
--- a/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp
+++ b/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp
@@ -5,8 +5,9 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===--===//
-//
+
 // UNSUPPORTED: c++03, c++11, c++14
+// XFAIL: LIBCXX-PICOLIBC-FIXME
 //
 //
@@ -15,6 +16,10 @@
 //
 // static constexpr bool is_always_lock_free;
 
+// Ignore diagnostic about vector types changing the ABI on some targets, since
+// that is irrelevant for this test.
+// ADDITIONAL_COMPILE_FLAGS: -Wno-psabi
+
 #include
 #include
 #include
@@ -26,7 +31,8 @@
 template
 void check_always_lock_free(std::atomic const& a) {
   using InfoT = LockFreeStatusInfo;
-  constexpr std::same_as decltype(auto) is_always_lock_free = std::atomic::is_always_lock_free;
+  constexpr auto is_always_lock_free = std::atomic::is_always_lock_free;
+  ASSERT_SAME_TYPE(decltype(is_always_lock_free), bool const);
 
   // If we know the status of T for sure, validate the exact result of the function.
   if constexpr (InfoT::status_known) {
@@ -44,7 +50,8 @@
   // In all cases, also sanity-check it based on the implication always-lock-free => lock-free.
   if (is_always_lock_free) {
-    std::same_as decltype(auto) is_lock_free = a.is_lock_free();
+    auto is_lock_free = a.is_lock_free();
+    ASSERT_SAME_TYPE(decltype(is_always_lock_free), bool const);
     assert(is_lock_free);
   }
   ASSERT_NOEXCEPT(a.is_lock_free());
```

https://github.com/llvm/llvm-project/pull/110838 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] release/19.x: [libc++] Fix name of is_always_lock_free test which was never being run (#106077) (PR #110838)
https://github.com/ldionne approved this pull request. https://github.com/llvm/llvm-project/pull/110838 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] release/19.x: [libc++] Fix name of is_always_lock_free test which was never being run (#106077) (PR #110838)
ldionne wrote: Impact: basically none, this adds test coverage we should always have had. https://github.com/llvm/llvm-project/pull/110838 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
@@ -62,6 +69,8 @@ define i32 @callee_float_in_regs(i32 %a, float %b) {
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    mv a0, a1
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    call __fixsfsi
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    add a0, s0, a0
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore ra
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore s0
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    tail __riscv_restore_1

kito-cheng wrote: Just putting a link here to the comment I left on your previous PR, so I don't forget it: https://github.com/llvm/llvm-project/pull/110234#pullrequestreview-2342046243 https://github.com/llvm/llvm-project/pull/110810 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,54 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
       IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());
 
+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);

goldsteinn wrote: `ScaleFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap())`; Likewise above. https://github.com/llvm/llvm-project/pull/110815 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
keith wrote: Thanks! https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
tstellar wrote: @keith I added this to the release milestone, so the release manager (@tru) will merge this before the next release. https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
https://github.com/tstellar milestoned https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits