[llvm-branch-commits] [llvm] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter (PR #110705)
atrosinenko wrote: Updated the commit message: removed the paragraph about dropping one `mov` instruction from the non-trapping variant of the check. My initial idea was to make the non-hint `xpac(i|d)` operate on the tested register itself (just like `xpaclri` does), but that was dropped from the final version of the patch to avoid unnecessary changes to the tests.

https://github.com/llvm/llvm-project/pull/110705

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Custom expand flat cmpxchg which may access private (PR #109410)
@@ -43,7 +43,7 @@ define i64 @test_flat_atomicrmw_sub_0_i64_agent(ptr %ptr) {
 ; ALL: [[ATOMICRMW_PRIVATE]]:
 ; ALL-NEXT:    [[TMP1:%.*]] = addrspacecast ptr [[PTR]] to ptr addrspace(5)
 ; ALL-NEXT:    [[LOADED_PRIVATE:%.*]] = load i64, ptr addrspace(5) [[TMP1]], align 8
-; ALL-NEXT:    [[NEW:%.*]] = sub i64 [[LOADED_PRIVATE]], 0
+; ALL-NEXT:    [[NEW:%.*]] = add i64 [[LOADED_PRIVATE]], 0

Pierre-vh wrote: Why does this transform happen more often now?

https://github.com/llvm/llvm-project/pull/109410
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
dlav-sc wrote: @topperc @kito-cheng FYI

https://github.com/llvm/llvm-project/pull/110810
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/110815 This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. >From 56474dac206d8592229cb56e1f12b543ec97 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 2 Oct 2024 11:20:23 +0400 Subject: [PATCH] DAG: Preserve more flags when expanding gep This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. --- .../SelectionDAG/SelectionDAGBuilder.cpp | 41 +++ .../CodeGen/AMDGPU/gep-flags-stack-offsets.ll | 6 +-- .../pointer-add-unknown-offset-debug-info.ll | 2 +- 3 files changed, 36 insertions(+), 13 deletions(-) diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 25213f587116d5..6838c0b530a363 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4386,6 +4386,17 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). 
+ if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( @@ -4393,27 +4404,41 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). + if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in an unsigned sense (add + // nuw). 
+ if (NW.hasNoUnsignedWrap()) +AddFlags.setNoUnsignedWrap(true); + + N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } } diff --git a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll index 782894976c711c..a39afa6f609c7e 100644 --- a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll +++ b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll @@ -118,8 +118,7 @@ define void @gep_inbounds_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ; GFX8-NEXT:s_waitcnt vmcnt(0) ; GFX8-NEXT:s_setpc_b64 s[30:31] ; @@ -145,8 +144,7 @@ define void @gep_nusw_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, v
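The index-scaling step that both hunks annotate can be sketched outside of LLVM. The following is a hypothetical Python model (names and structure are mine, not LLVM's API) of how `visitGetElementPtr` turns `index * element_size` into a shift when the element size is a power of two:

```python
def scale_index(idx: int, elem_size: int, bits: int = 32) -> int:
    """Model of the index scaling in visitGetElementPtr: multiply the
    index by the element size, using a shift for power-of-two sizes."""
    mask = (1 << bits) - 1
    if elem_size & (elem_size - 1) == 0:  # power of two: use a shift (shl)
        amt = elem_size.bit_length() - 1  # corresponds to logBase2()
        return (idx << amt) & mask
    return (idx * elem_size) & mask       # general case: a multiply (mul)

# A `gep i32, ptr %p, i32 %idx` scales %idx by 4, i.e. shl by 2.
assert scale_index(5, 4) == 20
assert scale_index(3, 12) == 36
```

The patch's point is that when the GEP carries `nusw`/`nuw`, this shift or multiply is known not to wrap the pointer index type, so the same flags can be attached to the resulting SHL/MUL/ADD nodes.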
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. Learn more at https://graphite.dev/docs/merge-pull-requests

* **#110815** (this PR)
* **#110814**
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking at https://stacking.dev/

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] [RISCV] fix RISCVPushPopOptimizer pass (PR #110813)
dlav-sc wrote: @topperc @michaelmaitland FYI

https://github.com/llvm/llvm-project/pull/110813
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: Matt Arsenault (arsenm) Changes This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. --- Full diff: https://github.com/llvm/llvm-project/pull/110815.diff 3 Files Affected: - (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+33-8) - (modified) llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll (+2-4) - (modified) llvm/test/DebugInfo/Sparc/pointer-add-unknown-offset-debug-info.ll (+1-1) ``diff diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 25213f587116d5..6838c0b530a363 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4386,6 +4386,17 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). + if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( @@ -4393,27 +4404,41 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. 
if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). + if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in an unsigned sense (add + // nuw). 
+ if (NW.hasNoUnsignedWrap()) +AddFlags.setNoUnsignedWrap(true); + + N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } } diff --git a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll index 782894976c711c..a39afa6f609c7e 100644 --- a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll +++ b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll @@ -118,8 +118,7 @@ define void @gep_inbounds_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ; GFX8-NEXT:s_waitcnt vmcnt(0) ; GFX8-NEXT:s_setpc_b64 s[30:31] ; @@ -145,8 +144,7 @@ define void @gep_nusw_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ;
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
https://github.com/arsenm ready_for_review

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
       IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);

nikic wrote: Adjust tests to have explicit nuw flag rather than only inbounds?

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
       IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);

arsenm wrote: That's already tested here: https://github.com/llvm/llvm-project/blob/56474dac206d8592229cb56e1f12b543ec97/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll#L134

But it's still not enough. computeKnownBits still can't prove the sign bit is zero during selection with all flags on both GEPs:

```
define void @gep_all_flags(i32 %idx, i32 %val) {
  %alloca = alloca [32 x i32], align 4, addrspace(5)
  %gep0 = getelementptr inbounds nuw [32 x i32], ptr addrspace(5) %alloca, i32 0, i32 %idx
  %gep1 = getelementptr inbounds nuw i8, ptr addrspace(5) %gep0, i32 16
  store volatile i32 %val, ptr addrspace(5) %gep1, align 4
  ret void
}
```

```
Optimized legalized selection DAG: %bb.0 'gep_all_flags:'
SelectionDAG has 14 nodes:
  t0: ch,glue = EntryToken
  t4: i32,ch = CopyFromReg # D:1 t0, Register:i32 %8
  t2: i32,ch = CopyFromReg # D:1 t0, Register:i32 %7
  t7: i32 = shl nuw nsw # D:1 t2, Constant:i32<2>
  t8: i32 = add nuw # D:1 FrameIndex:i32<0>, t7
  t10: i32 = add nuw # D:1 t8, Constant:i32<16>
  t13: ch = store<(volatile store (s32) into %ir.gep1, addrspace 5)> # D:1 t0, t4, t10, undef:i32
  t14: ch = RET_GLUE t13
```

https://github.com/llvm/llvm-project/pull/110815
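A deliberately simplified sketch of why the flags alone do not settle the question (this is a toy model of a known-bits rule, not LLVM's actual `computeKnownBits`): with no information about the frame-index operand, the add contributes no known leading zero bits, so the sign bit stays unknown.

```python
def add_known_leading_zeros(lz_x: int, lz_y: int) -> int:
    """Toy known-bits rule for an add: if both operands have at least n
    known leading zero bits, the sum has at least n - 1 (a carry may
    propagate one position further)."""
    return max(min(lz_x, lz_y) - 1, 0)

# FrameIndex:i32<0> has no known bits (0 known leading zeros), so
# `add nuw FrameIndex, t7` proves nothing about the result's sign bit.
assert add_known_leading_zeros(0, 27) == 0
# Even two well-bounded operands lose one bit to the possible carry.
assert add_known_leading_zeros(5, 5) == 4
```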
[llvm-branch-commits] [llvm] AMDGPU: Custom expand flat cmpxchg which may access private (PR #109410)
@@ -43,7 +43,7 @@ define i64 @test_flat_atomicrmw_sub_0_i64_agent(ptr %ptr) {
 ; ALL: [[ATOMICRMW_PRIVATE]]:
 ; ALL-NEXT:    [[TMP1:%.*]] = addrspacecast ptr [[PTR]] to ptr addrspace(5)
 ; ALL-NEXT:    [[LOADED_PRIVATE:%.*]] = load i64, ptr addrspace(5) [[TMP1]], align 8
-; ALL-NEXT:    [[NEW:%.*]] = sub i64 [[LOADED_PRIVATE]], 0
+; ALL-NEXT:    [[NEW:%.*]] = add i64 [[LOADED_PRIVATE]], 0

arsenm wrote: Because it would require more work to avoid doing it, but there's not much reason to. All of the 64-bit cases now go to expand. emitExpandAtomicRMW isn't bothering to restrict this to the specific cases where it's needed.

https://github.com/llvm/llvm-project/pull/109410
[llvm-branch-commits] [llvm] [RISCV] fix RISCVPushPopOptimizer pass (PR #110813)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v

Author: None (dlav-sc)

Changes

The RISCVPushPopOptimizer pass did not account for CFI instructions being inserted between cm.pop and ret, so it could not apply the optimization in those cases. This patch fixes that and allows the pass to remove the CFI instructions and combine cm.pop and ret into a single instruction: cm.popret or cm.popretz.

---
Patch is 33.43 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110813.diff

7 Files Affected:
- (modified) llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp (+22-4)
- (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+8-24)
- (modified) llvm/test/CodeGen/RISCV/cm_mvas_mvsa.ll (+4-8)
- (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+44-180)
- (modified) llvm/test/CodeGen/RISCV/zcmp-additional-stack.ll (+1-6)
- (modified) llvm/test/CodeGen/RISCV/zcmp-cm-popretz.mir (+4-22)
- (modified) llvm/test/CodeGen/RISCV/zcmp-with-float.ll (+2-10)

```diff
diff --git a/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp b/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
index 098e5bb5328bb3..2b1d6c25891a1f 100644
--- a/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
+++ b/llvm/lib/Target/RISCV/RISCVPushPopOptimizer.cpp
@@ -45,10 +45,25 @@ char RISCVPushPopOpt::ID = 0;
 INITIALIZE_PASS(RISCVPushPopOpt, "riscv-push-pop-opt", RISCV_PUSH_POP_OPT_NAME,
                 false, false)
 
+template <typename IterT>
+static IterT nextNoDebugNoCFIInst(const IterT It, const IterT End) {
+  return std::find_if_not(std::next(It), End, [](const auto &Inst) {
+    return Inst.isDebugInstr() || Inst.isCFIInstruction() ||
+           Inst.isPseudoProbe();
+  });
+}
+
+static void eraseCFIInst(const MachineBasicBlock::iterator Begin,
+                         const MachineBasicBlock::iterator End) {
+  for (auto &Inst : llvm::make_early_inc_range(llvm::make_range(Begin, End)))
+    if (Inst.isCFIInstruction())
+      Inst.eraseFromParent();
+}
+
 // Check if POP instruction was inserted into the MBB and return iterator to it.
 static MachineBasicBlock::iterator containsPop(MachineBasicBlock &MBB) {
   for (MachineBasicBlock::iterator MBBI = MBB.begin(); MBBI != MBB.end();
-       MBBI = next_nodbg(MBBI, MBB.end()))
+       MBBI = nextNoDebugNoCFIInst(MBBI, MBB.end()))
     if (MBBI->getOpcode() == RISCV::CM_POP)
       return MBBI;
 
@@ -76,6 +91,9 @@ bool RISCVPushPopOpt::usePopRet(MachineBasicBlock::iterator &MBBI,
   for (unsigned i = FirstNonDeclaredOp; i < MBBI->getNumOperands(); ++i)
     PopRetBuilder.add(MBBI->getOperand(i));
 
+  // Remove CFI instructions, they are not needed for cm.popret and cm.popretz
+  eraseCFIInst(MBBI, NextI);
+
   MBBI->eraseFromParent();
   NextI->eraseFromParent();
   return true;
@@ -92,8 +110,8 @@ bool RISCVPushPopOpt::adjustRetVal(MachineBasicBlock::iterator &MBBI) {
   // Since POP instruction is in Epilogue no normal instructions will follow
   // after it. Therefore search only previous ones to find the return value.
   for (MachineBasicBlock::reverse_iterator I =
-           next_nodbg(MBBI.getReverse(), RE);
-       I != RE; I = next_nodbg(I, RE)) {
+           nextNoDebugNoCFIInst(MBBI.getReverse(), RE);
+       I != RE; I = nextNoDebugNoCFIInst(I, RE)) {
     MachineInstr &MI = *I;
     if (auto OperandPair = TII->isCopyInstrImpl(MI)) {
       Register DestReg = OperandPair->Destination->getReg();
@@ -138,7 +156,7 @@ bool RISCVPushPopOpt::runOnMachineFunction(MachineFunction &Fn) {
   bool Modified = false;
   for (auto &MBB : Fn) {
     MachineBasicBlock::iterator MBBI = containsPop(MBB);
-    MachineBasicBlock::iterator NextI = next_nodbg(MBBI, MBB.end());
+    MachineBasicBlock::iterator NextI = nextNoDebugNoCFIInst(MBBI, MBB.end());
     if (MBBI != MBB.end() && NextI != MBB.end() &&
         NextI->getOpcode() == RISCV::PseudoRET)
       Modified |= usePopRet(MBBI, NextI, adjustRetVal(MBBI));
diff --git a/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll b/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll
index 2a26602de9e1e7..528e52bad8ef1a 100644
--- a/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll
+++ b/llvm/test/CodeGen/RISCV/callee-saved-gprs.ll
@@ -432,8 +432,7 @@ define void @callee() nounwind {
 ; RV32IZCMP-NEXT:    sw a0, %lo(var+4)(t0)
 ; RV32IZCMP-NEXT:    lw a0, 28(sp) # 4-byte Folded Reload
 ; RV32IZCMP-NEXT:    sw a0, %lo(var)(t0)
-; RV32IZCMP-NEXT:    cm.pop {ra, s0-s11}, 96
-; RV32IZCMP-NEXT:    ret
+; RV32IZCMP-NEXT:    cm.popret {ra, s0-s11}, 96
 ;
 ; RV32IZCMP-WITH-FP-LABEL: callee:
 ; RV32IZCMP-WITH-FP:       # %bb.0:
@@ -942,8 +941,7 @@ define void @callee() nounwind {
 ; RV64IZCMP-NEXT:    sw a0, %lo(var+4)(t0)
 ; RV64IZCMP-NEXT:    ld a0, 40(sp) # 8-byte Folded Reload
 ; RV64IZCMP-NEXT:    sw a0, %lo(var)(t0)
-; RV64IZCMP-NEXT:    cm.pop {ra, s0-s11}, 160
-; RV64IZCMP-NEXT:    ret
+; RV64IZCMP-NEXT:    cm.popret {ra, s0-s11}, 160
 ;
 ; RV64IZCMP-WITH-FP-LABEL: callee:
 ; RV64IZCMP-WITH-FP:       # %bb.0:
@@ -1613,8 +1611,7 @@ define
```
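The core of the fix is the iteration helper that skips debug, CFI, and pseudo-probe instructions when walking the epilogue. A minimal Python model of that scan (instruction kinds as plain strings, purely illustrative):

```python
def next_no_debug_no_cfi(insts, i):
    """Return the index of the first instruction after position i that is
    not a debug, CFI, or pseudo-probe instruction (len(insts) if none)."""
    j = i + 1
    while j < len(insts) and insts[j] in ("debug", "cfi", "pseudo_probe"):
        j += 1
    return j

# With CFI restores between cm.pop and ret, the scan still pairs them up,
# which is what lets the pass fold them into cm.popret.
epilogue = ["cm.pop", "cfi", "cfi", "ret"]
pop = epilogue.index("cm.pop")
assert epilogue[next_no_debug_no_cfi(epilogue, pop)] == "ret"
```

Once the pair is found, the patch erases the intervening CFI instructions (they describe an epilogue state that no longer exists after folding) and emits cm.popret or cm.popretz in their place.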
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
      IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in an unsigned sense (add
+      // nuw).

nikic wrote: The more relevant wording here is:

> The successive addition of the current address, truncated to the pointer index type and interpreted as an unsigned number, and each offset, also interpreted as an unsigned number, does not wrap the pointer index type (add nuw).

https://github.com/llvm/llvm-project/pull/110815
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
      IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());

+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);
+
       if (ElementScalable) {
         EVT VScaleTy = N.getValueType().getScalarType();
         SDValue VScale = DAG.getNode(
             ISD::VSCALE, dl, VScaleTy,
             DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy));
         if (IsVectorGEP)
           VScale = DAG.getSplatVector(N.getValueType(), dl, VScale);
-        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale);
+        IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale,
+                           ScaleFlags);
       } else {
         // If this is a multiply by a power of two, turn it into a shl
         // immediately. This is a very common case.
         if (ElementMul != 1) {
           if (ElementMul.isPowerOf2()) {
             unsigned Amt = ElementMul.logBase2();
-            IdxN = DAG.getNode(ISD::SHL, dl,
-                               N.getValueType(), IdxN,
-                               DAG.getConstant(Amt, dl, IdxN.getValueType()));
+            IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN,
+                               DAG.getConstant(Amt, dl, IdxN.getValueType()),
+                               ScaleFlags);
           } else {
             SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl,
                                             IdxN.getValueType());
-            IdxN = DAG.getNode(ISD::MUL, dl,
-                               N.getValueType(), IdxN, Scale);
+            IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale,
+                               ScaleFlags);
           }
         }
       }

-      N = DAG.getNode(ISD::ADD, dl,
-                      N.getValueType(), N, IdxN);
+      SDNodeFlags AddFlags;
+
+      // The successive addition of each offset (without adding the base
+      // address) does not wrap the pointer index type in a signed sense (add
+      // nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        AddFlags.setNoSignedWrap(true);

nikic wrote: This one looks incorrect. The add below is adding to the pointer (N), not just accumulating offsets, so you can't use nsw here. (The nuw below is correct.)

https://github.com/llvm/llvm-project/pull/110815
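nikic's objection can be checked with concrete numbers. Here is a hand-made 8-bit example (not taken from the patch) showing that an offset which is individually fine can still signed-wrap once the base address is added in:

```python
BITS = 8

def signed(v: int) -> int:
    """Interpret v as a two's-complement signed value of width BITS."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v >= (1 << (BITS - 1)) else v

base, offset = 0x70, 0x20   # base 112, offset 32: both positive and in range
total = signed(base + offset)

# The signed addition wraps: 112 + 32 = 144, which is -112 in 8-bit two's
# complement, so tagging the base+offset add as nsw would be unsound.
assert signed(base) > 0 and signed(offset) > 0
assert total < 0
```

That is why the final ADD of base and offset may legitimately carry nuw under the quoted LangRef wording, but not nsw.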
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/110815 >From 56474dac206d8592229cb56e1f12b543ec97 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 2 Oct 2024 11:20:23 +0400 Subject: [PATCH 1/2] DAG: Preserve more flags when expanding gep This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero. --- .../SelectionDAG/SelectionDAGBuilder.cpp | 41 +++ .../CodeGen/AMDGPU/gep-flags-stack-offsets.ll | 6 +-- .../pointer-add-unknown-offset-debug-info.ll | 2 +- 3 files changed, 36 insertions(+), 13 deletions(-) diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 25213f587116d5..6838c0b530a363 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4386,6 +4386,17 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). 
+ if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( @@ -4393,27 +4404,41 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). + if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in an unsigned sense (add + // nuw). 
+ if (NW.hasNoUnsignedWrap()) +AddFlags.setNoUnsignedWrap(true); + + N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } } diff --git a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll index 782894976c711c..a39afa6f609c7e 100644 --- a/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll +++ b/llvm/test/CodeGen/AMDGPU/gep-flags-stack-offsets.ll @@ -118,8 +118,7 @@ define void @gep_inbounds_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen +; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offen offset:16 ; GFX8-NEXT:s_waitcnt vmcnt(0) ; GFX8-NEXT:s_setpc_b64 s[30:31] ; @@ -145,8 +144,7 @@ define void @gep_nusw_nuw_alloca(i32 %idx, i32 %val) #0 { ; GFX8-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; GFX8-NEXT:v_lshrrev_b32_e64 v2, 6, s32 ; GFX8-NEXT:v_add_u32_e32 v0, vcc, v2, v0 -; GFX8-NEXT:v_add_u32_e32 v0, vcc, 16, v0 -; GFX8-NEXT:buffer_store_dword v1, v0, s[0:3], 0 offe
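For readers skimming the diff above, the flag transfer can be modeled in isolation. This is a hedged sketch, not LLVM code: the struct names only loosely mirror `GEPNoWrapFlags` and `SDNodeFlags`, and the mapping (gep `nusw` implies `nsw` on the scale and offset-add nodes, gep `nuw` implies `nuw`) is the rule stated in the patch's own comments.

```cpp
#include <cassert>

// Loose stand-ins for llvm::GEPNoWrapFlags and llvm::SDNodeFlags, used only
// to model the flag transfer performed in visitGetElementPtr.
struct GepWrapFlags {
  bool NoUnsignedSignedWrap = false; // gep "nusw"
  bool NoUnsignedWrap = false;       // gep "nuw"
};

struct NodeFlags {
  bool NoSignedWrap = false;   // nsw on the resulting MUL/SHL/ADD node
  bool NoUnsignedWrap = false; // nuw on the resulting MUL/SHL/ADD node
};

// The patch applies the same derivation twice: once for the index-scaling
// node (mul/shl by the element size) and once for the offset addition.
NodeFlags deriveNodeFlags(GepWrapFlags NW) {
  NodeFlags F;
  if (NW.NoUnsignedSignedWrap)
    F.NoSignedWrap = true; // scaling/adding cannot wrap in a signed sense
  if (NW.NoUnsignedWrap)
    F.NoUnsignedWrap = true; // ...nor in an unsigned sense
  return F;
}
```

With both flags set on the gep, the scaled index and the running offset addition carry `nsw nuw`, which is what lets the AMDGPU backend fold the constant 16 into the `buffer_store_dword ... offset:16` form in the updated tests.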
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,59 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // it. IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType()); + SDNodeFlags ScaleFlags; + // The multiplication of an index by the type size does not wrap the + // pointer index type in a signed sense (mul nsw). + if (NW.hasNoUnsignedSignedWrap()) +ScaleFlags.setNoSignedWrap(true); + + // The multiplication of an index by the type size does not wrap the + // pointer index type in an unsigned sense (mul nuw). + if (NW.hasNoUnsignedWrap()) +ScaleFlags.setNoUnsignedWrap(true); + if (ElementScalable) { EVT VScaleTy = N.getValueType().getScalarType(); SDValue VScale = DAG.getNode( ISD::VSCALE, dl, VScaleTy, DAG.getConstant(ElementMul.getZExtValue(), dl, VScaleTy)); if (IsVectorGEP) VScale = DAG.getSplatVector(N.getValueType(), dl, VScale); -IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, VScale, + ScaleFlags); } else { // If this is a multiply by a power of two, turn it into a shl // immediately. This is a very common case. if (ElementMul != 1) { if (ElementMul.isPowerOf2()) { unsigned Amt = ElementMul.logBase2(); -IdxN = DAG.getNode(ISD::SHL, dl, - N.getValueType(), IdxN, - DAG.getConstant(Amt, dl, IdxN.getValueType())); +IdxN = DAG.getNode(ISD::SHL, dl, N.getValueType(), IdxN, + DAG.getConstant(Amt, dl, IdxN.getValueType()), + ScaleFlags); } else { SDValue Scale = DAG.getConstant(ElementMul.getZExtValue(), dl, IdxN.getValueType()); -IdxN = DAG.getNode(ISD::MUL, dl, - N.getValueType(), IdxN, Scale); +IdxN = DAG.getNode(ISD::MUL, dl, N.getValueType(), IdxN, Scale, + ScaleFlags); } } } - N = DAG.getNode(ISD::ADD, dl, - N.getValueType(), N, IdxN); + SDNodeFlags AddFlags; + + // The successive addition of each offset (without adding the base + // address) does not wrap the pointer index type in a signed sense (add + // nsw). 
+ if (NW.hasNoUnsignedSignedWrap()) +AddFlags.setNoSignedWrap(true); arsenm wrote: Loses the test benefit though https://github.com/llvm/llvm-project/pull/110815 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Custom expand flat cmpxchg which may access private (PR #109410)
https://github.com/Pierre-vh approved this pull request. https://github.com/llvm/llvm-project/pull/109410
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
https://github.com/arsenm milestoned https://github.com/llvm/llvm-project/pull/110827
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/110827 This was using the default address space instead of the correct one. Fixes #56055 Keep old method around for ABI compatibility on the release branch. (cherry picked from commit 81ba95cefe1b5a12f0a7d8e6a383bcce9e95b785) >From 53bc67c4a690ffdf7445d3d52af03d434f9fd52b Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Mon, 30 Sep 2024 13:43:53 +0400 Subject: [PATCH] FastISel: Fix incorrectly using getPointerTy (#110465) This was using the default address space instead of the correct one. Fixes #56055 Keep old method around for ABI compatibility on the release branch. (cherry picked from commit 81ba95cefe1b5a12f0a7d8e6a383bcce9e95b785) --- llvm/include/llvm/CodeGen/FastISel.h | 7 +- llvm/lib/CodeGen/SelectionDAG/FastISel.cpp | 8 +-- llvm/lib/Target/X86/X86FastISel.cpp| 4 +- llvm/test/CodeGen/X86/issue56055.ll| 81 ++ 4 files changed, 94 insertions(+), 6 deletions(-) create mode 100644 llvm/test/CodeGen/X86/issue56055.ll diff --git a/llvm/include/llvm/CodeGen/FastISel.h b/llvm/include/llvm/CodeGen/FastISel.h index 3cbc35400181dd..f91bd692accad8 100644 --- a/llvm/include/llvm/CodeGen/FastISel.h +++ b/llvm/include/llvm/CodeGen/FastISel.h @@ -275,7 +275,12 @@ class FastISel { /// This is a wrapper around getRegForValue that also takes care of /// truncating or sign-extending the given getelementptr index value. - Register getRegForGEPIndex(const Value *Idx); + Register getRegForGEPIndex(MVT PtrVT, const Value *Idx); + + /// Retained for ABI compatibility in release branch. + Register getRegForGEPIndex(const Value *Idx) { +return getRegForGEPIndex(TLI.getPointerTy(DL), Idx); + } /// We're checking to see if we can fold \p LI into \p FoldInst. 
Note /// that we could have a sequence where multiple LLVM IR instructions are diff --git a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp index ef9f7833551905..246acc7f405837 100644 --- a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp @@ -380,14 +380,13 @@ void FastISel::updateValueMap(const Value *I, Register Reg, unsigned NumRegs) { } } -Register FastISel::getRegForGEPIndex(const Value *Idx) { +Register FastISel::getRegForGEPIndex(MVT PtrVT, const Value *Idx) { Register IdxN = getRegForValue(Idx); if (!IdxN) // Unhandled operand. Halt "fast" selection and bail. return Register(); // If the index is smaller or larger than intptr_t, truncate or extend it. - MVT PtrVT = TLI.getPointerTy(DL); EVT IdxVT = EVT::getEVT(Idx->getType(), /*HandleUnknown=*/false); if (IdxVT.bitsLT(PtrVT)) { IdxN = fastEmit_r(IdxVT.getSimpleVT(), PtrVT, ISD::SIGN_EXTEND, IdxN); @@ -543,7 +542,8 @@ bool FastISel::selectGetElementPtr(const User *I) { uint64_t TotalOffs = 0; // FIXME: What's a good SWAG number for MaxOffs? uint64_t MaxOffs = 2048; - MVT VT = TLI.getPointerTy(DL); + MVT VT = TLI.getValueType(DL, I->getType()).getSimpleVT(); + for (gep_type_iterator GTI = gep_type_begin(I), E = gep_type_end(I); GTI != E; ++GTI) { const Value *Idx = GTI.getOperand(); @@ -584,7 +584,7 @@ bool FastISel::selectGetElementPtr(const User *I) { // N = N + Idx * ElementSize; uint64_t ElementSize = GTI.getSequentialElementStride(DL); - Register IdxN = getRegForGEPIndex(Idx); + Register IdxN = getRegForGEPIndex(VT, Idx); if (!IdxN) // Unhandled operand. Halt "fast" selection and bail. 
return false; diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp index 2eae155956368f..5d594bd54fbfc4 100644 --- a/llvm/lib/Target/X86/X86FastISel.cpp +++ b/llvm/lib/Target/X86/X86FastISel.cpp @@ -902,6 +902,8 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { uint64_t Disp = (int32_t)AM.Disp; unsigned IndexReg = AM.IndexReg; unsigned Scale = AM.Scale; +MVT PtrVT = TLI.getValueType(DL, U->getType()).getSimpleVT(); + gep_type_iterator GTI = gep_type_begin(U); // Iterate through the indices, folding what we can. Constants can be // folded, and one dynamic index can be handled, if the scale is supported. @@ -937,7 +939,7 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { (S == 1 || S == 2 || S == 4 || S == 8)) { // Scaled-index addressing. Scale = S; - IndexReg = getRegForGEPIndex(Op); + IndexReg = getRegForGEPIndex(PtrVT, Op); if (IndexReg == 0) return false; break; diff --git a/llvm/test/CodeGen/X86/issue56055.ll b/llvm/test/CodeGen/X86/issue56055.ll new file mode 100644 index 00..27eaf13e3b00be --- /dev/null +++ b/llvm/test/CodeGen/X86/issue56055.ll @@ -0,0 +1,81 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS:
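The x86 test's datalayout declares `p270` as a 32-bit address space on an otherwise 64-bit target, which is exactly the case the old `TLI.getPointerTy(DL)` call got wrong: index arithmetic was performed at the default 64-bit pointer width instead of the pointer's own width. A hedged model of the difference follows — plain C++, no LLVM types; the mask stands in for reducing the computation to the pointer-index width:

```cpp
#include <cassert>
#include <cstdint>

// Compute base + idx * elemSize with all arithmetic reduced modulo
// 2^PtrBits, mimicking GEP address arithmetic in a PtrBits-wide
// address space.
uint64_t gepAddress(uint64_t Base, int64_t Idx, uint64_t ElemSize,
                    unsigned PtrBits) {
  uint64_t Mask = PtrBits >= 64 ? ~0ull : (1ull << PtrBits) - 1;
  return (Base + static_cast<uint64_t>(Idx) * ElemSize) & Mask;
}
```

An index whose significant bits extend past the pointer width wraps in the narrow address space but not at 64 bits, so doing the arithmetic at the wrong width yields a different address — the class of miscompile the cherry-pick fixes.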
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/110827
[llvm-branch-commits] [llvm] release/19.x: FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110490)
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/110490
[llvm-branch-commits] [llvm] release/19.x: FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110490)
arsenm wrote: New version in #110827 https://github.com/llvm/llvm-project/pull/110490
[llvm-branch-commits] [llvm] FastISel: Fix incorrectly using getPointerTy (#110465) (PR #110827)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: Matt Arsenault (arsenm) Changes This was using the default address space instead of the correct one. Fixes #56055 Keep old method around for ABI compatibility on the release branch. (cherry picked from commit 81ba95cefe1b5a12f0a7d8e6a383bcce9e95b785) --- Full diff: https://github.com/llvm/llvm-project/pull/110827.diff 4 Files Affected: - (modified) llvm/include/llvm/CodeGen/FastISel.h (+6-1) - (modified) llvm/lib/CodeGen/SelectionDAG/FastISel.cpp (+4-4) - (modified) llvm/lib/Target/X86/X86FastISel.cpp (+3-1) - (added) llvm/test/CodeGen/X86/issue56055.ll (+81) ``diff diff --git a/llvm/include/llvm/CodeGen/FastISel.h b/llvm/include/llvm/CodeGen/FastISel.h index 3cbc35400181dd..f91bd692accad8 100644 --- a/llvm/include/llvm/CodeGen/FastISel.h +++ b/llvm/include/llvm/CodeGen/FastISel.h @@ -275,7 +275,12 @@ class FastISel { /// This is a wrapper around getRegForValue that also takes care of /// truncating or sign-extending the given getelementptr index value. - Register getRegForGEPIndex(const Value *Idx); + Register getRegForGEPIndex(MVT PtrVT, const Value *Idx); + + /// Retained for ABI compatibility in release branch. + Register getRegForGEPIndex(const Value *Idx) { +return getRegForGEPIndex(TLI.getPointerTy(DL), Idx); + } /// We're checking to see if we can fold \p LI into \p FoldInst. Note /// that we could have a sequence where multiple LLVM IR instructions are diff --git a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp index ef9f7833551905..246acc7f405837 100644 --- a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp @@ -380,14 +380,13 @@ void FastISel::updateValueMap(const Value *I, Register Reg, unsigned NumRegs) { } } -Register FastISel::getRegForGEPIndex(const Value *Idx) { +Register FastISel::getRegForGEPIndex(MVT PtrVT, const Value *Idx) { Register IdxN = getRegForValue(Idx); if (!IdxN) // Unhandled operand. 
Halt "fast" selection and bail. return Register(); // If the index is smaller or larger than intptr_t, truncate or extend it. - MVT PtrVT = TLI.getPointerTy(DL); EVT IdxVT = EVT::getEVT(Idx->getType(), /*HandleUnknown=*/false); if (IdxVT.bitsLT(PtrVT)) { IdxN = fastEmit_r(IdxVT.getSimpleVT(), PtrVT, ISD::SIGN_EXTEND, IdxN); @@ -543,7 +542,8 @@ bool FastISel::selectGetElementPtr(const User *I) { uint64_t TotalOffs = 0; // FIXME: What's a good SWAG number for MaxOffs? uint64_t MaxOffs = 2048; - MVT VT = TLI.getPointerTy(DL); + MVT VT = TLI.getValueType(DL, I->getType()).getSimpleVT(); + for (gep_type_iterator GTI = gep_type_begin(I), E = gep_type_end(I); GTI != E; ++GTI) { const Value *Idx = GTI.getOperand(); @@ -584,7 +584,7 @@ bool FastISel::selectGetElementPtr(const User *I) { // N = N + Idx * ElementSize; uint64_t ElementSize = GTI.getSequentialElementStride(DL); - Register IdxN = getRegForGEPIndex(Idx); + Register IdxN = getRegForGEPIndex(VT, Idx); if (!IdxN) // Unhandled operand. Halt "fast" selection and bail. return false; diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp index 2eae155956368f..5d594bd54fbfc4 100644 --- a/llvm/lib/Target/X86/X86FastISel.cpp +++ b/llvm/lib/Target/X86/X86FastISel.cpp @@ -902,6 +902,8 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { uint64_t Disp = (int32_t)AM.Disp; unsigned IndexReg = AM.IndexReg; unsigned Scale = AM.Scale; +MVT PtrVT = TLI.getValueType(DL, U->getType()).getSimpleVT(); + gep_type_iterator GTI = gep_type_begin(U); // Iterate through the indices, folding what we can. Constants can be // folded, and one dynamic index can be handled, if the scale is supported. @@ -937,7 +939,7 @@ bool X86FastISel::X86SelectAddress(const Value *V, X86AddressMode &AM) { (S == 1 || S == 2 || S == 4 || S == 8)) { // Scaled-index addressing. 
Scale = S; - IndexReg = getRegForGEPIndex(Op); + IndexReg = getRegForGEPIndex(PtrVT, Op); if (IndexReg == 0) return false; break; diff --git a/llvm/test/CodeGen/X86/issue56055.ll b/llvm/test/CodeGen/X86/issue56055.ll new file mode 100644 index 00..27eaf13e3b00be --- /dev/null +++ b/llvm/test/CodeGen/X86/issue56055.ll @@ -0,0 +1,81 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -fast-isel < %s | FileCheck -check-prefixes=CHECK,FASTISEL %s +; RUN: llc < %s | FileCheck -check-prefixes=CHECK,SDAG %s + +target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-windows-msvc" + +define void @issue56055(ptr addrspace(270) %ptr, ptr %out) { +; CHECK-LABEL: issue56055: +; CHECK: # %bb.0: +; CH
[llvm-branch-commits] [llvm] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter (PR #110705)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/110705 >From 089cc13bbd2cac76a2d3fc0b2f72b0bccda5b188 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 23 Sep 2024 19:51:55 +0300 Subject: [PATCH] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter Move the emission of the checks performed on the authenticated LR value during tail calls to AArch64AsmPrinter class, so that different checker sequences can be reused by pseudo instructions expanded there. This adds one more option to AuthCheckMethod enumeration, the generic XPAC variant which is not restricted to checking the LR register. --- llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp | 143 +++--- llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | 13 ++ llvm/lib/Target/AArch64/AArch64InstrInfo.td | 2 + .../lib/Target/AArch64/AArch64PointerAuth.cpp | 182 +- llvm/lib/Target/AArch64/AArch64PointerAuth.h | 40 ++-- llvm/lib/Target/AArch64/AArch64Subtarget.cpp | 2 - llvm/lib/Target/AArch64/AArch64Subtarget.h| 23 --- llvm/test/CodeGen/AArch64/ptrauth-ret-trap.ll | 36 ++-- .../AArch64/sign-return-address-tailcall.ll | 54 +++--- 9 files changed, 192 insertions(+), 303 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp index 6d2dd0ecbccf31..50502477706ccf 100644 --- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp +++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp @@ -153,6 +153,7 @@ class AArch64AsmPrinter : public AsmPrinter { void emitPtrauthCheckAuthenticatedValue(Register TestedReg, Register ScratchReg, AArch64PACKey::ID Key, + AArch64PAuth::AuthCheckMethod Method, bool ShouldTrap, const MCSymbol *OnFailure); @@ -1731,7 +1732,8 @@ unsigned AArch64AsmPrinter::emitPtrauthDiscriminator(uint16_t Disc, /// of proceeding to the next instruction (only if ShouldTrap is false). 
void AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue( Register TestedReg, Register ScratchReg, AArch64PACKey::ID Key, -bool ShouldTrap, const MCSymbol *OnFailure) { +AArch64PAuth::AuthCheckMethod Method, bool ShouldTrap, +const MCSymbol *OnFailure) { // Insert a sequence to check if authentication of TestedReg succeeded, // such as: // @@ -1757,38 +1759,70 @@ void AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue( //Lsuccess: // ... // - // This sequence is expensive, but we need more information to be able to - // do better. - // - // We can't TBZ the poison bit because EnhancedPAC2 XORs the PAC bits - // on failure. - // We can't TST the PAC bits because we don't always know how the address - // space is setup for the target environment (and the bottom PAC bit is - // based on that). - // Either way, we also don't always know whether TBI is enabled or not for - // the specific target environment. + // See the documentation on AuthCheckMethod enumeration constants for + // the specific code sequences that can be used to perform the check. 
+ using AArch64PAuth::AuthCheckMethod; - unsigned XPACOpc = getXPACOpcodeForKey(Key); + if (Method == AuthCheckMethod::None) +return; + if (Method == AuthCheckMethod::DummyLoad) { +EmitToStreamer(MCInstBuilder(AArch64::LDRWui) + .addReg(getWRegFromXReg(ScratchReg)) + .addReg(TestedReg) + .addImm(0)); +assert(ShouldTrap && !OnFailure && "DummyLoad always traps on error"); +return; + } MCSymbol *SuccessSym = createTempSymbol("auth_success_"); + if (Method == AuthCheckMethod::XPAC || Method == AuthCheckMethod::XPACHint) { +// mov Xscratch, Xtested +emitMovXReg(ScratchReg, TestedReg); - // mov Xscratch, Xtested - emitMovXReg(ScratchReg, TestedReg); - - // xpac(i|d) Xscratch - EmitToStreamer(MCInstBuilder(XPACOpc).addReg(ScratchReg).addReg(ScratchReg)); +if (Method == AuthCheckMethod::XPAC) { + // xpac(i|d) Xscratch + unsigned XPACOpc = getXPACOpcodeForKey(Key); + EmitToStreamer( + MCInstBuilder(XPACOpc).addReg(ScratchReg).addReg(ScratchReg)); +} else { + // xpaclri + + // Note that this method applies XPAC to TestedReg instead of ScratchReg. + assert(TestedReg == AArch64::LR && + "XPACHint mode is only compatible with checking the LR register"); + assert((Key == AArch64PACKey::IA || Key == AArch64PACKey::IB) && + "XPACHint mode is only compatible with I-keys"); + EmitToStreamer(MCInstBuilder(AArch64::XPACLRI)); +} - // cmp Xtested, Xscratch - EmitToStreamer(MCInstBuilder(AArch64::SUBSXrs) - .addReg(AArch64::XZR) - .addReg(TestedReg) - .addR
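The XPAC-based sequences being moved here can be summarized with a small model: strip the authentication code from a copy of the tested value and compare it against the original — if an `aut*` failure left an error pattern in the pointer's upper bits, the comparison fails. The mask below is an assumption purely for illustration; the real PAC bit positions depend on the VA size and TBI configuration, which is precisely why the patch comment says the bits cannot simply be tested with `TBZ`/`TST`:

```cpp
#include <cassert>
#include <cstdint>

// Assumed PAC bit mask (bits 48..54), illustration only; the actual mask
// depends on the target's virtual-address size and TBI settings.
constexpr uint64_t kPacMask = 0x007f000000000000ull;

// Model of xpac(i|d)/xpaclri: clear the PAC bits of the pointer.
uint64_t xpac(uint64_t Ptr) { return Ptr & ~kPacMask; }

// Model of the check sequence:
//   mov  Xscratch, Xtested
//   xpac Xscratch
//   cmp  Xtested, Xscratch
//   b.eq Lsuccess
bool lrCheckPasses(uint64_t TestedLR) { return xpac(TestedLR) == TestedLR; }
```

A successfully authenticated pointer has no PAC residue, so stripping is a no-op and the comparison succeeds; a corrupted pointer differs from its stripped copy and falls through to the `brk`/trap path.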
[llvm-branch-commits] [llvm] [AArch64][PAC] Move emission of LR checks in tail calls to AsmPrinter (PR #110705)
https://github.com/atrosinenko edited https://github.com/llvm/llvm-project/pull/110705
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
llvmbot wrote: @llvm/pr-subscribers-debuginfo Author: None (dlav-sc) Changes This patch adds CFI instructions in a function epilogue, that allows lldb to obtain a valid backtrace at the end of functions. --- Patch is 1004.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110810.diff 296 Files Affected: - (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+72) - (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.h (+2) - (modified) llvm/test/CodeGen/RISCV/GlobalISel/stacksave-stackrestore.ll (+10) - (modified) llvm/test/CodeGen/RISCV/GlobalISel/vararg.ll (+36) - (modified) llvm/test/CodeGen/RISCV/addrspacecast.ll (+4) - (modified) llvm/test/CodeGen/RISCV/atomicrmw-cond-sub-clamp.ll (+85) - (modified) llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll (+85) - (modified) llvm/test/CodeGen/RISCV/branch-relaxation.ll (+144) - (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+88-8) - (modified) llvm/test/CodeGen/RISCV/calling-conv-ilp32e.ll (+271) - (modified) llvm/test/CodeGen/RISCV/cm_mvas_mvsa.ll (+8-4) - (modified) llvm/test/CodeGen/RISCV/double-intrinsics.ll (+8) - (modified) llvm/test/CodeGen/RISCV/double-round-conv.ll (+50) - (modified) llvm/test/CodeGen/RISCV/early-clobber-tied-def-subreg-liveness.ll (+2) - (modified) llvm/test/CodeGen/RISCV/eh-dwarf-cfa.ll (+4) - (modified) llvm/test/CodeGen/RISCV/exception-pointer-register.ll (+8) - (modified) llvm/test/CodeGen/RISCV/float-intrinsics.ll (+8) - (modified) llvm/test/CodeGen/RISCV/float-round-conv.ll (+40) - (modified) llvm/test/CodeGen/RISCV/fpclamptosat.ll (+176) - (modified) llvm/test/CodeGen/RISCV/frame-info.ll (+32) - (modified) llvm/test/CodeGen/RISCV/half-convert-strict.ll (+10) - (modified) llvm/test/CodeGen/RISCV/half-intrinsics.ll (+20) - (modified) llvm/test/CodeGen/RISCV/half-round-conv.ll (+80) - (modified) llvm/test/CodeGen/RISCV/hwasan-check-memaccess.ll (+2) - (modified) llvm/test/CodeGen/RISCV/intrinsic-cttz-elts-vscale.ll (+3) - 
(modified) llvm/test/CodeGen/RISCV/kcfi-mir.ll (+2) - (modified) llvm/test/CodeGen/RISCV/large-stack.ll (+15) - (modified) llvm/test/CodeGen/RISCV/live-sp.mir (+2-1) - (modified) llvm/test/CodeGen/RISCV/llvm.exp10.ll (+120) - (modified) llvm/test/CodeGen/RISCV/local-stack-slot-allocation.ll (+14) - (modified) llvm/test/CodeGen/RISCV/lpad.ll (+8) - (modified) llvm/test/CodeGen/RISCV/miss-sp-restore-eh.ll (+5) - (modified) llvm/test/CodeGen/RISCV/nontemporal.ll (+60) - (modified) llvm/test/CodeGen/RISCV/overflow-intrinsics.ll (+23) - (modified) llvm/test/CodeGen/RISCV/pr58025.ll (+1) - (modified) llvm/test/CodeGen/RISCV/pr58286.ll (+4) - (modified) llvm/test/CodeGen/RISCV/pr63365.ll (+1) - (modified) llvm/test/CodeGen/RISCV/pr69586.ll (+29) - (modified) llvm/test/CodeGen/RISCV/pr88365.ll (+3) - (modified) llvm/test/CodeGen/RISCV/prolog-epilogue.ll (+80) - (modified) llvm/test/CodeGen/RISCV/push-pop-opt-crash.ll (+27-25) - (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+344-44) - (modified) llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rv64-patchpoint.ll (+3) - (modified) llvm/test/CodeGen/RISCV/rv64-statepoint-call-lowering.ll (+21) - (modified) llvm/test/CodeGen/RISCV/rvv-cfi-info.ll (+18) - (modified) llvm/test/CodeGen/RISCV/rvv/abs-vp.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/access-fixed-objects-by-rvv.ll (+3) - (modified) llvm/test/CodeGen/RISCV/rvv/addi-scalable-offset.mir (+4) - (modified) llvm/test/CodeGen/RISCV/rvv/alloca-load-store-scalable-array.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/alloca-load-store-scalable-struct.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/alloca-load-store-vector-tuple.ll (+6) - (modified) llvm/test/CodeGen/RISCV/rvv/binop-splats.ll (+5) - (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-sdnode.ll (+5) - (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll (+20) - (modified) llvm/test/CodeGen/RISCV/rvv/bswap-sdnode.ll (+5) - (modified) 
llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll (+21) - (modified) llvm/test/CodeGen/RISCV/rvv/callee-saved-regs.ll (+4) - (modified) llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll (+40) - (modified) llvm/test/CodeGen/RISCV/rvv/calling-conv.ll (+16) - (modified) llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll (+10) - (modified) llvm/test/CodeGen/RISCV/rvv/compressstore.ll (+2) - (modified) llvm/test/CodeGen/RISCV/rvv/ctpop-vp.ll (+6) - (modified) llvm/test/CodeGen/RISCV/rvv/cttz-vp.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/emergency-slot.mir (+14) - (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-fp.ll (+16) - (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-int-rv32.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-int-rv64.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vector-i8-index-cornercase.ll (+8) - (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-binop-splats.ll (+4) - (modified)
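The motivation stated in the patch description — a valid backtrace *inside* the epilogue — comes down to keeping the CFA rule accurate once `sp` starts moving. A hedged toy model (plain C++, no DWARF machinery): an unwinder computes CFA = sp + offset, so the epilogue's stack deallocation must be paired with a `.cfi_def_cfa_offset 0`, or the recovered CFA is off by the frame size at any point after the `addi sp, sp, framesize`:

```cpp
#include <cassert>
#include <cstdint>

// Toy unwinder state: the CFA is defined as sp + CfaOffset.
struct UnwindState {
  uint64_t Sp;
  uint64_t CfaOffset;
  uint64_t cfa() const { return Sp + CfaOffset; }
};

// Epilogue without epilogue CFI: sp moves but the CFA rule goes stale.
UnwindState epilogueNoCfi(UnwindState S, uint64_t FrameSize) {
  S.Sp += FrameSize; // addi sp, sp, FrameSize
  return S;
}

// Epilogue with the CFI this patch emits: the deallocation is paired with
// ".cfi_def_cfa_offset 0", keeping the unwinder's CFA rule in sync.
UnwindState epilogueWithCfi(UnwindState S, uint64_t FrameSize) {
  S.Sp += FrameSize; // addi sp, sp, FrameSize
  S.CfaOffset = 0;   // .cfi_def_cfa_offset 0
  return S;
}
```

The same reasoning applies to the callee-saved register restores (`.cfi_restore`): once a register has been reloaded from its slot, the unwinder must be told to use the register rather than the now-stale stack slot.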
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command:

```bash
git-clang-format --diff ac123f934ecb8cc840f6ad33739a03c64ac351ca e6f1c940894489859e75b944978d42fcdffdec8e --extensions cpp,h -- llvm/lib/Target/RISCV/RISCVFrameLowering.cpp llvm/lib/Target/RISCV/RISCVFrameLowering.h llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
```

View the diff from clang-format here.

```diff
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index 8254133d0e..66a693277a 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -896,7 +896,8 @@ void RISCVFrameLowering::emitEpilogue(MachineFunction &MF,
   // therefor it is unecessary to place any CFI instructions after it. Just
   // deallocate stack if needed and return.
   if (StackSize != 0)
-    deallocateStack(MF, MBB, MBBI, DL, StackSize, RVFI->getLibCallStackSize());
+    deallocateStack(MF, MBB, MBBI, DL, StackSize,
+                    RVFI->getLibCallStackSize());
 
   // Emit epilogue for shadow call stack.
   emitSCSEpilogue(MF, MBB, MBBI, DL);
```

https://github.com/llvm/llvm-project/pull/110810 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
@@ -62,6 +69,8 @@ define i32 @callee_float_in_regs(i32 %a, float %b) {
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    mv a0, a1
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    call __fixsfsi
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    add a0, s0, a0
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore ra
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore s0
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    tail __riscv_restore_1

dlav-sc wrote: Addressed. Yeah, I didn't take libcalls into account at all :) Anyway, I made a fix (5cf8f319fbc5304c21e05d5ec3d2aff1713bd071) and updated the tests (e6f1c940894489859e75b944978d42fcdffdec8e). Could you take a look, please? However, the tests don't look the way you expected, because it is in fact unnecessary to emit CFI instructions after `tail __riscv_restore_1`, which is considered a terminator. Therefore, I just removed the `.cfi_restore` directives and placed the `.cfi_def_cfa_offset` instructions where they should be. https://github.com/llvm/llvm-project/pull/110810 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
keith wrote: Is the process for this that you will merge it at some point? Mainly wondering if I need to keep following it to make sure it makes it for the next one https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] bdbe9f3 - Revert "[MLIR][XeGPU] Add sg_map attribute to support Work Item level semanti…"
Author: Chao Chen Date: 2024-10-02T10:44:00-05:00 New Revision: bdbe9f3fcc6e201eb2ec0340a80e6fd0fea8265a URL: https://github.com/llvm/llvm-project/commit/bdbe9f3fcc6e201eb2ec0340a80e6fd0fea8265a DIFF: https://github.com/llvm/llvm-project/commit/bdbe9f3fcc6e201eb2ec0340a80e6fd0fea8265a.diff LOG: Revert "[MLIR][XeGPU] Add sg_map attribute to support Work Item level semantiā¦" This reverts commit 3ca5d8082a0c6bd9520544ce3bca11bf3e02a5fa. Added: Modified: mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td mlir/lib/Dialect/XeGPU/IR/XeGPUDialect.cpp mlir/test/Dialect/XeGPU/XeGPUOps.mlir Removed: diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td index 2aaa7fd4221ab1..26eec0d4f2082a 100644 --- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td +++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td @@ -142,36 +142,4 @@ def XeGPU_FenceScopeAttr: let assemblyFormat = "$value"; } -def XeGPU_SGMapAttr : XeGPUAttr<"SGMap", "sg_map"> { - let summary = [{ -Describes the mapping between work item (WI) and the 2D tensor specified by the tensor descriptor. - }]; - let description = [{ -To distribute the XeGPU operation to work items, the tensor_desc must be specified with the sg_map -attribute at the tensor description creation time. -Within the `sg_map`, `wi_layout` specifies the layout of work items, -describing the mapping of work items to the tensor. -wi_layout[0] x wi_layout[1] must be equal to the total number of work items within a subgroup. -`wi_data` specifies the minimum number of data elements assigned to each work item for a single distribution. - -E.g., #xegpu.sg_map -In this example, the subgroup has 16 work items in wi_layout=[1, 16], -each accessing 1 element as specified by wi_data=[1, 1]. 
- -`wi_data[0] * wi_data[1]` can be greater than 1, meaning that each work item operates on multiple elements, -which is eventually lowered to "SIMT-flavor" vector, like SPIR-V vector or llvm vector, or packed to a storage data type. -The multiple elements indicated by `wi_data` can only be from one dimension and must be contiguous in the memory along either dimension. - }]; - let parameters = (ins -ArrayRefParameter<"uint32_t">:$wi_layout, -ArrayRefParameter<"uint32_t">:$wi_data); - - let builders = [ -AttrBuilder<(ins)> - ]; - - let hasCustomAssemblyFormat = 1; - let genVerifyDecl = 1; -} - #endif // MLIR_DIALECT_XEGPU_IR_XEGPUATTRS_TD diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td index 9f1b17721f2d56..0ce1211664b5ba 100644 --- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td +++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td @@ -63,7 +63,7 @@ def XeGPU_TensorDesc: XeGPUTypeDef<"TensorDesc", "tensor_desc", element-type ::= float-type | integer-type | index-type dim-list := (static-dim-list `x`)? static-dim-list ::= decimal-literal `x` decimal-literal -attr-list = (, memory_space = value)? (, arr_len = value)? (, boundary_check = value)? (, scattered = value)? (, sg_map `<` wi_layout = value, wi_data = value `>`)? +attr-list = (, memory_space = value)? (, arr_len = value)? (, boundary_check = value)? (, scattered = value)? ``` Examples: @@ -77,16 +77,12 @@ def XeGPU_TensorDesc: XeGPUTypeDef<"TensorDesc", "tensor_desc", // A TensorDesc with 8x16 f32 elements for a memory region in shared memory space. 
xegpu.tensor_desc<8x16xf32, #xegpu.tdesc_attr> - -// A TensorDesc with a sg_map -xegpu.tensor_desc<8x16xf32, #xegpu.sg_map> ``` }]; let parameters = (ins ArrayRefParameter<"int64_t">: $shape, "mlir::Type": $elementType, -OptionalParameter<"mlir::Attribute">: $encoding, -OptionalParameter<"mlir::Attribute">: $sg_map); +OptionalParameter<"mlir::Attribute">: $encoding); let builders = [ TypeBuilderWithInferredContext<(ins @@ -94,14 +90,12 @@ def XeGPU_TensorDesc: XeGPUTypeDef<"TensorDesc", "tensor_desc", "mlir::Type": $elementType, CArg<"int", "1">: $array_length, CArg<"bool", "true">: $boundary_check, - CArg<"xegpu::MemorySpace", "xegpu::MemorySpace::Global">:$memory_space, - CArg<"mlir::Attribute", "mlir::Attribute()">:$sg_map)>, + CArg<"xegpu::MemorySpace", "xegpu::MemorySpace::Global">:$memory_space)>, TypeBuilderWithInferredContext<(ins "llvm::ArrayRef": $shape, "mlir::Type": $elementType, CArg<"int", "1">: $chunk_size, - CArg<"xegpu::MemorySpace", "xegpu::MemorySpace::Global">:$memory_space, - CArg<"mlir::Attribute", "mlir::Attribute()">:$sg_map)> + CArg<"xegpu::MemorySpace", "x
[llvm-branch-commits] [libcxx] release/19.x: [libc++] Fix name of is_always_lock_free test which was never being run (#106077) (PR #110838)
llvmbot wrote: @llvm/pr-subscribers-libcxx Author: None (llvmbot) Changes Backport b45661953e6974782b0ccada6f0784db04bc693f Requested by: @ldionne --- Full diff: https://github.com/llvm/llvm-project/pull/110838.diff 1 Files Affected: - (renamed) libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp (+10-3)

```diff
diff --git a/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp b/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp
similarity index 94%
rename from libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp
rename to libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp
index 2dc7f5c7654193..723e7b36f50319 100644
--- a/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp
+++ b/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp
@@ -5,8 +5,9 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===--===//
-//
+
 // UNSUPPORTED: c++03, c++11, c++14
+// XFAIL: LIBCXX-PICOLIBC-FIXME
 //
 //
@@ -15,6 +16,10 @@
 //
 // static constexpr bool is_always_lock_free;
 
+// Ignore diagnostic about vector types changing the ABI on some targets, since
+// that is irrelevant for this test.
+// ADDITIONAL_COMPILE_FLAGS: -Wno-psabi
+
 #include
 #include
 #include
@@ -26,7 +31,8 @@
 template
 void check_always_lock_free(std::atomic const& a) {
   using InfoT = LockFreeStatusInfo;
-  constexpr std::same_as decltype(auto) is_always_lock_free = std::atomic::is_always_lock_free;
+  constexpr auto is_always_lock_free = std::atomic::is_always_lock_free;
+  ASSERT_SAME_TYPE(decltype(is_always_lock_free), bool const);
 
   // If we know the status of T for sure, validate the exact result of the function.
   if constexpr (InfoT::status_known) {
@@ -44,7 +50,8 @@
   // In all cases, also sanity-check it based on the implication always-lock-free => lock-free.
   if (is_always_lock_free) {
-    std::same_as decltype(auto) is_lock_free = a.is_lock_free();
+    auto is_lock_free = a.is_lock_free();
+    ASSERT_SAME_TYPE(decltype(is_always_lock_free), bool const);
     assert(is_lock_free);
   }
   ASSERT_NOEXCEPT(a.is_lock_free());
```

https://github.com/llvm/llvm-project/pull/110838 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] release/19.x: [libc++] Fix name of is_always_lock_free test which was never being run (#106077) (PR #110838)
https://github.com/ldionne approved this pull request. https://github.com/llvm/llvm-project/pull/110838 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] release/19.x: [libc++] Fix name of is_always_lock_free test which was never being run (#106077) (PR #110838)
ldionne wrote: Impact: basically none, this adds test coverage we should always have had. https://github.com/llvm/llvm-project/pull/110838 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)
@@ -62,6 +69,8 @@ define i32 @callee_float_in_regs(i32 %a, float %b) {
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    mv a0, a1
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    call __fixsfsi
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    add a0, s0, a0
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore ra
+; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    .cfi_restore s0
 ; ILP32E-FPELIM-SAVE-RESTORE-NEXT:    tail __riscv_restore_1

kito-cheng wrote: Just putting a link here to the comment I left on your previous PR, so I don't forget it: https://github.com/llvm/llvm-project/pull/110234#pullrequestreview-2342046243 https://github.com/llvm/llvm-project/pull/110810 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] DAG: Preserve more flags when expanding gep (PR #110815)
@@ -4386,34 +4386,54 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       // it.
       IdxN = DAG.getSExtOrTrunc(IdxN, dl, N.getValueType());
 
+      SDNodeFlags ScaleFlags;
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in a signed sense (mul nsw).
+      if (NW.hasNoUnsignedSignedWrap())
+        ScaleFlags.setNoSignedWrap(true);
+
+      // The multiplication of an index by the type size does not wrap the
+      // pointer index type in an unsigned sense (mul nuw).
+      if (NW.hasNoUnsignedWrap())
+        ScaleFlags.setNoUnsignedWrap(true);

goldsteinn wrote: `ScaleFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap())`; Likewise above. https://github.com/llvm/llvm-project/pull/110815 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
keith wrote: Thanks! https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
tstellar wrote: @keith I added this to the release milestone, so the release manager (@tru) will merge this before the next release. https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] workflows/release-binaries: Use static ZSTD on macOS (PR #110701)
https://github.com/tstellar milestoned https://github.com/llvm/llvm-project/pull/110701 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits