[llvm-branch-commits] [compiler-rt] release/20.x: XFAIL malloc_zone.cpp for darwin/lsan (#131234) (PR #133006)
wrotki wrote:

Closing this one as it's a bit messy. Opened a new, cleaned-up PR: https://github.com/llvm/llvm-project/pull/133832/files

https://github.com/llvm/llvm-project/pull/133006
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [Clang][CodeGen] Promote in complex compound divassign (PR #131453)
https://github.com/Maetveis updated https://github.com/llvm/llvm-project/pull/131453

From 9d50aa09e1f06ec145715896173750414ec75c0d Mon Sep 17 00:00:00 2001
From: Gergely Meszaros
Date: Sat, 15 Mar 2025 12:53:32 +0100
Subject: [PATCH] [Clang][CodeGen] Promote in complex compound divassign

When `-fcomplex-arithmetic=promoted` is set complex divassign `/=` should
promote to a wider type the same way division (without assignment) does.
Prior to this change, Smith's algorithm would be used for divassign.

Fixes: https://github.com/llvm/llvm-project/issues/131129
---
 clang/lib/CodeGen/CGExprComplex.cpp | 13 +-
 clang/test/CodeGen/cx-complex-range.c | 534 ++
 2 files changed, 221 insertions(+), 326 deletions(-)

diff --git a/clang/lib/CodeGen/CGExprComplex.cpp b/clang/lib/CodeGen/CGExprComplex.cpp
index 34f40feac7958..a7c8b96da6853 100644
--- a/clang/lib/CodeGen/CGExprComplex.cpp
+++ b/clang/lib/CodeGen/CGExprComplex.cpp
@@ -1214,13 +1214,16 @@ EmitCompoundAssignLValue(const CompoundAssignOperator *E,
   OpInfo.FPFeatures = E->getFPFeaturesInEffect(CGF.getLangOpts());
   CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, OpInfo.FPFeatures);
 
+  const bool IsComplexDivisor = E->getOpcode() == BO_DivAssign &&
+                                E->getRHS()->getType()->isAnyComplexType();
+
   // Load the RHS and LHS operands.
   // __block variables need to have the rhs evaluated first, plus this should
   // improve codegen a little.
   QualType PromotionTypeCR;
-  PromotionTypeCR = getPromotionType(E->getStoredFPFeaturesOrDefault(),
-                                     E->getComputationResultType(),
-                                     /*IsComplexDivisor=*/false);
+  PromotionTypeCR =
+      getPromotionType(E->getStoredFPFeaturesOrDefault(),
+                       E->getComputationResultType(), IsComplexDivisor);
   if (PromotionTypeCR.isNull())
     PromotionTypeCR = E->getComputationResultType();
   OpInfo.Ty = PromotionTypeCR;
@@ -1228,7 +1231,7 @@ EmitCompoundAssignLValue(const CompoundAssignOperator *E,
       OpInfo.Ty->castAs<ComplexType>()->getElementType();
   QualType PromotionTypeRHS =
       getPromotionType(E->getStoredFPFeaturesOrDefault(),
-                       E->getRHS()->getType(), /*IsComplexDivisor=*/false);
+                       E->getRHS()->getType(), IsComplexDivisor);
 
   // The RHS should have been converted to the computation type.
   if (E->getRHS()->getType()->isRealFloatingType()) {
@@ -1258,7 +1261,7 @@ EmitCompoundAssignLValue(const CompoundAssignOperator *E,
   SourceLocation Loc = E->getExprLoc();
   QualType PromotionTypeLHS =
       getPromotionType(E->getStoredFPFeaturesOrDefault(),
-                       E->getComputationLHSType(), /*IsComplexDivisor=*/false);
+                       E->getComputationLHSType(), IsComplexDivisor);
   if (LHSTy->isAnyComplexType()) {
     ComplexPairTy LHSVal = EmitLoadOfLValue(LHS, Loc);
     if (!PromotionTypeLHS.isNull())
diff --git a/clang/test/CodeGen/cx-complex-range.c b/clang/test/CodeGen/cx-complex-range.c
index 06a349fbc2a47..a724e1ca8cb6d 100644
--- a/clang/test/CodeGen/cx-complex-range.c
+++ b/clang/test/CodeGen/cx-complex-range.c
@@ -721,44 +721,32 @@ _Complex float divf(_Complex float a, _Complex float b) {
 // PRMTD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4
 // PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[B]], i32 0, i32 1
 // PRMTD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4
+// PRMTD-NEXT: [[EXT:%.*]] = fpext float [[B_REAL]] to double
+// PRMTD-NEXT: [[EXT1:%.*]] = fpext float [[B_IMAG]] to double
 // PRMTD-NEXT: [[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8
 // PRMTD-NEXT: [[DOTREALP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[TMP0]], i32 0, i32 0
 // PRMTD-NEXT: [[DOTREAL:%.*]] = load float, ptr [[DOTREALP]], align 4
 // PRMTD-NEXT: [[DOTIMAGP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[TMP0]], i32 0, i32 1
 // PRMTD-NEXT: [[DOTIMAG:%.*]] = load float, ptr [[DOTIMAGP]], align 4
-// PRMTD-NEXT: [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[B_REAL]])
-// PRMTD-NEXT: [[TMP2:%.*]] = call float @llvm.fabs.f32(float [[B_IMAG]])
-// PRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP1]], [[TMP2]]
-// PRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
-// PRMTD: abs_rhsr_greater_or_equal_abs_rhsi:
-// PRMTD-NEXT: [[TMP3:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]]
-// PRMTD-NEXT: [[TMP4:%.*]] = fmul float [[TMP3]], [[B_IMAG]]
-// PRMTD-NEXT: [[TMP5:%.*]] = fadd float [[B_REAL]], [[TMP4]]
-// PRMTD-NEXT: [[TMP6:%.*]] = fmul float [[DOTIMAG]], [[TMP3]]
-// PRMTD-NEXT: [[TMP7:%.*]] = fadd float [[DOTREAL]], [[TMP6]]
-// PRMTD-NEXT: [[TMP8:%.*]] = fdiv float [[TMP7]], [[TMP5]]
-// PRMTD-NEXT: [[TMP9:%.*]] = fmul float [[DOTREAL]], [[TMP3]]
-// PRMTD-NEXT: [[TMP10:
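
For readers comparing the two lowerings mentioned in the commit message, here is a minimal standalone sketch, not the Clang implementation. `divSmith` follows the branchy float-only sequence visible in the removed PRMTD check lines, and `divPromoted` widens to double and uses the textbook formula, which is roughly what the added `fpext` lines suggest. Both function names are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Smith's algorithm: scale by the ratio of the divisor's components so that
// c*c + d*d is never formed directly in the narrow type.
std::complex<float> divSmith(std::complex<float> x, std::complex<float> y) {
  float a = x.real(), b = x.imag(), c = y.real(), d = y.imag();
  if (std::fabs(c) >= std::fabs(d)) {
    float r = d / c;       // |r| <= 1
    float den = c + r * d; // c + d*d/c
    return {(a + b * r) / den, (b - a * r) / den};
  }
  float r = c / d;       // |r| < 1
  float den = d + r * c; // d + c*c/d
  return {(a * r + b) / den, (b * r - a) / den};
}

// Promotion: do the textbook division in double, then truncate to float.
// The wider exponent range is what makes the direct c*c + d*d acceptable.
std::complex<float> divPromoted(std::complex<float> x, std::complex<float> y) {
  double a = x.real(), b = x.imag(), c = y.real(), d = y.imag();
  double den = c * c + d * d;
  return {static_cast<float>((a * c + b * d) / den),
          static_cast<float>((b * c - a * d) / den)};
}
```

With operands such as (1e30f, 1e30f), forming `c * c` in float would overflow, while both variants above stay finite; the promoted form does so without the extra branch and divisions.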
[llvm-branch-commits] [llvm] llvm-reduce: Fix losing fast math flags in operands-to-args (PR #133421)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/133421

From 02186b904f0aefc91d83431f1de4c08f5c11909f Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Fri, 28 Mar 2025 18:00:05 +0700
Subject: [PATCH] llvm-reduce: Fix losing fast math flags in operands-to-args

---
 .../operands-to-args-preserve-fmf.ll | 20 +++
 .../deltas/ReduceOperandsToArgs.cpp | 4
 2 files changed, 24 insertions(+)
 create mode 100644 llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll

diff --git a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
new file mode 100644
index 0..b4b19ca28dbb5
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
@@ -0,0 +1,20 @@
+; RUN: llvm-reduce %s -o %t --abort-on-invalid-reduction --delta-passes=operands-to-args --test FileCheck --test-arg %s --test-arg --check-prefix=INTERESTING --test-arg --input-file
+; RUN: FileCheck %s --input-file %t --check-prefix=REDUCED
+
+; INTERESTING-LABEL: define float @callee(
+; INTERESTING: fadd float
+define float @callee(float %a) {
+  %x = fadd float %a, 1.0
+  ret float %x
+}
+
+; INTERESTING-LABEL: define float @caller(
+; INTERESTING: load float
+
+; REDUCED-LABEL: define float @caller(ptr %ptr, float %val, float %callee.ret1) {
+; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.00e+00)
+define float @caller(ptr %ptr) {
+  %val = load float, ptr %ptr
+  %callee.ret = call nnan nsz float @callee(float %val)
+  ret float %callee.ret
+}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
index b9e07f2c9f63c..e1c1c9c7372f9 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
@@ -14,6 +14,7 @@
 #include "llvm/IR/InstIterator.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/Operator.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Cloning.h"
 
@@ -107,6 +108,9 @@ static void replaceFunctionCalls(Function *OldF, Function *NewF) {
       NewCI->setCallingConv(NewF->getCallingConv());
       NewCI->setAttributes(CI->getAttributes());
 
+      if (auto *FPOp = dyn_cast<FPMathOperator>(NewCI))
+        NewCI->setFastMathFlags(CI->getFastMathFlags());
+
       // Do the replacement for this use.
       if (!CI->use_empty())
         CI->replaceAllUsesWith(NewCI);
[llvm-branch-commits] [llvm] [Metadata] Preserve MD_prof when merging instructions when one is missing. (PR #132433)
https://github.com/teresajohnson approved this pull request.

https://github.com/llvm/llvm-project/pull/132433
[llvm-branch-commits] [libcxx] libcxx: In gdb test detect execute_mi with feature check instead of version check. (PR #132291)
https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/132291

From 89ce369ab9b49b8c23a87ad0a888002dd85c094c Mon Sep 17 00:00:00 2001
From: Peter Collingbourne
Date: Thu, 20 Mar 2025 15:12:39 -0700
Subject: [PATCH] Format

Created using spr 1.3.6-beta.1
---
 libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
index 630b90c9d77a6..927f8958f4b43 100644
--- a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
+++ b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
@@ -30,7 +30,8 @@
 # we exit.
 has_run_tests = False
 
-has_execute_mi = 'execute_mi' in gdb.__dict__
+has_execute_mi = "execute_mi" in gdb.__dict__
+
 
 class CheckResult(gdb.Command):
     def __init__(self):
[llvm-branch-commits] [lldb] release/20.x: [lldb] Respect LaunchInfo::SetExecutable in ProcessLauncherPosixFork (#133093) (PR #134079)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/134079
[llvm-branch-commits] [llvm] llvm-reduce: Fix introducing unreachable code in simplify conditionals (PR #133842)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/133842?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#133842** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/133842?utm_source=stack-comment-view-in-graphite)
* **#133841**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/133842
[llvm-branch-commits] [llvm] release/20.x: [LoongArch][MC] Add relocation support for fld fst [x]vld [x]vst (PR #133836)
llvmbot wrote: @llvm/pr-subscribers-backend-loongarch Author: None (llvmbot) Changes Backport 725a7b664b92cd2e884806de5a08900b43d43cce d055e58334a91dcbaee22eb87bcdae85a1f33cd4 Requested by: @SixWeining --- Full diff: https://github.com/llvm/llvm-project/pull/133836.diff 6 Files Affected: - (modified) llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td (+2-2) - (modified) llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td (+2-2) - (modified) llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td (+2-2) - (modified) llvm/test/MC/LoongArch/Relocations/relocations.s (+30) - (modified) llvm/test/MC/LoongArch/lasx/invalid-imm.s (+6-6) - (modified) llvm/test/MC/LoongArch/lsx/invalid-imm.s (+6-6) ``diff diff --git a/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td b/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td index f66f620ca8b26..ce42236895c76 100644 --- a/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td +++ b/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td @@ -206,7 +206,7 @@ class FP_LOAD_3R op, RegisterClass rc = FPR32> : FPFmtMEM; class FP_LOAD_2RI12 op, RegisterClass rc = FPR32> -: FPFmt2RI12; } // hasSideEffects = 0, mayLoad = 1, mayStore = 0 @@ -215,7 +215,7 @@ class FP_STORE_3R op, RegisterClass rc = FPR32> : FPFmtMEM; class FP_STORE_2RI12 op, RegisterClass rc = FPR32> -: FPFmt2RI12; } // hasSideEffects = 0, mayLoad = 0, mayStore = 1 diff --git a/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td index 24b5ed5a9344f..7022fddf34100 100644 --- a/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td +++ b/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td @@ -186,10 +186,10 @@ class LASX2RI10_Load op, Operand ImmOpnd = simm10_lsl2> class LASX2RI11_Load op, Operand ImmOpnd = simm11_lsl1> : Fmt2RI11_XRI; -class LASX2RI12_Load op, Operand ImmOpnd = simm12> +class LASX2RI12_Load op, Operand ImmOpnd = simm12_addlike> : Fmt2RI12_XRI; -class LASX2RI12_Store op, Operand ImmOpnd = simm12> 
+class LASX2RI12_Store op, Operand ImmOpnd = simm12_addlike> : Fmt2RI12_XRI; diff --git a/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td index d2063a8aaae9b..e37de4f545a2a 100644 --- a/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td +++ b/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td @@ -374,10 +374,10 @@ class LSX2RI10_Load op, Operand ImmOpnd = simm10_lsl2> class LSX2RI11_Load op, Operand ImmOpnd = simm11_lsl1> : Fmt2RI11_VRI; -class LSX2RI12_Load op, Operand ImmOpnd = simm12> +class LSX2RI12_Load op, Operand ImmOpnd = simm12_addlike> : Fmt2RI12_VRI; -class LSX2RI12_Store op, Operand ImmOpnd = simm12> +class LSX2RI12_Store op, Operand ImmOpnd = simm12_addlike> : Fmt2RI12_VRI; diff --git a/llvm/test/MC/LoongArch/Relocations/relocations.s b/llvm/test/MC/LoongArch/Relocations/relocations.s index 091dce200b7de..f91a941295d9e 100644 --- a/llvm/test/MC/LoongArch/Relocations/relocations.s +++ b/llvm/test/MC/LoongArch/Relocations/relocations.s @@ -308,3 +308,33 @@ pcaddi $t1, %desc_pcrel_20(foo) # RELOC: R_LARCH_TLS_DESC_PCREL20_S2 foo 0x0 # INSTR: pcaddi $t1, %desc_pcrel_20(foo) # FIXUP: fixup A - offset: 0, value: %desc_pcrel_20(foo), kind: FK_NONE + +fld.s $ft1, $a0, %pc_lo12(foo) +# RELOC: R_LARCH_PCALA_LO12 foo 0x0 +# INSTR: fld.s $ft1, $a0, %pc_lo12(foo) +# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE + +fst.d $ft1, $a0, %pc_lo12(foo) +# RELOC: R_LARCH_PCALA_LO12 foo 0x0 +# INSTR: fst.d $ft1, $a0, %pc_lo12(foo) +# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE + +vld $vr9, $a0, %pc_lo12(foo) +# RELOC: R_LARCH_PCALA_LO12 foo 0x0 +# INSTR: vld $vr9, $a0, %pc_lo12(foo) +# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE + +vst $vr9, $a0, %pc_lo12(foo) +# RELOC: R_LARCH_PCALA_LO12 foo 0x0 +# INSTR: vst $vr9, $a0, %pc_lo12(foo) +# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE + +xvld $xr9, $a0, %pc_lo12(foo) +# RELOC: R_LARCH_PCALA_LO12 
foo 0x0 +# INSTR: xvld $xr9, $a0, %pc_lo12(foo) +# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE + +xvst $xr9, $a0, %pc_lo12(foo) +# RELOC: R_LARCH_PCALA_LO12 foo 0x0 +# INSTR: xvst $xr9, $a0, %pc_lo12(foo) +# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE diff --git a/llvm/test/MC/LoongArch/lasx/invalid-imm.s b/llvm/test/MC/LoongArch/lasx/invalid-imm.s index 6f64a6f87802b..adfd35367d7ba 100644 --- a/llvm/test/MC/LoongArch/lasx/invalid-imm.s +++ b/llvm/test/MC/LoongArch/lasx/invalid-imm.s @@ -1167,22 +1167,22 @@ xvldrepl.h $xr0, $a0, 2048 ## simm12 xvldrepl.b $xr0, $a0, -2049 -# CHECK: :[[#@LINE-1]]:23: error: immediate must be an integer in the range [-2048, 2047] +# CHECK: :[[#@LINE-1]]:23: error: operand must be a symbol with modifier (e.g. %pc_lo12) or an integer in the range [-2048, 2047] xvldrepl.b $xr0, $a0, 2048 -# CHECK: :[[#@LINE-1]]
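
The updated diagnostic text above describes an operand that is either a symbol with a modifier such as `%pc_lo12` or an integer in [-2048, 2047]. The following is a rough sketch of that acceptance rule only; `Operand`, `isSImm12` and `acceptsSImm12AddLike` are hypothetical names for this illustration and do not reflect how the LoongArch assembler parser is actually structured.

```cpp
#include <cassert>
#include <cstdint>

// True if Value fits a signed 12-bit immediate, i.e. [-2048, 2047].
inline bool isSImm12(int64_t Value) {
  return Value >= -2048 && Value <= 2047;
}

// Hypothetical operand model: either a symbol reference carrying a relocation
// modifier (e.g. %pc_lo12), or a plain integer immediate.
struct Operand {
  bool HasSymbolModifier;
  int64_t Imm;
};

// Mirrors the wording of the diagnostic: accept a symbol with a modifier, or
// an in-range integer. Which modifiers are valid is not modelled here.
inline bool acceptsSImm12AddLike(const Operand &Op) {
  return Op.HasSymbolModifier || isSImm12(Op.Imm);
}
```

Under this model `xvldrepl.b $xr0, $a0, -2049` is still rejected, only with the broader message, while `vld $vr9, $a0, %pc_lo12(foo)` is accepted and left for a relocation to resolve.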
[llvm-branch-commits] AArch64: Relax x16/x17 constraint on AUT in certain cases. (PR #132857)
https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/132857
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: reformulate the state for data-flow analysis (PR #131898)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/131898 >From da27c6c3ddaf09a97fff98365b457eb1e86828b0 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 17 Mar 2025 22:27:53 +0300 Subject: [PATCH 1/2] [BOLT] Gadget scanner: reformulate the state for data-flow analysis In preparation for implementing support for detection of non-protected call instructions, refine the definition of state which is computed for each register by data-flow analysis. Explicitly marking the registers which are known to be trusted at function entry is crucial for finding non-protected calls. In addition, it fixes less-common false negatives for pac-ret, such as `ret x1` in `f_nonx30_ret_non_auted` test case. --- bolt/include/bolt/Core/MCPlusBuilder.h| 10 ++ bolt/include/bolt/Passes/PAuthGadgetScanner.h | 7 +- bolt/lib/Passes/PAuthGadgetScanner.cpp| 129 +++--- .../Target/AArch64/AArch64MCPlusBuilder.cpp | 4 + .../AArch64/gs-pacret-autiasp.s | 19 ++- .../AArch64/gs-pacret-multi-bb.s | 3 +- 6 files changed, 104 insertions(+), 68 deletions(-) diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index b285138b77fe7..76ea2489e7038 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -551,6 +551,16 @@ class MCPlusBuilder { return Analysis->isReturn(Inst); } + /// Returns the registers that are trusted at function entry. + /// + /// Each register should be treated as if a successfully authenticated + /// pointer was written to it before entering the function (i.e. the + /// pointer is safe to jump to as well as to be signed). 
+ virtual SmallVector getTrustedLiveInRegs() const { +llvm_unreachable("not implemented"); +return {}; + } + virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const { llvm_unreachable("not implemented"); return getNoRegister(); diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index f102f1080e2e8..404dde2901767 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -209,13 +209,12 @@ struct Report { struct GadgetReport : public Report { const GadgetKind &Kind; - SmallVector AffectedRegisters; + SmallVector AffectedRegisters; std::vector OverwritingInstrs; GadgetReport(const GadgetKind &Kind, MCInstReference Location, - const BitVector &AffectedRegisters) - : Report(Location), Kind(Kind), -AffectedRegisters(AffectedRegisters.set_bits()) {} + MCPhysReg AffectedRegister) + : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {} void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 163e26c68cb9a..93a452b224233 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -126,18 +126,16 @@ class TrackedRegisters { // The security property that is checked is: // When a register is used as the address to jump to in a return instruction, -// that register must either: -// (a) never be changed within this function, i.e. have the same value as when -// the function started, or +// that register must be safe-to-dereference. It must either +// (a) be safe-to-dereference at function entry and never be changed within this +// function, i.e. have the same value as when the function started, or // (b) the last write to the register must be by an authentication instruction. 
// This property is checked by using dataflow analysis to keep track of which -// registers have been written (def-ed), since last authenticated. Those are -// exactly the registers containing values that should not be trusted (as they -// could have changed since the last time they were authenticated). For pac-ret, -// any return instruction using such a register is a gadget to be reported. For -// PAuthABI, probably at least any indirect control flow using such a register -// should be reported. +// registers have been written (def-ed), since last authenticated. For pac-ret, +// any return instruction using a register which is not safe-to-dereference is +// a gadget to be reported. For PAuthABI, probably at least any indirect control +// flow using such a register should be reported. // Furthermore, when producing a diagnostic for a found non-pac-ret protected // return, the analysis also lists the last instructions that wrote to the @@ -156,10 +154,29 @@ class TrackedRegisters { //in the gadgets to be reported. This information is used in the second run //to also track which instructions last wrote to those registers. +/// A state representing which registers are safe to use by an instruction +/// at a given program p
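
The reformulated property in the comments above (a return target must be safe-to-dereference: either trusted at entry and untouched, or last written by an authentication instruction) can be illustrated with a toy per-block analysis. This is only a sketch under simplifying assumptions, not the BOLT pass: registers are plain indices, instructions are reduced to three kinds, and `Inst`, `step`, `merge` and `findUnsafeReturns` are names invented here.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

constexpr unsigned NumRegs = 32;
using RegSet = std::bitset<NumRegs>;

enum class Kind { Write, Authenticate, Return };
struct Inst {
  Kind K;
  unsigned Reg; // register written, authenticated, or returned through
};

// Transfer function: a plain write makes the register unsafe, an
// authentication makes it safe again, a return changes nothing.
RegSet step(RegSet Safe, const Inst &I) {
  switch (I.K) {
  case Kind::Write:
    Safe.reset(I.Reg);
    break;
  case Kind::Authenticate:
    Safe.set(I.Reg);
    break;
  case Kind::Return:
    break;
  }
  return Safe;
}

// At a control-flow join a register is safe only if it is safe on every
// incoming path.
RegSet merge(RegSet A, RegSet B) { return A & B; }

// Returns the indices of return instructions whose target register is not
// safe-to-dereference, starting from the registers trusted at function entry.
std::vector<unsigned> findUnsafeReturns(RegSet TrustedAtEntry,
                                        const std::vector<Inst> &Block) {
  std::vector<unsigned> Reports;
  RegSet Safe = TrustedAtEntry;
  for (unsigned Idx = 0; Idx < Block.size(); ++Idx) {
    if (Block[Idx].K == Kind::Return && !Safe.test(Block[Idx].Reg))
      Reports.push_back(Idx);
    Safe = step(Safe, Block[Idx]);
  }
  return Reports;
}
```

Seeding the state with only x30 trusted is what lets a bare `ret x1` be reported even though x1 was never written, which is the kind of false negative the commit message mentions.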
[llvm-branch-commits] [libcxxabi] [release/18.x][backport][libc++abi] Use __has_feature check to enable usage of thread_local for exception storage (PR #132241)
ldionne wrote:

(I'm going to tentatively close this since, as I said, we're not cherry-picking stuff back to LLVM 18 anymore. Please reopen for more discussion.)

https://github.com/llvm/llvm-project/pull/132241
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
@@ -5026,10 +5026,24 @@ calculateRegisterUsage(VPlan &Plan, ArrayRef VFs,
           // even in the scalar case.
           RegUsage[ClassID] += 1;
         } else {
+          // The output from scaled phis and scaled reductions actually have
+          // fewer lanes than the VF.
+          auto VF = VFs[J];

SamTebbs33 wrote:

Done.

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] llvm-reduce: Reduce global variable code model (PR #133865)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/133865

The current API doesn't have a way to unset it. The query returns an optional, but the set doesn't. Alternatively I could switch the set to also use optional.

From 0336fe4e9c81d14560478be572b3ab970325552f Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Tue, 1 Apr 2025 12:38:18 +0700
Subject: [PATCH] llvm-reduce: Reduce global variable code model

The current API doesn't have a way to unset it. The query returns an
optional, but the set doesn't. Alternatively I could switch the set to
also use optional.
---
 llvm/include/llvm/IR/GlobalVariable.h | 4
 llvm/lib/IR/Globals.cpp | 9 +
 .../tools/llvm-reduce/reduce-code-model.ll | 18 ++
 .../llvm-reduce/deltas/ReduceGlobalValues.cpp | 3 ++-
 4 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/tools/llvm-reduce/reduce-code-model.ll

diff --git a/llvm/include/llvm/IR/GlobalVariable.h b/llvm/include/llvm/IR/GlobalVariable.h
index 83e484816d7d4..5ea5d3b11cd9a 100644
--- a/llvm/include/llvm/IR/GlobalVariable.h
+++ b/llvm/include/llvm/IR/GlobalVariable.h
@@ -289,6 +289,10 @@ class GlobalVariable : public GlobalObject, public ilist_node {
   ///
   void setCodeModel(CodeModel::Model CM);
 
+  /// Remove the code model for this global.
+  ///
+  void clearCodeModel();
+
   // Methods for support type inquiry through isa, cast, and dyn_cast:
   static bool classof(const Value *V) {
     return V->getValueID() == Value::GlobalVariableVal;
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 8ca44719a3f94..401f8ac58bce8 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -557,6 +557,15 @@ void GlobalVariable::setCodeModel(CodeModel::Model CM) {
   assert(getCodeModel() == CM && "Code model representation error!");
 }
 
+void GlobalVariable::clearCodeModel() {
+  unsigned CodeModelData = 0;
+  unsigned OldData = getGlobalValueSubClassData();
+  unsigned NewData = (OldData & ~(CodeModelMask << CodeModelShift)) |
+                     (CodeModelData << CodeModelShift);
+  setGlobalValueSubClassData(NewData);
+  assert(getCodeModel() == std::nullopt && "Code model representation error!");
+}
+
 //===--===//
 // GlobalAlias Implementation
 //===--===//
diff --git a/llvm/test/tools/llvm-reduce/reduce-code-model.ll b/llvm/test/tools/llvm-reduce/reduce-code-model.ll
new file mode 100644
index 0..898f5995d9826
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/reduce-code-model.ll
@@ -0,0 +1,18 @@
+; RUN: llvm-reduce -abort-on-invalid-reduction --delta-passes=global-values --test FileCheck --test-arg --check-prefix=INTERESTING --test-arg %s --test-arg --input-file %s -o %t.0
+; RUN: FileCheck --implicit-check-not=define --check-prefix=RESULT %s < %t.0
+
+; INTERESTING: @code_model_large_keep = global i32 0, code_model "large", align 4
+; INTERESTING @code_model_large_drop = global i32 0
+
+; RESULT: @code_model_large_keep = global i32 0, code_model "large", align 4{{$}}
+; RESULT: @code_model_large_drop = global i32 0, align 4{{$}}
+@code_model_large_keep = global i32 0, code_model "large", align 4
+@code_model_large_drop = global i32 0, code_model "large", align 4
+
+; INTERESTING: @code_model_tiny_keep = global i32 0, code_model "tiny", align 4
+; INTERESTING @code_model_tiny_drop = global i32 0
+
+; RESULT: @code_model_tiny_keep = global i32 0, code_model "tiny", align 4{{$}}
+; RESULT: @code_model_tiny_drop = global i32 0, align 4{{$}}
+@code_model_tiny_keep = global i32 0, code_model "tiny", align 4
+@code_model_tiny_drop = global i32 0, code_model "tiny", align 4
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp b/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp
index e56876c38032e..659bf8dd23eff 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp
@@ -70,7 +70,8 @@ void llvm::reduceGlobalValuesDeltaPass(Oracle &O, ReducerWorkItem &Program) {
         if (GVar->isExternallyInitialized() && !O.shouldKeep())
           GVar->setExternallyInitialized(false);
 
-        // TODO: Reduce code model
+        if (GVar->getCodeModel() && !O.shouldKeep())
+          GVar->clearCodeModel();
       }
     }
   }
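
The set/clear asymmetry discussed in the PR description can be sketched outside of LLVM as follows. This is an illustrative model, not the `GlobalVariable` code: the `Shift` and `Mask` values, the `Model` enumerators and the "stored value is model + 1, with 0 meaning unset" encoding are assumptions chosen for the example.

```cpp
#include <cassert>
#include <optional>

enum class Model : unsigned { Tiny, Small, Kernel, Medium, Large };

// Packs an optional code model into a few bits of a shared data word.
class GlobalBits {
  static constexpr unsigned Shift = 2;  // assumed position of the field
  static constexpr unsigned Mask = 0x7; // assumed 3-bit field
  unsigned Data;

public:
  explicit GlobalBits(unsigned Initial = 0) : Data(Initial) {}

  // A field of 0 means "no code model"; otherwise it stores model + 1.
  std::optional<Model> getCodeModel() const {
    unsigned Field = (Data >> Shift) & Mask;
    if (Field == 0)
      return std::nullopt;
    return static_cast<Model>(Field - 1);
  }

  void setCodeModel(Model M) {
    unsigned Field = static_cast<unsigned>(M) + 1;
    Data = (Data & ~(Mask << Shift)) | (Field << Shift);
  }

  // Zero only the code-model field, leaving the neighbouring bits intact.
  void clearCodeModel() { Data &= ~(Mask << Shift); }

  unsigned raw() const { return Data; }
};
```

The alternative floated in the description, a setter that takes an optional, would fold `clearCodeModel` into `setCodeModel(std::nullopt)`; the separate method keeps existing callers unchanged.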
[llvm-branch-commits] [libcxx] release/20.x: [libcxx] [test] Fix restoring LLVM_DIR and Clang_DIR (#132838) (PR #133153)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/133153

From 44a6f6abbdb6f0eebfaf1ad6f601c29f80782de7 Mon Sep 17 00:00:00 2001
From: Martin Storsjö
Date: Wed, 26 Mar 2025 22:13:28 +0200
Subject: [PATCH] [libcxx] [test] Fix restoring LLVM_DIR and Clang_DIR (#132838)

In 664f345cd53d1f624d94f9889a1c9fff803e3391, a fix was introduced,
attempting to restore LLVM_DIR and Clang_DIR after doing
find_package(Clang).

However, 6775285e7695f2d45cf455f5d31b2c9fa9362d3d added a return if the
clangTidy target wasn't found. If this is hit, we don't restore LLVM_DIR
and Clang_DIR, which causes strange effects if CMake is rerun a second
time.

Move the code for restoring LLVM_DIR and Clang_DIR to directly after the
find_package calls, to make sure they are restored, regardless of the
find_package outcome.

(cherry picked from commit 51bceb46f8eeb7c3d060387be315ca41855933c2)
---
 libcxx/test/tools/clang_tidy_checks/CMakeLists.txt | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt b/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt
index 0f8f0e8864d0f..da045fac92ce4 100644
--- a/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt
+++ b/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt
@@ -8,6 +8,10 @@ set(Clang_DIR_SAVE ${Clang_DIR})
 # versions must match. Otherwise there likely will be ODR-violations. This had
 # led to crashes and incorrect output of the clang-tidy based checks.
 find_package(Clang ${CMAKE_CXX_COMPILER_VERSION})
+
+set(LLVM_DIR "${LLVM_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for LLVM." FORCE)
+set(Clang_DIR "${Clang_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for Clang." FORCE)
+
 if(NOT Clang_FOUND)
   message(STATUS "Clang-tidy tests are disabled since the "
                  "Clang development package is unavailable.")
@@ -19,9 +23,6 @@ if(NOT TARGET clangTidy)
   return()
 endif()
 
-set(LLVM_DIR "${LLVM_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for LLVM." FORCE)
-set(Clang_DIR "${Clang_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for Clang." FORCE)
-
 message(STATUS "Found system-installed LLVM ${LLVM_PACKAGE_VERSION} with headers in ${LLVM_INCLUDE_DIRS}")
 
 set(CMAKE_CXX_STANDARD 20)
[llvm-branch-commits] [llvm] [CodeGen][StaticDataSplitter]Support constant pool partitioning (PR #129781)
@@ -0,0 +1,141 @@
+; RUN: llc -mtriple=aarch64 -enable-split-machine-functions \
+; RUN:     -partition-static-data-sections=true -function-sections=true \
+; RUN:     -unique-section-names=false \
+; RUN:     %s -o - 2>&1 | FileCheck %s --dump-input=always
+
+; Repeat the RUN command above for big-endian systems.
+; RUN: llc -mtriple=aarch64_be -enable-split-machine-functions \
+; RUN:     -partition-static-data-sections=true -function-sections=true \
+; RUN:     -unique-section-names=false \
+; RUN:     %s -o - 2>&1 | FileCheck %s --dump-input=always
+
+; Tests that constant pool hotness is aggregated across the module. The
+; static-data-splitter processes data from cold_func first, unprofiled_func
+; secondly, and then hot_func. Specifically, tests that
+; - If a constant is accessed by hot functions, all constant pools for this
+;   constant (e.g., from an unprofiled function, or cold function) should have
+;   `.hot` suffix.
+; - Similarly if a constant is accessed by both cold function and un-profiled
+;   function, constant pools for this constant should not have `.unlikely` suffix.
+
+; CHECK: .section .rodata.cst8.hot,"aM",@progbits,8
+; CHECK: .LCPI0_0:

mingmingl-llvm wrote:

Yes. Constant pools for the same function are emitted back to back and labels are named like `_LCPI_`. I grouped them by function and used `CHECK-NEXT` within each group to make the test tighter.

https://github.com/llvm/llvm-project/pull/129781
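
The two rules stated in the test comment can be sketched as a small aggregation function. This is a simplified model for illustration, not the static-data-splitter code: `FuncProfile` and `sectionSuffix` are invented names, and the all-cold case mapping to `.unlikely` is an assumption inferred from the second rule rather than something the quoted test states directly.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class FuncProfile { Cold, Unprofiled, Hot };

// Picks a section suffix for one constant from the profiles of all functions
// in the module that reference it. The result does not depend on the order in
// which the functions are visited.
std::string sectionSuffix(const std::vector<FuncProfile> &Users) {
  bool AnyHot = false;
  bool AllCold = !Users.empty();
  for (FuncProfile P : Users) {
    AnyHot = AnyHot || P == FuncProfile::Hot;
    AllCold = AllCold && P == FuncProfile::Cold;
  }
  if (AnyHot)
    return ".hot"; // any hot user makes every pool entry hot
  if (AllCold)
    return ".unlikely"; // assumed: only cold users
  return ""; // cold mixed with unprofiled: no suffix
}
```

Because the decision is made over the whole user set, processing cold_func before hot_func, as the test does, yields the same suffix as any other order.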
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds (PR #132353)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/132353 >From a8155cf5b7847a041be8d4252b20cae01d305404 Mon Sep 17 00:00:00 2001 From: Fabian Ritter Date: Fri, 21 Mar 2025 03:33:02 -0400 Subject: [PATCH] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds For flat memory instructions where the address is supplied as a base address register with an immediate offset, the memory aperture test ignores the immediate offset. Currently, ISel does not respect that, which leads to miscompilations where valid input programs crash when the address computation relies on the immediate offset to get the base address in the proper memory aperture. Global or scratch instructions are not affected. This patch only selects flat instructions with immediate offsets from address computations with the inbounds flag: If the address computation does not leave the bounds of the allocated object, it cannot leave the bounds of the memory aperture and is therefore safe to handle with an immediate offset. It also adds the inbounds flag to DAG nodes resulting from transformations: - Address computations resulting from getObjectPtrOffset. As far as I can tell, this function is only used to compute addresses within accessed memory ranges, e.g., for loads and stores that are split during legalization. - Reassociated inbounds adds. If both involved operations are inbounds, then so are operations after the transformation. - Address computations in the SelectionDAG lowering of the memcpy/move/set intrinsics. Base and result of the address arithmetic there are accessed, so the operation must be inbounds. It might make sense to separate these changes into their own PR, but I don't see a way to test them without adding a use of the inbounds SDAG flag. Affected tests: - CodeGen/AMDGPU/fold-gep-offset.ll: Offsets are no longer wrongly folded, added new positive tests where we still do fold them. 
- Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll: Offset folding doesn't seem integral to this test, so the test is not changed to make offset folding still happen. - CodeGen/AMDGPU/loop-prefetch-data.ll: loop-reduce prefers to base addresses on the potentially OOB addresses used for prefetching for memory accesses, that might be a separate issue to look into. - Added memset tests to CodeGen/AMDGPU/memintrinsic-unroll.ll to make sure that offsets in the memset DAG lowering are still folded properly. A similar patch for GlobalISel will follow. Fixes SWDEV-516125. --- llvm/include/llvm/CodeGen/SelectionDAG.h | 12 +- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 9 +- .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp | 12 +- llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp | 140 --- llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll | 374 +- .../test/CodeGen/AMDGPU/loop-prefetch-data.ll | 17 +- .../CodeGen/AMDGPU/memintrinsic-unroll.ll | 241 +++ .../InferAddressSpaces/AMDGPU/flat_atomic.ll | 6 +- 8 files changed, 717 insertions(+), 94 deletions(-) diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h index 15a2370e5d8b8..aa3668d3e9aae 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAG.h +++ b/llvm/include/llvm/CodeGen/SelectionDAG.h @@ -1069,7 +1069,8 @@ class SelectionDAG { SDValue EVL); /// Returns sum of the base pointer and offset. - /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap by default. + /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap and InBounds by + /// default. SDValue getMemBasePlusOffset(SDValue Base, TypeSize Offset, const SDLoc &DL, const SDNodeFlags Flags = SDNodeFlags()); SDValue getMemBasePlusOffset(SDValue Base, SDValue Offset, const SDLoc &DL, @@ -1077,15 +1078,18 @@ class SelectionDAG { /// Create an add instruction with appropriate flags when used for /// addressing some offset of an object. i.e. 
if a load is split into multiple - /// components, create an add nuw from the base pointer to the offset. + /// components, create an add nuw inbounds from the base pointer to the + /// offset. SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, TypeSize Offset) { -return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap); +return getMemBasePlusOffset( +Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds); } SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, SDValue Offset) { // The object itself can't wrap around the address space, so it shouldn't be // possible for the adds of the offsets to the split parts to overflow. -return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap); +return getMemBasePlusOffset( +Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds); } /// Return a new CALLSEQ_START node, that starts new call fram
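The hazard described in the commit message above can be modelled in a few lines. This is a toy sketch, not AMDGPU hardware behaviour or LLVM code: the address ranges and the names `Aperture`, `classify`, `intendedAperture`, `foldedAperture`, and `foldIsSafe` are hypothetical and chosen only for illustration. The one thing it assumes from the message is that the aperture test looks at the base register and ignores the immediate offset.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: the aperture boundaries below are made up.
enum class Aperture { Global, Scratch };

constexpr uint64_t ScratchBase = 0x1000;
constexpr uint64_t ScratchLimit = 0x2000;

// Classify an address by the (hypothetical) aperture it falls into.
constexpr Aperture classify(uint64_t Addr) {
  return (Addr >= ScratchBase && Addr < ScratchLimit) ? Aperture::Scratch
                                                      : Aperture::Global;
}

// What the program means: the full address decides the aperture.
constexpr Aperture intendedAperture(uint64_t Base, uint64_t Offset) {
  return classify(Base + Offset);
}

// What happens once the offset is folded into an immediate:
// only the base register takes part in the aperture test.
constexpr Aperture foldedAperture(uint64_t Base, uint64_t /*ImmOffset*/) {
  return classify(Base);
}

// In this model, folding is safe exactly when both views agree.
constexpr bool foldIsSafe(uint64_t Base, uint64_t Offset) {
  return intendedAperture(Base, Offset) == foldedAperture(Base, Offset);
}

// Base and base+offset stay inside the same region: folding is fine.
static_assert(foldIsSafe(0x1100, 0x40), "in-region offset should fold");
// The offset is what moves the address into the scratch range: unsafe.
static_assert(!foldIsSafe(0x0FF0, 0x20), "aperture-crossing offset");
```

An inbounds address computation cannot leave its allocated object, so in this model it can never be the second case; that is the property the patch relies on.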
[llvm-branch-commits] [clang] release/20.x: [clang] Do not infer lifetimebound for functions with void return type (#131997) (PR #133997)
https://github.com/AaronBallman approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/133997
[llvm-branch-commits] [clang] [BPF] Add default cpu change in ReleaseNotes (PR #131691)
https://github.com/yonghong-song updated https://github.com/llvm/llvm-project/pull/131691 From 70d891fcda64891e21129d6cc843ffca073fa255 Mon Sep 17 00:00:00 2001 From: Yonghong Song Date: Mon, 17 Mar 2025 15:54:25 -0700 Subject: [PATCH] [BPF] Add default cpu change in ReleaseNotes The pull request [1] changed bpf default cpu from -mcpu=v1 to -mcpu=v3 in clang20. Recently in [1], Yuval Deutscher suggested to add an entry to clang20 ReleaseNotes so users can easily find the change from documentation. [1] https://github.com/llvm/llvm-project/pull/107008 --- clang/docs/ReleaseNotes.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 02292c10e6964..a0e0128bcee2a 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1299,6 +1299,11 @@ AVR Support - Reject C/C++ compilation for avr1 devices which have no SRAM. +BPF Support +^^^ + +- Make ``-mcpu=v3`` as the default. + DWARF Support in Clang --
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/131899 >From 56106534f70a4be70f9edea3c6f631e286ac6340 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 18 Mar 2025 21:32:11 +0300 Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect non-protected indirect calls --- bolt/include/bolt/Core/MCPlusBuilder.h| 10 + bolt/lib/Passes/PAuthGadgetScanner.cpp| 33 +- .../Target/AArch64/AArch64MCPlusBuilder.cpp | 42 ++ .../binary-analysis/AArch64/gs-pauth-calls.s | 676 ++ 4 files changed, 757 insertions(+), 4 deletions(-) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-calls.s diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 76ea2489e7038..b3d54ccd5955d 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -577,6 +577,16 @@ class MCPlusBuilder { return getNoRegister(); } + /// Returns the register used as call destination, or no-register, if not + /// an indirect call. Sets IsAuthenticatedInternally if the instruction + /// accepts signed pointer as its operand and authenticates it internally. 
+ virtual MCPhysReg + getRegUsedAsCallDest(const MCInst &Inst, + bool &IsAuthenticatedInternally) const { +llvm_unreachable("not implemented"); +return getNoRegister(); + } + virtual bool isTerminator(const MCInst &Inst) const; virtual bool isNoop(const MCInst &Inst) const { diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index c81a586b02771..b8a0a80215ce2 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -382,11 +382,11 @@ class PacRetAnalysis public: std::vector - getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF, - const ArrayRef UsedDirtyRegs) const { + getLastClobberingInsts(const MCInst &Inst, BinaryFunction &BF, + const ArrayRef UsedDirtyRegs) { if (RegsToTrackInstsFor.empty()) return {}; -auto MaybeState = getStateAt(Ret); +auto MaybeState = getStateBefore(Inst); if (!MaybeState) llvm_unreachable("Expected State to be present"); const State &S = *MaybeState; @@ -434,6 +434,29 @@ static std::shared_ptr tryCheckReturn(const BinaryContext &BC, return std::make_shared(RetKind, Inst, RetReg); } +static std::shared_ptr tryCheckCall(const BinaryContext &BC, +const MCInstReference &Inst, +const State &S) { + static const GadgetKind CallKind("non-protected call found"); + if (!BC.MIB->isCall(Inst) && !BC.MIB->isBranch(Inst)) +return nullptr; + + bool IsAuthenticated = false; + MCPhysReg DestReg = BC.MIB->getRegUsedAsCallDest(Inst, IsAuthenticated); + if (IsAuthenticated || DestReg == BC.MIB->getNoRegister()) +return nullptr; + + LLVM_DEBUG({ +traceInst(BC, "Found call inst", Inst); +traceReg(BC, "Call destination reg", DestReg); +traceRegMask(BC, "SafeToDerefRegs", S.SafeToDerefRegs); + }); + if (S.SafeToDerefRegs[DestReg]) +return nullptr; + + return std::make_shared(CallKind, Inst, DestReg); +} + FunctionAnalysisResult Analysis::computeDfState(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) { @@ -450,10 +473,12 @@ 
Analysis::computeDfState(BinaryFunction &BF, for (BinaryBasicBlock &BB : BF) { for (int64_t I = 0, E = BB.size(); I < E; ++I) { MCInstReference Inst(&BB, I); - const State &S = *PRA.getStateAt(Inst); + const State &S = *PRA.getStateBefore(Inst); if (auto Report = tryCheckReturn(BC, Inst, S)) Result.Diagnostics.push_back(Report); + if (auto Report = tryCheckCall(BC, Inst, S)) +Result.Diagnostics.push_back(Report); } } diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp index d238a1df5c7d7..9ce1514639f95 100644 --- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp +++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp @@ -277,6 +277,48 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { } } + MCPhysReg + getRegUsedAsCallDest(const MCInst &Inst, + bool &IsAuthenticatedInternally) const override { +assert(isCall(Inst) || isBranch(Inst)); +IsAuthenticatedInternally = false; + +switch (Inst.getOpcode()) { +case AArch64::B: +case AArch64::BL: + assert(Inst.getOperand(0).isExpr()); + return getNoRegister(); +case AArch64::Bcc: +case AArch64::CBNZW: +case AArch64::CBNZX: +case AArch64::CBZW: +case AArch64::CBZX: + assert(Inst.getOperand(1).isExpr()); + return getNoRegister(); +case AArch64::TBNZW: +case AArch64::TBNZX: +case AArch64::TBZW: +case AArch64::TBZX: + assert(Ins
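The decision made by `tryCheckCall` in the patch above can be sketched without any BOLT types. The block below is a simplified stand-in under stated assumptions: `Reg`, `NoRegister`, `CallInfo`, `Report`, and `checkCall` are hypothetical names, and a `std::bitset` replaces the analysis state. It mirrors only the three early exits visible in the diff (direct call, internally authenticated call, destination register already known to be safe to dereference).

```cpp
#include <bitset>
#include <cassert>
#include <optional>

// Hypothetical register numbering for this sketch; 0 means "no register".
using Reg = unsigned;
constexpr Reg NoRegister = 0;

// Stand-in for what getRegUsedAsCallDest reports about one instruction.
struct CallInfo {
  Reg DestReg;                  // NoRegister for direct calls and branches
  bool AuthenticatedInternally; // instruction authenticates its own operand
};

// Stand-in for a gadget report: which register was used unprotected.
struct Report {
  Reg UnsafeReg;
};

// Returns a report only for an indirect call through a register that is
// neither authenticated by the instruction itself nor known to be safe.
std::optional<Report> checkCall(const CallInfo &Call,
                                const std::bitset<32> &SafeToDerefRegs) {
  if (Call.AuthenticatedInternally || Call.DestReg == NoRegister)
    return std::nullopt;
  if (SafeToDerefRegs[Call.DestReg])
    return std::nullopt;
  return Report{Call.DestReg};
}
```

With register 5 marked safe, `checkCall({7, false}, Safe)` would yield a report for register 7, while a direct call, an internally authenticated call, or a call through register 5 would yield nothing.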
[llvm-branch-commits] [clang] [Driver][RISCV] Integrate RISCV target in baremetal toolchain object and deprecate RISCVToolchain object.(3/3) (PR #121831)
https://github.com/quic-garvgupt edited https://github.com/llvm/llvm-project/pull/121831
[llvm-branch-commits] [llvm] [GlobalISel] Combine redundant sext_inreg (PR #131624)
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/131624 >From 3f3c67934d0c9ea34c11cbd24becc24541baf567 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Mon, 17 Mar 2025 13:54:59 +0100 Subject: [PATCH 1/2] [GlobalISel] Combine redundant sext_inreg --- .../llvm/CodeGen/GlobalISel/CombinerHelper.h | 3 + .../include/llvm/Target/GlobalISel/Combine.td | 9 +- .../GlobalISel/CombinerHelperCasts.cpp| 27 +++ .../combine-redundant-sext-inreg.mir | 164 ++ .../combine-sext-trunc-sextinreg.mir | 87 ++ .../CodeGen/AMDGPU/GlobalISel/llvm.abs.ll | 5 - 6 files changed, 289 insertions(+), 6 deletions(-) create mode 100644 llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir create mode 100644 llvm/test/CodeGen/AMDGPU/GlobalISel/combine-sext-trunc-sextinreg.mir diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h index 9b78342c8fc39..5778377d125a8 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h @@ -994,6 +994,9 @@ class CombinerHelper { // overflow sub bool matchSuboCarryOut(const MachineInstr &MI, BuildFnTy &MatchInfo) const; + // (sext_inreg (sext_inreg x, K0), K1) + void applyRedundantSextInReg(MachineInstr &Root, MachineInstr &Other) const; + private: /// Checks for legality of an indexed variant of \p LdSt. 
bool isIndexedLoadStoreLegal(GLoadStore &LdSt) const; diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td b/llvm/include/llvm/Target/GlobalISel/Combine.td index 660b03080f92e..6a0ff683a4647 100644 --- a/llvm/include/llvm/Target/GlobalISel/Combine.td +++ b/llvm/include/llvm/Target/GlobalISel/Combine.td @@ -1849,6 +1849,12 @@ def anyext_of_anyext : ext_of_ext_opcodes; def anyext_of_zext : ext_of_ext_opcodes; def anyext_of_sext : ext_of_ext_opcodes; +def sext_inreg_of_sext_inreg : GICombineRule< + (defs root:$dst), + (match (G_SEXT_INREG $x, $src, $a):$other, + (G_SEXT_INREG $dst, $x, $b):$root), + (apply [{ Helper.applyRedundantSextInReg(*${root}, *${other}); }])>; + // Push cast through build vector. class buildvector_of_opcode : GICombineRule < (defs root:$root, build_fn_matchinfo:$matchinfo), @@ -1896,7 +1902,8 @@ def cast_of_cast_combines: GICombineGroup<[ sext_of_anyext, anyext_of_anyext, anyext_of_zext, - anyext_of_sext + anyext_of_sext, + sext_inreg_of_sext_inreg, ]>; def cast_combines: GICombineGroup<[ diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp index 576fd5fd81703..883a62c308232 100644 --- a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp +++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp @@ -378,3 +378,30 @@ bool CombinerHelper::matchCastOfInteger(const MachineInstr &CastMI, return false; } } + +void CombinerHelper::applyRedundantSextInReg(MachineInstr &Root, + MachineInstr &Other) const { + assert(Root.getOpcode() == TargetOpcode::G_SEXT_INREG && + Other.getOpcode() == TargetOpcode::G_SEXT_INREG); + + unsigned RootWidth = Root.getOperand(2).getImm(); + unsigned OtherWidth = Other.getOperand(2).getImm(); + + Register Dst = Root.getOperand(0).getReg(); + Register OtherDst = Other.getOperand(0).getReg(); + Register Src = Other.getOperand(1).getReg(); + + if (RootWidth >= OtherWidth) { +// The root sext_inreg is entirely redundant because the other one +// is 
narrower. +Observer.changingAllUsesOfReg(MRI, Dst); +MRI.replaceRegWith(Dst, OtherDst); +Observer.finishedChangingAllUsesOfReg(); + } else { +// RootWidth < OtherWidth, rewrite this G_SEXT_INREG with the source of the +// other G_SEXT_INREG. +Builder.buildSExtInReg(Dst, Src, RootWidth); + } + + Root.eraseFromParent(); +} diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir new file mode 100644 index 0..566ee8e6c338d --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir @@ -0,0 +1,164 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -run-pass=amdgpu-regbank-combiner -verify-machineinstrs %s -o - | FileCheck %s + +--- +name: inreg8_inreg16 +tracksRegLiveness: true +body: | + bb.0: +liveins: $vgpr0 +; CHECK-LABEL: name: inreg8_inreg16 +; CHECK: liveins: $vgpr0 +; CHECK-NEXT: {{ $}} +; CHECK-NEXT: %copy:_(s32) = COPY $vgpr0 +; CHECK-NEXT: %inreg:_(s32) = G_SEXT_INREG %copy, 8 +; CHECK-NEXT: $vgpr0 = COPY %inreg(s32) +%copy:_(s32) = COPY $vgpr0 +%inreg:_(s32) = G_SEXT_INREG %copy, 8 +%inreg1:_(s32) = G_SEXT_INREG %inreg, 16 +$vgpr0 = COPY %inreg1 +... + +
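The arithmetic fact behind the combine above is that two nested in-register sign extensions behave like a single one at the smaller width. The sketch below checks that on 32-bit values; `sextInReg` and `foldedSextInReg` are hypothetical helpers written for this illustration and are not part of GlobalISel.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `Width` bits of X to 32 bits (1 <= Width <= 32).
// Uses the xor/subtract trick on unsigned values to avoid shifting
// negative numbers.
constexpr int32_t sextInReg(int32_t X, unsigned Width) {
  const uint32_t Mask = Width == 32 ? ~0u : ((1u << Width) - 1u);
  const uint32_t Low = static_cast<uint32_t>(X) & Mask;
  const uint32_t Sign = 1u << (Width - 1);
  return static_cast<int32_t>((Low ^ Sign) - Sign);
}

// (sext_inreg (sext_inreg x, K0), K1) expressed as one extension.
// If K1 >= K0 the outer operation is redundant; otherwise the outer,
// narrower width wins. Either way the result uses min(K0, K1).
constexpr int32_t foldedSextInReg(int32_t X, unsigned K0, unsigned K1) {
  return sextInReg(X, K0 < K1 ? K0 : K1);
}

static_assert(sextInReg(0x80, 8) == -128, "bit 7 is the sign bit");
static_assert(sextInReg(sextInReg(0x1234ABCD, 8), 16) ==
                  foldedSextInReg(0x1234ABCD, 8, 16),
              "outer wider extension is redundant");
static_assert(sextInReg(sextInReg(0x1234ABCD, 16), 8) ==
                  foldedSextInReg(0x1234ABCD, 16, 8),
              "outer narrower extension can read the original source");
```

The two branches of `applyRedundantSextInReg` correspond to the two `static_assert` cases: reuse the inner result when the root is at least as wide, or rebuild the root on the inner source when it is narrower.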
[llvm-branch-commits] [clang] [Driver] Add option to force undefined symbols during linking in BareMetal toolchain object. (PR #132807)
@@ -0,0 +1,15 @@ +// Check the arguments are correctly passed quic-garvgupt wrote: baremetal-ld.c is for LTO-related tests, so I am not clobbering it. I have renamed the test file to baremetal-undefined-symbols.c https://github.com/llvm/llvm-project/pull/132807
[llvm-branch-commits] [libcxxabi] [release/18.x][backport][libc++abi] Use __has_feature check to enable usage of thread_local for exception storage (PR #132241)
ldionne wrote: The LLVM 18 release was finished a long time ago; we're working on LLVM 20 now. https://github.com/llvm/llvm-project/pull/132241
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff ee0ee253d617aa4cddfe5216f93365645579b54d 3bdbe711b5f937d564e1883ec94e1c5ecbd87750 --extensions ,cpp,h -- clang/test/CodeGen/pfp-attribute-disable.cpp clang/test/CodeGen/pfp-load-store.cpp clang/test/CodeGen/pfp-memcpy.cpp clang/test/CodeGen/pfp-null-init.cpp clang/test/CodeGen/pfp-struct-gep.cpp clang/include/clang/AST/ASTContext.h clang/include/clang/Basic/LangOptions.h clang/lib/AST/ASTContext.cpp clang/lib/AST/ExprConstant.cpp clang/lib/AST/Type.cpp clang/lib/AST/TypePrinter.cpp clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/CGClass.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CGExprAgg.cpp clang/lib/CodeGen/CGExprCXX.cpp clang/lib/CodeGen/CGExprConstant.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenModule.h clang/lib/CodeGen/ItaniumCXXABI.cpp clang/lib/CodeGen/MicrosoftCXXABI.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Sema/SemaDeclAttr.cpp clang/lib/Sema/SemaExprCXX.cpp clang/test/CodeGenCXX/trivial_abi.cpp libcxx/include/__config libcxx/include/__functional/function.h libcxx/include/__memory/shared_ptr.h libcxx/include/__memory/unique_ptr.h libcxx/include/__tree libcxx/include/__type_traits/is_trivially_relocatable.h libcxx/include/__vector/vector.h libcxx/include/typeinfo libcxx/test/libcxx/gdb/gdb_pretty_printer_test.sh.cpp libcxxabi/include/__cxxabi_config.h libcxxabi/src/private_typeinfo.h llvm/include/llvm/Analysis/PtrUseVisitor.h llvm/include/llvm/Transforms/Utils/Local.h llvm/lib/Analysis/PtrUseVisitor.cpp llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp llvm/lib/Transforms/Scalar/SROA.cpp llvm/lib/Transforms/Utils/SimplifyCFG.cpp `` View the diff from clang-format here. 
``diff diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 9d824231d0..b4ac45a920 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -1299,8 +1299,7 @@ static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value *Val, /// destination type; in this situation the values of bits which not /// present in the src are undefined. static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy, - llvm::Type *Ty, - CodeGenFunction &CGF) { + llvm::Type *Ty, CodeGenFunction &CGF) { llvm::Type *SrcTy = Src.getElementType(); // If SrcTy and Ty are the same, just do a load. @@ -1344,7 +1343,8 @@ static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy, CharUnits Offset = CharUnits::Zero(); llvm::Value *Val = llvm::UndefValue::get(AT); for (unsigned i = 0; i != AT->getNumElements(); ++i, Offset += wordSize) -Val = CGF.Builder.CreateInsertValue(Val, LoadCoercedField(Offset, ET), i); +Val = +CGF.Builder.CreateInsertValue(Val, LoadCoercedField(Offset, ET), i); return Val; } auto *ST = cast(Ty); @@ -1426,10 +1426,8 @@ static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy, return CGF.Builder.CreateLoad(Tmp); } -void CodeGenFunction::CreateCoercedStore(llvm::Value *Src, - QualType SrcFETy, - Address Dst, - llvm::TypeSize DstSize, +void CodeGenFunction::CreateCoercedStore(llvm::Value *Src, QualType SrcFETy, + Address Dst, llvm::TypeSize DstSize, bool DstIsVolatile) { if (!DstSize) return; @@ -4119,8 +4117,7 @@ void CodeGenFunction::EmitFunctionEpilog(const CGFunctionInfo &FI, auto eltAddr = Builder.CreateStructGEP(addr, i); llvm::Value *elt = CreateCoercedLoad( - eltAddr, - RetTy, + eltAddr, RetTy, unpaddedStruct ? 
unpaddedStruct->getElementType(unpaddedIndex++) : unpaddedCoercionType, *this); @@ -5711,8 +5708,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, if (ABIArgInfo::isPaddingForCoerceAndExpand(eltType)) continue; Address eltAddr = Builder.CreateStructGEP(addr, i); llvm::Value *elt = CreateCoercedLoad( -eltAddr, -I->Ty, +eltAddr, I->Ty, unpaddedStruct ? unpaddedStruct->getElementType(unpaddedIndex++) : unpaddedCoercionType, *this); diff --git a/clang/lib/CodeGen/CGClass.cpp b/clang/lib/CodeGen/CGClass.cpp index ae1d78baed..9d3784cf63 100644 --- a/clang/lib/CodeGen/CGClass.cpp +++ b/clang/lib/CodeGen/CGClass.cpp @@ -672,7 +67
[llvm-branch-commits] [llvm] release/20.x: [X86][AVX10.2] Include changes for COMX and VGETEXP from rev. 2 (#132824) (PR #132932)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/132932
[llvm-branch-commits] [lld] [lld][LoongArch] Convert TLS IE to LE in the normal or medium code model (PR #123680)
https://github.com/ylzsx updated https://github.com/llvm/llvm-project/pull/123680 >From 0f580567169ffbf1546a5389ab4b9f7d1fc07c71 Mon Sep 17 00:00:00 2001 From: yangzhaoxin Date: Thu, 2 Jan 2025 20:58:56 +0800 Subject: [PATCH 1/6] Convert TLS IE to LE in the normal or medium code model. Original code sequence: * pcalau12i $a0, %ie_pc_hi20(sym) * ld.d $a0, $a0, %ie_pc_lo12(sym) The code sequence converted is as follows: * lu12i.w $a0, %ie_pc_hi20(sym) # le_hi20 != 0, otherwise NOP * ori $a0 $a0, %ie_pc_lo12(sym) FIXME: When relaxation enables, redundant NOP can be removed. This will be implemented in a future patch. Note: In the normal or medium code model, original code sequence with relocations can appear interleaved, because converted code sequence calculates the absolute offset. However, in extreme code model, to identify the current code model, the first four instructions with relocations must appear consecutively. --- lld/ELF/Arch/LoongArch.cpp | 87 ++ lld/ELF/Relocations.cpp| 15 ++- 2 files changed, 101 insertions(+), 1 deletion(-) diff --git a/lld/ELF/Arch/LoongArch.cpp b/lld/ELF/Arch/LoongArch.cpp index 4edc625b05cb0..f9a22a7bd5218 100644 --- a/lld/ELF/Arch/LoongArch.cpp +++ b/lld/ELF/Arch/LoongArch.cpp @@ -39,7 +39,11 @@ class LoongArch final : public TargetInfo { void relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const override; bool relaxOnce(int pass) const override; + void relocateAlloc(InputSectionBase &sec, uint8_t *buf) const override; void finalizeRelax(int passes) const override; + +private: + void tlsIeToLe(uint8_t *loc, const Relocation &rel, uint64_t val) const; }; } // end anonymous namespace @@ -53,6 +57,8 @@ enum Op { ADDI_W = 0x0280, ADDI_D = 0x02c0, ANDI = 0x0340, + ORI = 0x0380, + LU12I_W = 0x1400, PCADDI = 0x1800, PCADDU12I = 0x1c00, LD_W = 0x2880, @@ -1002,6 +1008,87 @@ static bool relax(Ctx &ctx, InputSection &sec) { return changed; } +// Convert TLS IE to LE in the normal or medium code model. 
+// Original code sequence: +// * pcalau12i $a0, %ie_pc_hi20(sym) +// * ld.d $a0, $a0, %ie_pc_lo12(sym) +// +// The code sequence converted is as follows: +// * lu12i.w $a0, %le_hi20(sym) # le_hi20 != 0, otherwise NOP +// * ori $a0 $a0, %le_lo12(sym) +// +// When relaxation enables, redundant NOPs can be removed. +void LoongArch::tlsIeToLe(uint8_t *loc, const Relocation &rel, + uint64_t val) const { + assert(isInt<32>(val) && + "val exceeds the range of medium code model in tlsIeToLe"); + + bool isUInt12 = isUInt<12>(val); + const uint32_t currInsn = read32le(loc); + switch (rel.type) { + case R_LARCH_TLS_IE_PC_HI20: +if (isUInt12) + write32le(loc, insn(ANDI, R_ZERO, R_ZERO, 0)); // nop +else + write32le(loc, insn(LU12I_W, getD5(currInsn), extractBits(val, 31, 12), + 0)); // lu12i.w $a0, %le_hi20 +break; + case R_LARCH_TLS_IE_PC_LO12: +if (isUInt12) + write32le(loc, insn(ORI, getD5(currInsn), R_ZERO, + val)); // ori $a0, $r0, %le_lo12 +else + write32le(loc, insn(ORI, getD5(currInsn), getJ5(currInsn), + lo12(val))); // ori $a0, $a0, %le_lo12 +break; + } +} + +void LoongArch::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const { + const unsigned bits = ctx.arg.is64 ? 64 : 32; + uint64_t secAddr = sec.getOutputSection()->addr; + if (auto *s = dyn_cast(&sec)) +secAddr += s->outSecOff; + else if (auto *ehIn = dyn_cast(&sec)) +secAddr += ehIn->getParent()->outSecOff; + bool isExtreme = false; + const MutableArrayRef relocs = sec.relocs(); + for (size_t i = 0, size = relocs.size(); i != size; ++i) { +Relocation &rel = relocs[i]; +uint8_t *loc = buf + rel.offset; +uint64_t val = SignExtend64( +sec.getRelocTargetVA(ctx, rel, secAddr + rel.offset), bits); + +switch (rel.expr) { +case R_RELAX_HINT: + continue; +case R_RELAX_TLS_IE_TO_LE: + if (rel.type == R_LARCH_TLS_IE_PC_HI20) { +// LoongArch does not support IE to LE optimize in the extreme code +// model. 
In this case, the relocs are as follows: +// +// * i -- R_LARCH_TLS_IE_PC_HI20 +// * i+1 -- R_LARCH_TLS_IE_PC_LO12 +// * i+2 -- R_LARCH_TLS_IE64_PC_LO20 +// * i+3 -- R_LARCH_TLS_IE64_PC_HI12 +isExtreme = +(i + 2 < size && relocs[i + 2].type == R_LARCH_TLS_IE64_PC_LO20); + } + if (isExtreme) { +rel.expr = getRelExpr(rel.type, *rel.sym, loc); +val = SignExtend64(sec.getRelocTargetVA(ctx, rel, secAddr + rel.offset), + bits); +relocateNoSym(loc, rel.type, val); + } else +tlsIeToLe(loc, rel, val); + continue; +default: + break; +} +relocate(loc, rel, val); + } +} + // Wh
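The value split performed by `tlsIeToLe` in the patch above can be illustrated on plain integers. The sketch below is a simplified model, not lld code: `LeSequence`, `splitLeOffset`, and `materialise` are hypothetical names, and it works on a 32-bit unsigned offset only, ignoring the sign extension that `lu12i.w` performs on 64-bit targets.

```cpp
#include <cassert>
#include <cstdint>

// Operands of the two rewritten instructions in this toy model.
struct LeSequence {
  bool FirstIsNop; // true when the offset fits in 12 unsigned bits
  uint32_t Hi20;   // immediate for lu12i.w (bits 31..12 of the offset)
  uint32_t Lo12;   // immediate for ori (bits 11..0 of the offset)
};

// Split a thread-pointer-relative offset the way the comment in the
// patch describes: small offsets need only the ori, larger ones need
// lu12i.w followed by ori.
constexpr LeSequence splitLeOffset(uint32_t Val) {
  return Val < (1u << 12)
             ? LeSequence{true, 0, Val}
             : LeSequence{false, (Val >> 12) & 0xFFFFF, Val & 0xFFF};
}

// Value the pair of instructions would leave in the register:
// lu12i.w places Hi20 in bits 31..12, ori ORs in the low 12 bits.
constexpr uint32_t materialise(const LeSequence &S) {
  return (S.FirstIsNop ? 0u : (S.Hi20 << 12)) | S.Lo12;
}

static_assert(splitLeOffset(0x7FF).FirstIsNop, "small offset: first insn is a nop");
static_assert(materialise(splitLeOffset(0x7FF)) == 0x7FF, "round trip, small");
static_assert(materialise(splitLeOffset(0x12345678)) == 0x12345678,
              "round trip, large");
```

Because both halves are derived from the absolute offset, the two rewritten instructions do not depend on each other's position, which matches the note in the commit message that the original pair may appear interleaved in the normal and medium code models.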
[llvm-branch-commits] [llvm] llvm-reduce: Reduce global variable code model (PR #133865)
arsenm wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. * **#133865** 👈 (View in Graphite) * **#133859** * `main` This stack of pull requests is managed by Graphite. https://github.com/llvm/llvm-project/pull/133865
[llvm-branch-commits] [llvm] 244daaf - Revert "[GlobalOpt] Handle operators separately when removing GV users (#84694)"
Author: Eli Friedman Date: 2025-03-25T11:17:45-07:00 New Revision: 244daaf51c11c04971f6adb144bbfacf4074b1a8 URL: https://github.com/llvm/llvm-project/commit/244daaf51c11c04971f6adb144bbfacf4074b1a8 DIFF: https://github.com/llvm/llvm-project/commit/244daaf51c11c04971f6adb144bbfacf4074b1a8.diff LOG: Revert "[GlobalOpt] Handle operators separately when removing GV users (#84694)" This reverts commit 51dad714e82e3e15c339aade8be605ed09bbabab. Added: Modified: llvm/lib/Transforms/IPO/GlobalOpt.cpp llvm/test/Transforms/GlobalOpt/cleanup-pointer-root-users-gep-constexpr.ll llvm/test/Transforms/GlobalOpt/dead-store-status.ll llvm/test/Transforms/GlobalOpt/pr54572.ll Removed: diff --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp b/llvm/lib/Transforms/IPO/GlobalOpt.cpp index 7b7b3802d7a77..2d046f09f1b2b 100644 --- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp +++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp @@ -114,6 +114,55 @@ static cl::opt ColdCCRelFreq( "entry frequency, for a call site to be considered cold for enabling " "coldcc")); +/// Is this global variable possibly used by a leak checker as a root? If so, +/// we might not really want to eliminate the stores to it. +static bool isLeakCheckerRoot(GlobalVariable *GV) { + // A global variable is a root if it is a pointer, or could plausibly contain + // a pointer. There are two challenges; one is that we could have a struct + // the has an inner member which is a pointer. We recurse through the type to + // detect these (up to a point). The other is that we may actually be a union + // of a pointer and another type, and so our LLVM type is an integer which + // gets converted into a pointer, or our type is an [i8 x #] with a pointer + // potentially contained here. 
+ + if (GV->hasPrivateLinkage()) +return false; + + SmallVector Types; + Types.push_back(GV->getValueType()); + + unsigned Limit = 20; + do { +Type *Ty = Types.pop_back_val(); +switch (Ty->getTypeID()) { + default: break; + case Type::PointerTyID: +return true; + case Type::FixedVectorTyID: + case Type::ScalableVectorTyID: +if (cast(Ty)->getElementType()->isPointerTy()) + return true; +break; + case Type::ArrayTyID: +Types.push_back(cast(Ty)->getElementType()); +break; + case Type::StructTyID: { +StructType *STy = cast(Ty); +if (STy->isOpaque()) return true; +for (Type *InnerTy : STy->elements()) { + if (isa(InnerTy)) return true; + if (isa(InnerTy) || isa(InnerTy) || + isa(InnerTy)) +Types.push_back(InnerTy); +} +break; + } +} +if (--Limit == 0) return true; + } while (!Types.empty()); + return false; +} + /// Given a value that is stored to a global but never read, determine whether /// it's safe to remove the store and the chain of computation that feeds the /// store. @@ -122,7 +171,7 @@ static bool IsSafeComputationToRemove( do { if (isa(V)) return true; -if (V->hasNUsesOrMore(1)) +if (!V->hasOneUse()) return false; if (isa(V) || isa(V) || isa(V) || isa(V)) @@ -144,12 +193,90 @@ static bool IsSafeComputationToRemove( } while (true); } +/// This GV is a pointer root. Loop over all users of the global and clean up +/// any that obviously don't assign the global a value that isn't dynamically +/// allocated. +static bool +CleanupPointerRootUsers(GlobalVariable *GV, +function_ref GetTLI) { + // A brief explanation of leak checkers. The goal is to find bugs where + // pointers are forgotten, causing an accumulating growth in memory + // usage over time. The common strategy for leak checkers is to explicitly + // allow the memory pointed to by globals at exit. This is popular because it + // also solves another problem where the main thread of a C++ program may shut + // down before other threads that are still expecting to use those globals. 
To + // handle that case, we expect the program may create a singleton and never + // destroy it. + + bool Changed = false; + + // If Dead[n].first is the only use of a malloc result, we can delete its + // chain of computation and the store to the global in Dead[n].second. + SmallVector, 32> Dead; + + SmallVector Worklist(GV->users()); + // Constants can't be pointers to dynamically allocated memory. + while (!Worklist.empty()) { +User *U = Worklist.pop_back_val(); +if (StoreInst *SI = dyn_cast(U)) { + Value *V = SI->getValueOperand(); + if (isa(V)) { +Changed = true; +SI->eraseFromParent(); + } else if (Instruction *I = dyn_cast(V)) { +if (I->hasOneUse()) + Dead.push_back(std::make_pair(I, SI)); + } +} else if (MemSetInst *MSI = dyn_cast(U)) { + if (isa(MSI->getV
[llvm-branch-commits] [llvm] llvm-reduce: Defer a shouldKeep call in operand reduction (PR #133387)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/133387 From fa597dd4161693813a3566fd1d4a3c7df1d00746 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 28 Mar 2025 12:58:20 +0700 Subject: [PATCH] llvm-reduce: Defer a shouldKeep call in operand reduction Ideally shouldKeep is only called in contexts that will successfully do something. --- llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp index b0bca015434fa..8b6446725b7d4 100644 --- a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp +++ b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp @@ -26,8 +26,8 @@ extractOperandsFromModule(Oracle &O, ReducerWorkItem &WorkItem, for (auto &I : instructions(&F)) { if (PHINode *Phi = dyn_cast(&I)) { for (auto &Op : Phi->incoming_values()) { - if (!O.shouldKeep()) { -if (Value *Reduced = ReduceValue(Op)) + if (Value *Reduced = ReduceValue(Op)) { +if (!O.shouldKeep()) Phi->setIncomingValueForBlock(Phi->getIncomingBlock(Op), Reduced); } } }
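The effect of swapping the two conditions in the patch above is easiest to see by counting oracle queries. The block below is a toy model under stated assumptions: `CountingOracle`, `canReduce`, `Stats`, `shouldKeepFirst`, and `shouldKeepDeferred` are hypothetical stand-ins, the oracle here never keeps anything, and "reducible" is arbitrarily defined as "even". It is not llvm-reduce code.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the reduction oracle: counts how often it is asked.
struct CountingOracle {
  int Queries = 0;
  bool shouldKeep() {
    ++Queries;
    return false; // this toy oracle never asks to keep anything
  }
};

// Stand-in for ReduceValue succeeding: only even operands are reducible.
bool canReduce(int Op) { return Op % 2 == 0; }

struct Stats {
  int Queries;
  int Rewrites;
};

// Old order: ask the oracle first, even when nothing can be done.
Stats shouldKeepFirst(const std::vector<int> &Ops) {
  CountingOracle O;
  int Rewrites = 0;
  for (int Op : Ops)
    if (!O.shouldKeep())
      if (canReduce(Op))
        ++Rewrites;
  return {O.Queries, Rewrites};
}

// New order: only spend an oracle query on operands that can be reduced.
Stats shouldKeepDeferred(const std::vector<int> &Ops) {
  CountingOracle O;
  int Rewrites = 0;
  for (int Op : Ops)
    if (canReduce(Op))
      if (!O.shouldKeep())
        ++Rewrites;
  return {O.Queries, Rewrites};
}
```

For the operands `{1, 2, 3, 4, 5}` both orders perform two rewrites in this model, but the first asks the oracle five times and the second only twice, which is the point of the commit message: the oracle is consulted only where something can actually happen.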
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)
llvmbot wrote:

@llvm/pr-subscribers-llvm-transforms

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

Given the same branch condition in `a` and `c` SimplifyCFG converts:

    +> b -+
    |     v
--> a --> c --> e -->
          |     ^
          +> d -+

into:

    +--> bcd ---+
    |           v
--> a --> c --> e -->

Remap source atoms on instructions duplicated from `c` into `bcd`.

---
Full diff: https://github.com/llvm/llvm-project/pull/133484.diff

2 Files Affected:

- (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+6-6)
- (added) llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll (+62)

```diff
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 1ba1e4ac81000..c83ff0260e297 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -3589,7 +3589,7 @@ foldCondBranchOnValueKnownInPredecessorImpl(BranchInst *BI, DomTreeUpdater *DTU,
     // instructions into EdgeBB. We know that there will be no uses of the
     // cloned instructions outside of EdgeBB.
     BasicBlock::iterator InsertPt = EdgeBB->getFirstInsertionPt();
-    DenseMap TranslateMap; // Track translated values.
+    ValueToValueMapTy TranslateMap; // Track translated values.
     TranslateMap[Cond] = CB;
 
     // RemoveDIs: track instructions that we optimise away while folding, so
@@ -3609,11 +3609,11 @@ foldCondBranchOnValueKnownInPredecessorImpl(BranchInst *BI, DomTreeUpdater *DTU,
       N->setName(BBI->getName() + ".c");
 
       // Update operands due to translation.
-      for (Use &Op : N->operands()) {
-        DenseMap::iterator PI = TranslateMap.find(Op);
-        if (PI != TranslateMap.end())
-          Op = PI->second;
-      }
+      // Key Instructions: Remap all the atom groups.
+      if (const DebugLoc &DL = BBI->getDebugLoc())
+        mapAtomInstance(DL, TranslateMap);
+      RemapInstruction(N, TranslateMap,
+                       RF_IgnoreMissingLocals | RF_NoModuleLevelChanges);
 
       // Check for trivial simplification.
       if (Value *V = simplifyInstruction(N, {DL, nullptr, nullptr, AC})) {
diff --git a/llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll b/llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll
new file mode 100644
index 0..f8477600c6418
--- /dev/null
+++ b/llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll
@@ -0,0 +1,62 @@
+; RUN: opt %s -passes=simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S \
+; RUN: | FileCheck %s
+
+;; Generated using:
+;;   opt -passes=debugify --debugify-atoms --debugify-level=locations \
+;;     llvm/test/Transforms/SimplifyCFG/debug-info-thread-phi.ll
+;; With unused/untested metadata nodes removed.
+
+;; Check the duplicated store gets distinct atom info in each branch.
+
+; CHECK-LABEL: @bar(
+; CHECK: if.then:
+; CHECK: store i32 1{{.*}}, !dbg [[DBG1:!.*]]
+; CHECK: if.end.1.critedge:
+; CHECK: store i32 1{{.*}}, !dbg [[DBG2:!.*]]
+; CHECK: [[DBG1]] = !DILocation(line: 1{{.*}}, atomGroup: 1
+; CHECK: [[DBG2]] = !DILocation(line: 1{{.*}}, atomGroup: 2
+
+define void @bar(i32 %aa) !dbg !5 {
+entry:
+  %aa.addr = alloca i32, align 4
+  %bb = alloca i32, align 4
+  store i32 %aa, ptr %aa.addr, align 4
+  store i32 0, ptr %bb, align 4
+  %tobool = icmp ne i32 %aa, 0
+  br i1 %tobool, label %if.then, label %if.end
+
+if.then: ; preds = %entry
+  call void @foo()
+  br label %if.end
+
+if.end: ; preds = %if.then, %entry
+  store i32 1, ptr %bb, align 4, !dbg !8
+  br i1 %tobool, label %if.then.1, label %if.end.1
+
+if.then.1: ; preds = %if.end
+  call void @foo()
+  br label %if.end.1
+
+if.end.1: ; preds = %if.then.1, %if.end
+  store i32 2, ptr %bb, align 4
+  br label %for.end
+
+for.end: ; preds = %if.end.1
+  ret void
+}
+
+declare void @foo()
+
+!llvm.dbg.cu = !{!0}
+!llvm.debugify = !{!2, !3}
+!llvm.module.flags = !{!4}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
+!1 = !DIFile(filename: "llvm/test/Transforms/SimplifyCFG/debug-info-thread-phi.ll", directory: "/")
+!2 = !{i32 15}
+!3 = !{i32 0}
+!4 = !{i32 2, !"Debug Info Version", i32 3}
+!5 = distinct !DISubprogram(name: "bar", linkageName: "bar", scope: null, file: !1, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!6 = !DISubroutineType(types: !7)
+!7 = !{}
+!8 = !DILocation(line: 1, column: 1, scope: !5, atomGroup: 1, atomRank: 1)
```

https://github.com/llvm/llvm-project/pull/133484
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https
[llvm-branch-commits] [lldb] release/20.x: [lldb] Respect LaunchInfo::SetExecutable in ProcessLauncherPosixFork (#133093) (PR #134079)
llvmbot wrote:

@llvm/pr-subscribers-lldb

Author: None (llvmbot)

Changes

Backport 39e7efe1e4304544289d8d1b45f4d04d11b4a791

Requested by: @DavidSpickett

---
Full diff: https://github.com/llvm/llvm-project/pull/134079.diff

2 Files Affected:

- (modified) lldb/source/Host/posix/ProcessLauncherPosixFork.cpp (+6-2)
- (modified) lldb/unittests/Host/HostTest.cpp (+42-1)

```diff
diff --git a/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp b/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp
index 7d856954684c4..903b18b10976c 100644
--- a/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp
+++ b/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp
@@ -94,6 +94,7 @@ struct ForkLaunchInfo {
   bool debug;
   bool disable_aslr;
   std::string wd;
+  std::string executable;
   const char **argv;
   Environment::Envp envp;
   std::vector actions;
@@ -194,7 +195,8 @@ struct ForkLaunchInfo {
   }
 
   // Execute. We should never return...
-  execve(info.argv[0], const_cast<char *const *>(info.argv), info.envp);
+  execve(info.executable.c_str(), const_cast<char *const *>(info.argv),
+         info.envp);
 
 #if defined(__linux__)
   if (errno == ETXTBSY) {
@@ -207,7 +209,8 @@ struct ForkLaunchInfo {
     // Since this state should clear up quickly, wait a while and then give it
     // one more go.
     usleep(5);
-    execve(info.argv[0], const_cast<char *const *>(info.argv), info.envp);
+    execve(info.executable.c_str(), const_cast<char *const *>(info.argv),
+           info.envp);
   }
 #endif
 
@@ -246,6 +249,7 @@ ForkLaunchInfo::ForkLaunchInfo(const ProcessLaunchInfo &info)
       debug(info.GetFlags().Test(eLaunchFlagDebug)),
       disable_aslr(info.GetFlags().Test(eLaunchFlagDisableASLR)),
       wd(info.GetWorkingDirectory().GetPath()),
+      executable(info.GetExecutableFile().GetPath()),
       argv(info.GetArguments().GetConstArgumentVector()),
       envp(FixupEnvironment(info.GetEnvironment())),
       actions(MakeForkActions(info)) {}
diff --git a/lldb/unittests/Host/HostTest.cpp b/lldb/unittests/Host/HostTest.cpp
index a1d8a3b7f485a..ed1df6de001ea 100644
--- a/lldb/unittests/Host/HostTest.cpp
+++ b/lldb/unittests/Host/HostTest.cpp
@@ -7,12 +7,24 @@
 //===--===//
 
 #include "lldb/Host/Host.h"
+#include "TestingSupport/SubsystemRAII.h"
+#include "lldb/Host/FileSystem.h"
+#include "lldb/Host/ProcessLaunchInfo.h"
 #include "lldb/Utility/ProcessInfo.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Testing/Support/Error.h"
 #include "gtest/gtest.h"
+#include <future>
 
 using namespace lldb_private;
 using namespace llvm;
 
+// From TestMain.cpp.
+extern const char *TestMainArgv0;
+
+static cl::opt test_arg("test-arg");
+
 TEST(Host, WaitStatusFormat) {
   EXPECT_EQ("W01", formatv("{0:g}", WaitStatus{WaitStatus::Exit, 1}).str());
   EXPECT_EQ("X02", formatv("{0:g}", WaitStatus{WaitStatus::Signal, 2}).str());
@@ -45,4 +57,33 @@ TEST(Host, ProcessInstanceInfoCumulativeSystemTimeIsValid) {
   EXPECT_TRUE(info.CumulativeSystemTimeIsValid());
   info.SetCumulativeSystemTime(ProcessInstanceInfo::timespec{1, 0});
   EXPECT_TRUE(info.CumulativeSystemTimeIsValid());
-}
\ No newline at end of file
+}
+
+TEST(Host, LaunchProcessSetsArgv0) {
+  SubsystemRAII subsystems;
+
+  static constexpr StringLiteral TestArgv0 = "HelloArgv0";
+  if (test_arg != 0) {
+    // In subprocess
+    if (TestMainArgv0 != TestArgv0) {
+      errs() << formatv("Got '{0}' for argv[0]\n", TestMainArgv0);
+      exit(1);
+    }
+    exit(0);
+  }
+
+  ProcessLaunchInfo info;
+  info.SetExecutableFile(
+      FileSpec(llvm::sys::fs::getMainExecutable(TestMainArgv0, &test_arg)),
+      /*add_exe_file_as_first_arg=*/false);
+  info.GetArguments().AppendArgument("HelloArgv0");
+  info.GetArguments().AppendArgument(
+      "--gtest_filter=Host.LaunchProcessSetsArgv0");
+  info.GetArguments().AppendArgument("--test-arg=47");
+  std::promise<int> exit_status;
+  info.SetMonitorProcessCallback([&](lldb::pid_t pid, int signal, int status) {
+    exit_status.set_value(status);
+  });
+  ASSERT_THAT_ERROR(Host::LaunchProcess(info).takeError(), Succeeded());
+  ASSERT_THAT(exit_status.get_future().get(), 0);
+}
```

https://github.com/llvm/llvm-project/pull/134079
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
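
The distinction the backport above relies on, that the path handed to `execve` and `argv[0]` are independent, can be sketched outside of LLDB. This is a minimal POSIX-only illustration, not LLDB code: `runWithArgv` is a hypothetical helper, and the exit code 127 for a failed `execve` is a convention chosen here.

```cpp
#include <cassert>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Hypothetical helper (not an LLDB API): run `executable` with an arbitrary
// argument vector and return the child's exit code, or -1 on failure.
// args[0] becomes the child's argv[0] and need not match `executable`.
int runWithArgv(const std::string &executable, std::vector<std::string> args) {
  std::vector<char *> argv;
  for (std::string &arg : args)
    argv.push_back(&arg[0]);
  argv.push_back(nullptr); // execve expects a null-terminated array

  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    // Child: the path comes from `executable`, not from argv[0].
    execve(executable.c_str(), argv.data(), environ);
    _exit(127); // only reached if execve failed
  }

  int status = 0;
  if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status);
}
```

Assuming `/bin/sh` exists, `runWithArgv("/bin/sh", {"HelloArgv0", "-c", "exit 7"})` launches the shell even though `argv[0]` is not a valid path, which is the situation `LaunchProcessSetsArgv0` exercises. Passing `argv[0]` as the path instead would fail to launch.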
[llvm-branch-commits] [libcxxabi] [release/18.x][backport][libc++abi] Use __has_feature check to enable usage of thread_local for exception storage (PR #132241)
llvmbot wrote:

@llvm/pr-subscribers-libcxxabi

Author: Bushev Dmitry (dybv-sc)

Changes

This is a backport of the original commit #97591 to the 18.x release.

---
Full diff: https://github.com/llvm/llvm-project/pull/132241.diff

1 Files Affected:

- (modified) libcxxabi/src/cxa_exception_storage.cpp (+1-1)

```diff
diff --git a/libcxxabi/src/cxa_exception_storage.cpp b/libcxxabi/src/cxa_exception_storage.cpp
index 3a3233a1b9272..83408c904e1f7 100644
--- a/libcxxabi/src/cxa_exception_storage.cpp
+++ b/libcxxabi/src/cxa_exception_storage.cpp
@@ -24,7 +24,7 @@ extern "C" {
 } // extern "C"
 } // namespace __cxxabiv1
 
-#elif defined(HAS_THREAD_LOCAL)
+#elif __has_feature(cxx_thread_local)
 
 namespace __cxxabiv1 {
 namespace {
```

https://github.com/llvm/llvm-project/pull/132241
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
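
A feature check of the kind the patch above switches to can be sketched in isolation. This is an illustration under stated assumptions, not the libc++abi source: the `DEMO_HAS_THREAD_LOCAL` macro, the `Counter` variable and the `bump` function are hypothetical names, and the fallback definition of `__has_feature` is added here so the snippet also compiles with compilers that do not provide that builtin.

```cpp
#include <cassert>

// Compilers without the __has_feature builtin get a fallback that reports
// every feature as unavailable.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// Hypothetical macro for this sketch: select thread-local storage only when
// the compiler reports support for it.
#if __has_feature(cxx_thread_local)
#define DEMO_HAS_THREAD_LOCAL 1
thread_local int Counter = 0; // one counter per thread
#else
#define DEMO_HAS_THREAD_LOCAL 0
int Counter = 0; // fallback: a single counter shared by all threads
#endif

// Increments the counter visible to the calling thread and returns it.
int bump() { return ++Counter; }
```

Unlike `defined(HAS_THREAD_LOCAL)`, which depends on the build system defining a macro, the `__has_feature` form asks the compiler directly.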
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)
atrosinenko wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/131899?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#131899** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/131899?utm_source=stack-comment-view-in-graphite)
* **#131898**
* **#131897**
* **#131896**
* **#131895**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [ctxprof][nfc] Make `computeImportForFunction` a member of `ModuleImportsManager` (PR #134011)
https://github.com/mtrofin ready_for_review https://github.com/llvm/llvm-project/pull/134011 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] llvm-reduce: Do not reduce alloca array sizes to 0 (PR #132864)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/132864

>From 8b7fcfc65d1615368805f5c3c5a459cc7e8c026a Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Tue, 25 Mar 2025 09:39:18 +0700
Subject: [PATCH] llvm-reduce: Do not reduce alloca array sizes to 0

Fixes #64340
---
 .../llvm-reduce/reduce-operands-alloca.ll | 69 +++
 .../llvm-reduce/deltas/ReduceOperands.cpp | 5 ++
 2 files changed, 74 insertions(+)
 create mode 100644 llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll

diff --git a/llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll b/llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll
new file mode 100644
index 0..61c46185b3378
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll
@@ -0,0 +1,69 @@
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=operands-zero --test FileCheck --test-arg --check-prefix=CHECK --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck %s --check-prefixes=CHECK,ZERO < %t
+
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=operands-one --test FileCheck --test-arg --check-prefix=CHECK --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck %s --check-prefixes=CHECK,ONE < %t
+
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=operands-poison --test FileCheck --test-arg --check-prefix=CHECK --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck %s --check-prefixes=CHECK,POISON < %t
+
+
+; CHECK-LABEL: @dyn_alloca(
+; ZERO: %alloca = alloca i32, i32 %size, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 %size, align 4
+define void @dyn_alloca(i32 %size) {
+  %alloca = alloca i32, i32 %size
+  store i32 0, ptr %alloca
+  ret void
+}
+
+; CHECK-LABEL: @alloca_0_elt(
+; ZERO: %alloca = alloca i32, i32 0, align 4
+; ONE: %alloca = alloca i32, i32 0, align 4
+; POISON: %alloca = alloca i32, i32 0, align 4
+define void @alloca_0_elt() {
+  %alloca = alloca i32, i32 0
+  store i32 0, ptr %alloca
+  ret void
+}
+
+; CHECK-LABEL: @alloca_1_elt(
+; ZERO: %alloca = alloca i32, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, align 4
+define void @alloca_1_elt() {
+  %alloca = alloca i32, i32 1
+  store i32 0, ptr %alloca
+  ret void
+}
+
+; CHECK-LABEL: @alloca_1024_elt(
+; ZERO: %alloca = alloca i32, i32 1024, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 1024, align 4
+define void @alloca_1024_elt() {
+  %alloca = alloca i32, i32 1024
+  store i32 0, ptr %alloca
+  ret void
+}
+
+; CHECK-LABEL: @alloca_poison_elt(
+; ZERO: %alloca = alloca i32, i32 poison, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 poison, align 4
+define void @alloca_poison_elt() {
+  %alloca = alloca i32, i32 poison
+  store i32 0, ptr %alloca
+  ret void
+}
+
+; CHECK-LABEL: @alloca_constexpr_elt(
+; ZERO: %alloca = alloca i32, i32 ptrtoint (ptr @alloca_constexpr_elt to i32)
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 ptrtoint (ptr @alloca_constexpr_elt to i32)
+define void @alloca_constexpr_elt() {
+  %alloca = alloca i32, i32 ptrtoint (ptr @alloca_constexpr_elt to i32)
+  store i32 0, ptr %alloca
+  ret void
+}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
index a4fdd9ce8033b..b0bca015434fa 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
@@ -125,6 +125,11 @@ void llvm::reduceOperandsZeroDeltaPass(Oracle &O, ReducerWorkItem &WorkItem) {
   auto ReduceValue = [](Use &Op) -> Value * {
     if (!shouldReduceOperand(Op))
       return nullptr;
+
+    // Avoid introducing 0-sized allocations.
+    if (isa<AllocaInst>(Op.getUser()))
+      return nullptr;
+
     // Don't duplicate an existing switch case.
     if (auto *IntTy = dyn_cast<IntegerType>(Op->getType()))
       if (switchCaseExists(Op, ConstantInt::get(IntTy, 0)))
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/20.x: [clang][docs] Move -Wnon-trivial-memcall to added flags. (PR #132367)
https://github.com/R-Goc edited https://github.com/llvm/llvm-project/pull/132367 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
https://github.com/OCHyams ready_for_review https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Add support to lower "FREEZE a half(f16)" instruction on Hexagon and fix the isel-buildvector-v2f16.ll assertion (#130977) (PR #132138)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/132138

Backport 9c65e6ac115a

Requested by: @androm3da

>From e09e2480046740628cc28055010aef1ab05e5aff Mon Sep 17 00:00:00 2001
From: Abinaya Saravanan
Date: Thu, 13 Mar 2025 03:28:26 +0530
Subject: [PATCH] [HEXAGON] Add support to lower "FREEZE a half(f16)" instruction on Hexagon and fix the isel-buildvector-v2f16.ll assertion (#130977)

(cherry picked from commit 9c65e6ac115a7d8566c874537791125c3ace7c1a)
---
 llvm/lib/Target/Hexagon/HexagonISelLowering.h | 1 +
 .../Target/Hexagon/HexagonISelLoweringHVX.cpp | 22 +-
 llvm/test/CodeGen/Hexagon/fp16-promote.ll | 44 +++
 3 files changed, 56 insertions(+), 11 deletions(-)
 create mode 100644 llvm/test/CodeGen/Hexagon/fp16-promote.ll

diff --git a/llvm/lib/Target/Hexagon/HexagonISelLowering.h b/llvm/lib/Target/Hexagon/HexagonISelLowering.h
index aaa9c65c1e07e..4df88b3a8abd7 100644
--- a/llvm/lib/Target/Hexagon/HexagonISelLowering.h
+++ b/llvm/lib/Target/Hexagon/HexagonISelLowering.h
@@ -362,6 +362,7 @@ class HexagonTargetLowering : public TargetLowering {
   shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const override {
     return AtomicExpansionKind::LLSC;
   }
+  bool softPromoteHalfType() const override { return true; }
 
 private:
   void initializeHVXLowering();
diff --git a/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp b/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
index 1a19e81a68f08..a7eb20a3e5ff9 100644
--- a/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
+++ b/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
@@ -1618,17 +1618,6 @@ HexagonTargetLowering::LowerHvxBuildVector(SDValue Op, SelectionDAG &DAG)
   for (unsigned i = 0; i != Size; ++i)
     Ops.push_back(Op.getOperand(i));
 
-  // First, split the BUILD_VECTOR for vector pairs. We could generate
-  // some pairs directly (via splat), but splats should be generated
-  // by the combiner prior to getting here.
-  if (VecTy.getSizeInBits() == 16*Subtarget.getVectorLength()) {
-    ArrayRef A(Ops);
-    MVT SingleTy = typeSplit(VecTy).first;
-    SDValue V0 = buildHvxVectorReg(A.take_front(Size/2), dl, SingleTy, DAG);
-    SDValue V1 = buildHvxVectorReg(A.drop_front(Size/2), dl, SingleTy, DAG);
-    return DAG.getNode(ISD::CONCAT_VECTORS, dl, VecTy, V0, V1);
-  }
-
   if (VecTy.getVectorElementType() == MVT::i1)
     return buildHvxVectorPred(Ops, dl, VecTy, DAG);
 
@@ -1645,6 +1634,17 @@ HexagonTargetLowering::LowerHvxBuildVector(SDValue Op, SelectionDAG &DAG)
     return DAG.getBitcast(tyVector(VecTy, MVT::f16), T0);
   }
 
+  // First, split the BUILD_VECTOR for vector pairs. We could generate
+  // some pairs directly (via splat), but splats should be generated
+  // by the combiner prior to getting here.
+  if (VecTy.getSizeInBits() == 16 * Subtarget.getVectorLength()) {
+    ArrayRef A(Ops);
+    MVT SingleTy = typeSplit(VecTy).first;
+    SDValue V0 = buildHvxVectorReg(A.take_front(Size / 2), dl, SingleTy, DAG);
+    SDValue V1 = buildHvxVectorReg(A.drop_front(Size / 2), dl, SingleTy, DAG);
+    return DAG.getNode(ISD::CONCAT_VECTORS, dl, VecTy, V0, V1);
+  }
+
   return buildHvxVectorReg(Ops, dl, VecTy, DAG);
 }
 
diff --git a/llvm/test/CodeGen/Hexagon/fp16-promote.ll b/llvm/test/CodeGen/Hexagon/fp16-promote.ll
new file mode 100644
index 0..1ef0a133ce30a
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/fp16-promote.ll
@@ -0,0 +1,44 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -march=hexagon < %s | FileCheck %s
+
+define half @freeze_half_undef() nounwind {
+; CHECK-LABEL: freeze_half_undef:
+; CHECK: // %bb.0:
+; CHECK-NEXT: {
+; CHECK-NEXT: call __truncsfhf2
+; CHECK-NEXT: r0 = #0
+; CHECK-NEXT: allocframe(#0)
+; CHECK-NEXT: }
+; CHECK-NEXT: {
+; CHECK-NEXT: call __extendhfsf2
+; CHECK-NEXT: }
+; CHECK-NEXT: {
+; CHECK-NEXT: call __truncsfhf2
+; CHECK-NEXT: r0 = sfadd(r0,r0)
+; CHECK-NEXT: }
+; CHECK-NEXT: {
+; CHECK-NEXT: r31:30 = dealloc_return(r30):raw
+; CHECK-NEXT: }
+  %y1 = freeze half undef
+  %t1 = fadd half %y1, %y1
+  ret half %t1
+}
+
+define half @freeze_half_poison(half %maybe.poison) {
+; CHECK-LABEL: freeze_half_poison:
+; CHECK: // %bb.0:
+; CHECK: {
+; CHECK-NEXT: call __extendhfsf2
+; CHECK-NEXT: allocframe(r29,#0):raw
+; CHECK-NEXT: }
+; CHECK-NEXT: {
+; CHECK-NEXT: call __truncsfhf2
+; CHECK-NEXT: r0 = sfadd(r0,r0)
+; CHECK-NEXT: }
+; CHECK-NEXT: {
+; CHECK-NEXT: r31:30 = dealloc_return(r30):raw
+; CHECK-NEXT: }
+  %y1 = freeze half %maybe.poison
+  %t1 = fadd half %y1, %y1
+  ret half %t1
+}
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Prevent SI_CS_CHAIN instruction from giving registers classes in generic instructions (PR #131329)
rovka wrote: Reopening this (not sure if I can change the target branch) https://github.com/llvm/llvm-project/pull/131329 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/131899

>From efd98b412431b0c597d3d7dcee0dd4255b8e2418 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko
Date: Tue, 18 Mar 2025 21:32:11 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect non-protected indirect calls

---
 bolt/include/bolt/Core/MCPlusBuilder.h | 10 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp | 33 +-
 .../Target/AArch64/AArch64MCPlusBuilder.cpp | 42 ++
 .../binary-analysis/AArch64/gs-pauth-calls.s | 676 ++
 4 files changed, 757 insertions(+), 4 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-calls.s

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 76ea2489e7038..b3d54ccd5955d 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -577,6 +577,16 @@ class MCPlusBuilder {
     return getNoRegister();
   }
 
+  /// Returns the register used as call destination, or no-register, if not
+  /// an indirect call. Sets IsAuthenticatedInternally if the instruction
+  /// accepts signed pointer as its operand and authenticates it internally.
+  virtual MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+                       bool &IsAuthenticatedInternally) const {
+    llvm_unreachable("not implemented");
+    return getNoRegister();
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index c81a586b02771..b8a0a80215ce2 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -382,11 +382,11 @@ class PacRetAnalysis
 
 public:
   std::vector
-  getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF,
-                         const ArrayRef UsedDirtyRegs) const {
+  getLastClobberingInsts(const MCInst &Inst, BinaryFunction &BF,
+                         const ArrayRef UsedDirtyRegs) {
     if (RegsToTrackInstsFor.empty())
       return {};
-    auto MaybeState = getStateAt(Ret);
+    auto MaybeState = getStateBefore(Inst);
     if (!MaybeState)
       llvm_unreachable("Expected State to be present");
     const State &S = *MaybeState;
@@ -434,6 +434,29 @@ static std::shared_ptr tryCheckReturn(const BinaryContext &BC,
   return std::make_shared(RetKind, Inst, RetReg);
 }
 
+static std::shared_ptr tryCheckCall(const BinaryContext &BC,
+                                    const MCInstReference &Inst,
+                                    const State &S) {
+  static const GadgetKind CallKind("non-protected call found");
+  if (!BC.MIB->isCall(Inst) && !BC.MIB->isBranch(Inst))
+    return nullptr;
+
+  bool IsAuthenticated = false;
+  MCPhysReg DestReg = BC.MIB->getRegUsedAsCallDest(Inst, IsAuthenticated);
+  if (IsAuthenticated || DestReg == BC.MIB->getNoRegister())
+    return nullptr;
+
+  LLVM_DEBUG({
+    traceInst(BC, "Found call inst", Inst);
+    traceReg(BC, "Call destination reg", DestReg);
+    traceRegMask(BC, "SafeToDerefRegs", S.SafeToDerefRegs);
+  });
+  if (S.SafeToDerefRegs[DestReg])
+    return nullptr;
+
+  return std::make_shared(CallKind, Inst, DestReg);
+}
+
 FunctionAnalysisResult
 Analysis::computeDfState(BinaryFunction &BF,
                          MCPlusBuilder::AllocatorIdTy AllocatorId) {
@@ -450,10 +473,12 @@ Analysis::computeDfState(BinaryFunction &BF,
   for (BinaryBasicBlock &BB : BF) {
     for (int64_t I = 0, E = BB.size(); I < E; ++I) {
       MCInstReference Inst(&BB, I);
-      const State &S = *PRA.getStateAt(Inst);
+      const State &S = *PRA.getStateBefore(Inst);
 
       if (auto Report = tryCheckReturn(BC, Inst, S))
         Result.Diagnostics.push_back(Report);
+      if (auto Report = tryCheckCall(BC, Inst, S))
+        Result.Diagnostics.push_back(Report);
     }
   }
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index d238a1df5c7d7..9ce1514639f95 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -277,6 +277,48 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     }
   }
 
+  MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+                       bool &IsAuthenticatedInternally) const override {
+    assert(isCall(Inst) || isBranch(Inst));
+    IsAuthenticatedInternally = false;
+
+    switch (Inst.getOpcode()) {
+    case AArch64::B:
+    case AArch64::BL:
+      assert(Inst.getOperand(0).isExpr());
+      return getNoRegister();
+    case AArch64::Bcc:
+    case AArch64::CBNZW:
+    case AArch64::CBNZX:
+    case AArch64::CBZW:
+    case AArch64::CBZX:
+      assert(Inst.getOperand(1).isExpr());
+      return getNoRegister();
+    case AArch64::TBNZW:
+    case AArch64::TBNZX:
+    case AArch64::TBZW:
+    case AArch64::TBZX:
+      assert(Ins
[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Move fix-tle-le-sym-type test to test/MC. NFC (#133839) (PR #134014)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/134014

Backport 46968310cb837e4b32859edef2107080b828b117

Requested by: @zhaoqi5

>From f198a45b6947a65774a03069d73e3f236ef94ae0 Mon Sep 17 00:00:00 2001
From: ZhaoQi
Date: Wed, 2 Apr 2025 09:11:20 +0800
Subject: [PATCH] [LoongArch] Move fix-tle-le-sym-type test to test/MC. NFC (#133839)

(cherry picked from commit 46968310cb837e4b32859edef2107080b828b117)
---
 .../CodeGen/LoongArch/fix-tle-le-sym-type.ll | 24 -
 .../Relocations/relocation-specifier.s | 26 +++
 2 files changed, 26 insertions(+), 24 deletions(-)
 delete mode 100644 llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll
 create mode 100644 llvm/test/MC/LoongArch/Relocations/relocation-specifier.s

diff --git a/llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll b/llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll
deleted file mode 100644
index d39454a51a445..0
--- a/llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll
+++ /dev/null
@@ -1,24 +0,0 @@
-; RUN: llc --mtriple=loongarch32 --filetype=obj %s -o %t-la32
-; RUN: llvm-readelf -s %t-la32 | FileCheck %s --check-prefix=LA32
-
-; RUN: llc --mtriple=loongarch64 --filetype=obj %s -o %t-la64
-; RUN: llvm-readelf -s %t-la64 | FileCheck %s --check-prefix=LA64
-
-; LA32: Symbol table '.symtab' contains [[#]] entries:
-; LA32-NEXT: Num: Value Size Type Bind Vis Ndx Name
-; LA32: 0 TLS GLOBAL DEFAULT UND tls_sym
-
-; LA64: Symbol table '.symtab' contains [[#]] entries:
-; LA64-NEXT: Num: Value Size Type Bind Vis Ndx Name
-; LA64: 0 TLS GLOBAL DEFAULT UND tls_sym
-
-@tls_sym = external thread_local(localexec) global i32
-
-define dso_local signext i32 @test_tlsle() nounwind {
-entry:
-  %0 = call ptr @llvm.threadlocal.address.p0(ptr @tls_sym)
-  %1 = load i32, ptr %0
-  ret i32 %1
-}
-
-declare nonnull ptr @llvm.threadlocal.address.p0(ptr nonnull)
diff --git a/llvm/test/MC/LoongArch/Relocations/relocation-specifier.s b/llvm/test/MC/LoongArch/Relocations/relocation-specifier.s
new file mode 100644
index 0..d0898aaab92fe
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/relocation-specifier.s
@@ -0,0 +1,26 @@
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 %s -o %t-la32
+# RUN: llvm-readelf -rs %t-la32 | FileCheck %s --check-prefixes=CHECK,RELOC32
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 %s -o %t-la64
+# RUN: llvm-readelf -rs %t-la64 | FileCheck %s --check-prefixes=CHECK,RELOC64
+
+## This test is similar to test/MC/CSKY/relocation-specifier.s.
+
+# RELOC32: '.rela.data'
+# RELOC32: R_LARCH_32 .data + 0
+
+# RELOC64: '.rela.data'
+# RELOC64: R_LARCH_32 .data + 0
+
+# CHECK: TLS GLOBAL DEFAULT UND gd
+# CHECK: TLS GLOBAL DEFAULT UND ld
+# CHECK: TLS GLOBAL DEFAULT UND ie
+# CHECK: TLS GLOBAL DEFAULT UND le
+
+pcalau12i $t1, %gd_pc_hi20(gd)
+pcalau12i $t1, %ld_pc_hi20(ld)
+pcalau12i $t1, %ie_pc_hi20(ie)
+lu12i.w $t1, %le_hi20_r(le)
+
+.data
+local:
+.long local
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: reformulate the state for data-flow analysis (PR #131898)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/131898

>From 1b82382369a66bee4345489ce8fc70abf04215a7 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko
Date: Mon, 17 Mar 2025 22:27:53 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: reformulate the state for data-flow analysis

In preparation for implementing support for detection of non-protected call instructions, refine the definition of state which is computed for each register by data-flow analysis.

Explicitly marking the registers which are known to be trusted at function entry is crucial for finding non-protected calls. In addition, it fixes less-common false negatives for pac-ret, such as `ret x1` in `f_nonx30_ret_non_auted` test case.
---
 bolt/include/bolt/Core/MCPlusBuilder.h | 10 ++
 bolt/include/bolt/Passes/PAuthGadgetScanner.h | 7 +-
 bolt/lib/Passes/PAuthGadgetScanner.cpp | 129 +++---
 .../Target/AArch64/AArch64MCPlusBuilder.cpp | 4 +
 .../AArch64/gs-pacret-autiasp.s | 19 ++-
 .../AArch64/gs-pacret-multi-bb.s | 3 +-
 6 files changed, 104 insertions(+), 68 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index b285138b77fe7..76ea2489e7038 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -551,6 +551,16 @@ class MCPlusBuilder {
     return Analysis->isReturn(Inst);
   }
 
+  /// Returns the registers that are trusted at function entry.
+  ///
+  /// Each register should be treated as if a successfully authenticated
+  /// pointer was written to it before entering the function (i.e. the
+  /// pointer is safe to jump to as well as to be signed).
+  virtual SmallVector getTrustedLiveInRegs() const {
+    llvm_unreachable("not implemented");
+    return {};
+  }
+
   virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
     return getNoRegister();
diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index f102f1080e2e8..404dde2901767 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -209,13 +209,12 @@ struct Report {
 
 struct GadgetReport : public Report {
   const GadgetKind &Kind;
-  SmallVector AffectedRegisters;
+  SmallVector AffectedRegisters;
   std::vector OverwritingInstrs;
 
   GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-               const BitVector &AffectedRegisters)
-      : Report(Location), Kind(Kind),
-        AffectedRegisters(AffectedRegisters.set_bits()) {}
+               MCPhysReg AffectedRegister)
+      : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {}
 
   void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 4f7be17327b49..c81a586b02771 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -126,18 +126,16 @@ class TrackedRegisters {
 
 // The security property that is checked is:
 // When a register is used as the address to jump to in a return instruction,
-// that register must either:
-// (a) never be changed within this function, i.e. have the same value as when
-//     the function started, or
+// that register must be safe-to-dereference. It must either
+// (a) be safe-to-dereference at function entry and never be changed within this
+//     function, i.e. have the same value as when the function started, or
 // (b) the last write to the register must be by an authentication instruction.
 
 // This property is checked by using dataflow analysis to keep track of which
-// registers have been written (def-ed), since last authenticated. Those are
-// exactly the registers containing values that should not be trusted (as they
-// could have changed since the last time they were authenticated). For pac-ret,
-// any return instruction using such a register is a gadget to be reported. For
-// PAuthABI, probably at least any indirect control flow using such a register
-// should be reported.
+// registers have been written (def-ed), since last authenticated. For pac-ret,
+// any return instruction using a register which is not safe-to-dereference is
+// a gadget to be reported. For PAuthABI, probably at least any indirect control
+// flow using such a register should be reported.
 
 // Furthermore, when producing a diagnostic for a found non-pac-ret protected
 // return, the analysis also lists the last instructions that wrote to the
@@ -156,10 +154,29 @@ class TrackedRegisters {
 //   in the gadgets to be reported. This information is used in the second run
 //   to also track which instructions last wrote to those registers.
 
+/// A state representing which registers are safe to use by an instruction
+/// at a given program p
[llvm-branch-commits] [clang] [HLSL][NFC] Use method builder to create default resource constructor (PR #131384)
https://github.com/hekota edited https://github.com/llvm/llvm-project/pull/131384 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] resugar decltype of DeclRefExpr (PR #132447)
https://github.com/mizvekov updated https://github.com/llvm/llvm-project/pull/132447 >From 0a363317b9ac02a6a6e2f70e805223bdb135fed3 Mon Sep 17 00:00:00 2001 From: Matheus Izvekov Date: Fri, 14 Mar 2025 19:41:38 -0300 Subject: [PATCH] [clang] resugar decltype of DeclRefExpr This keeps around the resugared DeclType for DeclRefExpr, which is otherwise partially lost as the expression type removes top level references. This helps 'decltype' resugaring work without any loss of information. --- clang/include/clang/AST/Expr.h| 32 +++-- clang/include/clang/AST/Stmt.h| 2 + clang/include/clang/Sema/Sema.h | 20 +++--- clang/lib/AST/ASTImporter.cpp | 3 +- clang/lib/AST/Expr.cpp| 82 +++ clang/lib/CodeGen/CGExpr.cpp | 4 +- clang/lib/Sema/SemaChecking.cpp | 3 +- clang/lib/Sema/SemaDeclCXX.cpp| 19 +++--- clang/lib/Sema/SemaExpr.cpp | 81 +++--- clang/lib/Sema/SemaOpenMP.cpp | 11 ++- clang/lib/Sema/SemaOverload.cpp | 25 --- clang/lib/Sema/SemaSYCL.cpp | 2 +- clang/lib/Sema/SemaTemplate.cpp | 13 clang/lib/Sema/SemaType.cpp | 9 ++- clang/lib/Sema/TreeTransform.h| 5 +- clang/lib/Serialization/ASTReaderStmt.cpp | 8 ++- clang/lib/Serialization/ASTWriterStmt.cpp | 6 +- clang/test/Sema/Resugar/resugar-expr.cpp | 6 +- 18 files changed, 201 insertions(+), 130 deletions(-) diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h index 2ba787ac6df55..e92f6696027f9 100644 --- a/clang/include/clang/AST/Expr.h +++ b/clang/include/clang/AST/Expr.h @@ -1266,7 +1266,7 @@ class DeclRefExpr final : public Expr, private llvm::TrailingObjects { +TemplateArgumentLoc, QualType> { friend class ASTStmtReader; friend class ASTStmtWriter; friend TrailingObjects; @@ -1292,17 +1292,27 @@ class DeclRefExpr final return hasTemplateKWAndArgsInfo(); } + size_t numTrailingObjects(OverloadToken) const { +return getNumTemplateArgs(); + } + + size_t numTrailingObjects(OverloadToken) const { +return HasResugaredDeclType(); + } + /// Test whether there is a distinct FoundDecl attached to the end of 
/// this DRE. bool hasFoundDecl() const { return DeclRefExprBits.HasFoundDecl; } + static bool needsDeclTypeStorage(ValueDecl *VD, QualType DeclType); + DeclRefExpr(const ASTContext &Ctx, NestedNameSpecifierLoc QualifierLoc, SourceLocation TemplateKWLoc, ValueDecl *D, bool RefersToEnclosingVariableOrCapture, const DeclarationNameInfo &NameInfo, NamedDecl *FoundD, const TemplateArgumentListInfo *TemplateArgs, const TemplateArgumentList *ConvertedArgs, QualType T, - ExprValueKind VK, NonOdrUseReason NOUR); + ExprValueKind VK, QualType DeclType, NonOdrUseReason NOUR); /// Construct an empty declaration reference expression. explicit DeclRefExpr(EmptyShell Empty) : Expr(DeclRefExprClass, Empty) {} @@ -1318,7 +1328,8 @@ class DeclRefExpr final Create(const ASTContext &Context, NestedNameSpecifierLoc QualifierLoc, SourceLocation TemplateKWLoc, ValueDecl *D, bool RefersToEnclosingVariableOrCapture, SourceLocation NameLoc, - QualType T, ExprValueKind VK, NamedDecl *FoundD = nullptr, + QualType T, ExprValueKind VK, QualType DeclType = QualType(), + NamedDecl *FoundD = nullptr, const TemplateArgumentListInfo *TemplateArgs = nullptr, const TemplateArgumentList *ConvertedArgs = nullptr, NonOdrUseReason NOUR = NOUR_None); @@ -1328,7 +1339,7 @@ class DeclRefExpr final SourceLocation TemplateKWLoc, ValueDecl *D, bool RefersToEnclosingVariableOrCapture, const DeclarationNameInfo &NameInfo, QualType T, ExprValueKind VK, - NamedDecl *FoundD = nullptr, + QualType DeclType = QualType(), NamedDecl *FoundD = nullptr, const TemplateArgumentListInfo *TemplateArgs = nullptr, const TemplateArgumentList *ConvertedArgs = nullptr, NonOdrUseReason NOUR = NOUR_None); @@ -1337,11 +1348,22 @@ class DeclRefExpr final static DeclRefExpr *CreateEmpty(const ASTContext &Context, bool HasQualifier, bool HasFoundDecl, bool HasTemplateKWAndArgsInfo, - unsigned NumTemplateArgs); + unsigned NumTemplateArgs, + bool HasResugaredDeclType); ValueDecl *getDecl() { return D; } const ValueDecl *getDecl() const { 
return D; } void setDecl(ValueDecl *NewD); + void recomputeDependency(); + + bool HasResugaredDeclType() const { +return DeclRefExprBits.HasResugaredDeclType; + } + QualType getDeclTyp
[llvm-branch-commits] [clang] [Driver] Change linker job in Baremetal toolchain object accomodate GCCInstallation.(2/3) (PR #121830)
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff e07a4cd4e0ff77f74b66695923bc998904c14746 f4af05b47bddc3a88309341d5ff79cc9178f78ec --extensions cpp,c,h -- clang/lib/Driver/ToolChains/BareMetal.cpp clang/lib/Driver/ToolChains/BareMetal.h clang/test/Driver/aarch64-toolchain-extra.c clang/test/Driver/aarch64-toolchain.c clang/test/Driver/arm-toolchain-extra.c clang/test/Driver/arm-toolchain.c clang/test/Driver/sanitizer-ld.c
```

View the diff from clang-format here.

```diff
diff --git a/clang/lib/Driver/ToolChains/BareMetal.h b/clang/lib/Driver/ToolChains/BareMetal.h
index b4e556df11..87f173342d 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.h
+++ b/clang/lib/Driver/ToolChains/BareMetal.h
@@ -36,7 +36,7 @@ protected:
   Tool *buildStaticLibTool() const override;
 
 public:
-  bool hasValidGCCInstallation() const {return GCCInstallation.isValid(); }
+  bool hasValidGCCInstallation() const { return GCCInstallation.isValid(); }
   bool isBareMetal() const override { return true; }
   bool isCrossCompiling() const override { return true; }
   bool HasNativeLLVMSupport() const override { return true; }
```

https://github.com/llvm/llvm-project/pull/121830
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and trunc (PR #132383)
https://github.com/petar-avramovic created https://github.com/llvm/llvm-project/pull/132383

Uniform S1:
Truncs to uniform S1 and AnyExts from S1 are left as is, as they are meant to be combined away. Uniform S1 ZExt and SExt are lowered using select.

Divergent S1:
Trunc of VGPR to VCC is lowered as a compare. Extends of VCC are lowered using select.

For remaining types:
S32 to S64 ZExt and SExt are lowered using merge values; AnyExt and Trunc are again left as is to be combined away. Notably, uniform S16 for SExt and ZExt is not lowered to S32 and is left as is for instruction select to deal with. This is because there are patterns that check for the S16 type.

>From 2ac46e4545ecbc07d16a827a326c092a70ddc50d Mon Sep 17 00:00:00 2001
From: Petar Avramovic
Date: Fri, 21 Mar 2025 12:41:39 +0100
Subject: [PATCH] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and trunc

Uniform S1:
Truncs to uniform S1 and AnyExts from S1 are left as is, as they are meant to be combined away. Uniform S1 ZExt and SExt are lowered using select.

Divergent S1:
Trunc of VGPR to VCC is lowered as a compare. Extends of VCC are lowered using select.

For remaining types:
S32 to S64 ZExt and SExt are lowered using merge values; AnyExt and Trunc are again left as is to be combined away. Notably, uniform S16 for SExt and ZExt is not lowered to S32 and is left as is for instruction select to deal with. This is because there are patterns that check for the S16 type.
--- .../Target/AMDGPU/AMDGPURegBankLegalize.cpp | 3 +- .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp| 51 +++-- .../AMDGPU/AMDGPURegBankLegalizeRules.cpp | 47 +++- .../AMDGPU/AMDGPURegBankLegalizeRules.h | 3 + .../GlobalISel/regbankselect-anyext.mir | 61 ++- .../AMDGPU/GlobalISel/regbankselect-sext.mir | 100 -- .../AMDGPU/GlobalISel/regbankselect-trunc.mir | 22 ++-- .../AMDGPU/GlobalISel/regbankselect-zext.mir | 89 ++-- 8 files changed, 262 insertions(+), 114 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp index d5a83903e2b13..44f1b5419abb9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp @@ -216,7 +216,8 @@ class AMDGPURegBankLegalizeCombiner { return; } -if (DstTy == S32 && TruncSrcTy == S16) { +if ((DstTy == S64 && TruncSrcTy == S32) || +(DstTy == S32 && TruncSrcTy == S16)) { B.buildAnyExt(Dst, TruncSrc); cleanUpAfterCombine(MI, Trunc); return; diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp index 5dbaa9488d668..7301cba9e8ed3 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp @@ -173,13 +173,23 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI, case Ext32To64: { const RegisterBank *RB = MRI.getRegBank(MI.getOperand(0).getReg()); MachineInstrBuilder Hi; - -if (MI.getOpcode() == AMDGPU::G_ZEXT) { +switch (MI.getOpcode()) { +case AMDGPU::G_ZEXT: { Hi = B.buildConstant({RB, S32}, 0); -} else { + break; +} +case AMDGPU::G_SEXT: { // Replicate sign bit from 32-bit extended part. 
auto ShiftAmt = B.buildConstant({RB, S32}, 31); Hi = B.buildAShr({RB, S32}, MI.getOperand(1).getReg(), ShiftAmt); + break; +} +case AMDGPU::G_ANYEXT: { + Hi = B.buildUndef({RB, S32}); + break; +} +default: + llvm_unreachable("Unsuported Opcode in Ext32To64"); } B.buildMergeLikeInstr(MI.getOperand(0).getReg(), @@ -202,7 +212,7 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI, // compares all bits in register. Register BoolSrc = MRI.createVirtualRegister({VgprRB, Ty}); if (Ty == S64) { - auto Src64 = B.buildUnmerge({VgprRB, Ty}, Src); + auto Src64 = B.buildUnmerge(VgprRB_S32, Src); auto One = B.buildConstant(VgprRB_S32, 1); auto AndLo = B.buildAnd(VgprRB_S32, Src64.getReg(0), One); auto Zero = B.buildConstant(VgprRB_S32, 0); @@ -396,8 +406,11 @@ LLT RegBankLegalizeHelper::getTyFromID(RegBankLLTMappingApplyID ID) { case Sgpr32AExt: case Sgpr32AExtBoolInReg: case Sgpr32SExt: + case Sgpr32ZExt: case UniInVgprS32: case Vgpr32: + case Vgpr32SExt: + case Vgpr32ZExt: return LLT::scalar(32); case Sgpr64: case Vgpr64: @@ -508,6 +521,7 @@ RegBankLegalizeHelper::getRegBankFromID(RegBankLLTMappingApplyID ID) { case Sgpr32AExt: case Sgpr32AExtBoolInReg: case Sgpr32SExt: + case Sgpr32ZExt: return SgprRB; case Vgpr16: case Vgpr32: @@ -524,6 +538,8 @@ RegBankLegalizeHelper::getRegBankFromID(RegBankLLTMappingApplyID ID) { case VgprB128: case VgprB256: case VgprB512: + case Vgpr32SExt: + case Vgpr32ZExt: return VgprRB; default: return nullptr; @@ -72
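As an illustration of the `Ext32To64` lowering in the hunk above, here is a minimal standalone sketch of the arithmetic it performs: choose a 32-bit high half according to the extension kind, then merge it with the unchanged low half. The names `ExtKind` and `ext32To64` are hypothetical and are not part of the patch or of any LLVM API. The any-extend case arbitrarily uses zero where the patch builds an undef value.

```cpp
#include <cstdint>

// Hypothetical stand-in for the three opcodes the switch in the patch handles
// (G_ZEXT, G_SEXT, G_ANYEXT).
enum class ExtKind { ZExt, SExt, AnyExt };

// Extend a 32-bit value to 64 bits by picking a high half and merging it with
// the unchanged low half.
constexpr uint64_t ext32To64(uint32_t Lo, ExtKind Kind) {
  uint32_t Hi = 0;
  switch (Kind) {
  case ExtKind::ZExt:
    // Zero-extension: the high half is the constant 0.
    Hi = 0;
    break;
  case ExtKind::SExt:
    // Sign-extension: replicate bit 31 into every bit of the high half, which
    // is the result an arithmetic shift right by 31 would give.
    Hi = (Lo >> 31) ? 0xFFFFFFFFu : 0u;
    break;
  case ExtKind::AnyExt:
    // Any-extension: the high half is unspecified. This sketch picks 0; the
    // patch uses an undef value instead.
    Hi = 0;
    break;
  }
  // Merge the two halves: Hi occupies bits 63..32, Lo occupies bits 31..0.
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}
```

For a source value with the top bit set, such as `0x80000000`, the zero-extended and sign-extended results differ only in the high half.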
[llvm-branch-commits] [clang-tools-extra] [clang-doc][NFC] Remove unnecessary directory cleanup (PR #132101)
ilovepi wrote:

### Merge activity

* **Mar 20, 5:02 PM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/132101).

https://github.com/llvm/llvm-project/pull/132101
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] llvm-reduce: Change function return types if function is not called (PR #134035)
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
```

View the diff from clang-format here.

```diff
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp b/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
index b4df3e6dd..72cfa8305 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
@@ -96,7 +96,6 @@ static void rewriteFuncWithReturnType(Function &OldF, Value *NewRetValue) {
   // result of our pruning here.
   EliminateUnreachableBlocks(OldF);
 
-
   // Drop the incompatible attributes before we copy over to the new function.
   if (OldRetTy != NewRetTy) {
     AttributeList AL = OldF.getAttributes();
```

https://github.com/llvm/llvm-project/pull/134035
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: reformulate the state for data-flow analysis (PR #131898)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/131898 >From 1b82382369a66bee4345489ce8fc70abf04215a7 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 17 Mar 2025 22:27:53 +0300 Subject: [PATCH 1/2] [BOLT] Gadget scanner: reformulate the state for data-flow analysis In preparation for implementing support for detection of non-protected call instructions, refine the definition of state which is computed for each register by data-flow analysis. Explicitly marking the registers which are known to be trusted at function entry is crucial for finding non-protected calls. In addition, it fixes less-common false negatives for pac-ret, such as `ret x1` in `f_nonx30_ret_non_auted` test case. --- bolt/include/bolt/Core/MCPlusBuilder.h| 10 ++ bolt/include/bolt/Passes/PAuthGadgetScanner.h | 7 +- bolt/lib/Passes/PAuthGadgetScanner.cpp| 129 +++--- .../Target/AArch64/AArch64MCPlusBuilder.cpp | 4 + .../AArch64/gs-pacret-autiasp.s | 19 ++- .../AArch64/gs-pacret-multi-bb.s | 3 +- 6 files changed, 104 insertions(+), 68 deletions(-) diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index b285138b77fe7..76ea2489e7038 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -551,6 +551,16 @@ class MCPlusBuilder { return Analysis->isReturn(Inst); } + /// Returns the registers that are trusted at function entry. + /// + /// Each register should be treated as if a successfully authenticated + /// pointer was written to it before entering the function (i.e. the + /// pointer is safe to jump to as well as to be signed). 
+ virtual SmallVector getTrustedLiveInRegs() const { +llvm_unreachable("not implemented"); +return {}; + } + virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const { llvm_unreachable("not implemented"); return getNoRegister(); diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index f102f1080e2e8..404dde2901767 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -209,13 +209,12 @@ struct Report { struct GadgetReport : public Report { const GadgetKind &Kind; - SmallVector AffectedRegisters; + SmallVector AffectedRegisters; std::vector OverwritingInstrs; GadgetReport(const GadgetKind &Kind, MCInstReference Location, - const BitVector &AffectedRegisters) - : Report(Location), Kind(Kind), -AffectedRegisters(AffectedRegisters.set_bits()) {} + MCPhysReg AffectedRegister) + : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {} void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 4f7be17327b49..c81a586b02771 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -126,18 +126,16 @@ class TrackedRegisters { // The security property that is checked is: // When a register is used as the address to jump to in a return instruction, -// that register must either: -// (a) never be changed within this function, i.e. have the same value as when -// the function started, or +// that register must be safe-to-dereference. It must either +// (a) be safe-to-dereference at function entry and never be changed within this +// function, i.e. have the same value as when the function started, or // (b) the last write to the register must be by an authentication instruction. 
// This property is checked by using dataflow analysis to keep track of which -// registers have been written (def-ed), since last authenticated. Those are -// exactly the registers containing values that should not be trusted (as they -// could have changed since the last time they were authenticated). For pac-ret, -// any return instruction using such a register is a gadget to be reported. For -// PAuthABI, probably at least any indirect control flow using such a register -// should be reported. +// registers have been written (def-ed), since last authenticated. For pac-ret, +// any return instruction using a register which is not safe-to-dereference is +// a gadget to be reported. For PAuthABI, probably at least any indirect control +// flow using such a register should be reported. // Furthermore, when producing a diagnostic for a found non-pac-ret protected // return, the analysis also lists the last instructions that wrote to the @@ -156,10 +154,29 @@ class TrackedRegisters { //in the gadgets to be reported. This information is used in the second run //to also track which instructions last wrote to those registers. +/// A state representing which registers are safe to use by an instruction +/// at a given program p
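The security property described in the comment above can be sketched as a toy data-flow state. This is a simplified illustration, not the pass's actual implementation. The names `State`, `Inst`, `entryState`, `transfer`, `merge` and `isSafeReturn`, the register count, and the choice of intersection at control-flow joins are all assumptions made for the example.

```cpp
#include <bitset>
#include <cstddef>
#include <initializer_list>

// Hypothetical register count, chosen only for this sketch.
constexpr std::size_t NumRegs = 32;

// Hypothetical state: bit R is set when register R is safe to dereference.
struct State {
  std::bitset<NumRegs> SafeToDerefRegs;
};

// Hypothetical, heavily simplified instruction model.
struct Inst {
  std::bitset<NumRegs> Written; // registers overwritten by the instruction
  int AuthenticatedReg = -1;    // register it authenticates, or -1 for none
};

// Helper: an instruction that overwrites register R without authenticating it.
Inst writeTo(unsigned R) {
  Inst I;
  I.Written.set(R);
  return I;
}

// Helper: an instruction that authenticates the pointer held in register R.
Inst authenticate(unsigned R) {
  Inst I;
  I.AuthenticatedReg = static_cast<int>(R);
  return I;
}

// At function entry only the trusted live-in registers are safe (case (a)).
State entryState(std::initializer_list<unsigned> TrustedLiveIns) {
  State S;
  for (unsigned R : TrustedLiveIns)
    S.SafeToDerefRegs.set(R);
  return S;
}

// Transfer function: any write makes a register unsafe, and authentication
// makes the authenticated register safe again (case (b)).
State transfer(State S, const Inst &I) {
  S.SafeToDerefRegs &= ~I.Written;
  if (I.AuthenticatedReg >= 0)
    S.SafeToDerefRegs.set(static_cast<std::size_t>(I.AuthenticatedReg));
  return S;
}

// At a control-flow join a register is assumed safe only if it is safe on
// every incoming path (an assumption of this sketch).
State merge(const State &A, const State &B) {
  return State{A.SafeToDerefRegs & B.SafeToDerefRegs};
}

// A return through RetReg is considered protected only if RetReg is safe.
bool isSafeReturn(const State &S, unsigned RetReg) {
  return S.SafeToDerefRegs.test(RetReg);
}
```

With register 30 as the only trusted live-in (an assumption standing in for the AArch64 link register), a return through register 1 is flagged even if register 1 was never written, which corresponds to the `ret x1` case mentioned in the commit message.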
[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)
https://github.com/kasuga-fj created https://github.com/llvm/llvm-project/pull/133667

In the profitability check for vectorization, the dependency matrix was not handled correctly. This could lead to a wrong decision: the check might report "this loop can be vectorized" when in fact it cannot. The root cause is that the check returns early when it finds '=' or 'I' in the dependency matrix. To be sure that the loop can actually be vectorized, all rows of the matrix must be checked. This patch fixes the check so that it no longer makes a wrong decision for a loop that cannot be vectorized.

Related: #131130

>From 2db59e8629d3640ec070eb906ac55a5e970176d1 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga
Date: Thu, 27 Mar 2025 09:52:16 +
Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop

In the profitability check for vectorization, the dependency matrix was not handled correctly. This could lead to a wrong decision: the check might report "this loop can be vectorized" when in fact it cannot. The root cause is that the check returns early when it finds '=' or 'I' in the dependency matrix. To be sure that the loop can actually be vectorized, all rows of the matrix must be checked. This patch fixes the check so that it no longer makes a wrong decision for a loop that cannot be vectorized.
Related: #131130 --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 41 +++ .../profitability-vectorization-heuristic.ll | 9 ++-- 2 files changed, 27 insertions(+), 23 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index e777f950a7c5a..b6b0b7d7a947a 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -1197,25 +1197,32 @@ LoopInterchangeProfitability::isProfitablePerInstrOrderCost() { return std::nullopt; } +/// Return true if we can vectorize the loop specified by \p LoopId. +static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) { + for (unsigned I = 0; I != DepMatrix.size(); I++) { +char Dir = DepMatrix[I][LoopId]; +if (Dir != 'I' && Dir != '=') + return false; + } + return true; +} + std::optional LoopInterchangeProfitability::isProfitableForVectorization( unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) { - for (auto &Row : DepMatrix) { -// If the inner loop is loop independent or doesn't carry any dependency -// it is not profitable to move this to outer position, since we are -// likely able to do inner loop vectorization already. -if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=') - return std::optional(false); - -// If the outer loop is not loop independent it is not profitable to move -// this to inner position, since doing so would not enable inner loop -// parallelism. -if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=') - return std::optional(false); - } - // If inner loop has dependence and outer loop is loop independent then it - // is/ profitable to interchange to enable inner loop parallelism. - // If there are no dependences, interchanging will not improve anything. 
- return std::optional(!DepMatrix.empty()); + // If the outer loop is not loop independent it is not profitable to move + // this to inner position, since doing so would not enable inner loop + // parallelism. + if (!canVectorize(DepMatrix, OuterLoopId)) +return false; + + // If inner loop has dependence and outer loop is loop independent then it is + // profitable to interchange to enable inner loop parallelism. + if (!canVectorize(DepMatrix, InnerLoopId)) +return true; + + // TODO: Estimate the cost of vectorized loop body when both the outer and the + // inner loop can be vectorized. + return std::nullopt; } bool LoopInterchangeProfitability::isProfitable( diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll index 606117e70db86..b82dd5141a6b2 100644 --- a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll +++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll @@ -15,16 +15,13 @@ ; } ; } ; -; FIXME: These loops are not exchanged at this time due to the problem of -; profitablity heuristic for vectorization. -; CHECK: --- !Missed +; CHECK: --- !Passed ; CHECK-NEXT: Pass:loop-interchange -; CHECK-NEXT: Name:InterchangeNotProfitable +; CHECK-NEXT: Name:Interchanged ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization ; CHECK-NEXT: Args: -; CHECK-NEXT: - String: Interchanging loops is not considered to improve cache locality nor vectori
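To make the behavioural change in this patch concrete, the following standalone sketch places the old row-by-row logic next to the new per-loop check. It models `CharMatrix` as a plain `std::vector<std::vector<char>>` and uses free functions, so the type alias and the names `oldIsProfitable` and `newIsProfitable` are assumptions for illustration and not the LLVM declarations.

```cpp
#include <optional>
#include <vector>

// Stand-in for the dependency matrix: one row per dependence, one column per
// loop level.
using CharMatrix = std::vector<std::vector<char>>;

// New helper from the patch: the loop is treated as vectorizable only if every
// row has 'I' or '=' in that loop's column.
bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
  for (const auto &Row : DepMatrix) {
    char Dir = Row[LoopId];
    if (Dir != 'I' && Dir != '=')
      return false;
  }
  return true;
}

// Old logic: returns as soon as a single row has 'I' or '=' for the inner
// loop, without looking at the remaining rows.
std::optional<bool> oldIsProfitable(unsigned InnerLoopId, unsigned OuterLoopId,
                                    const CharMatrix &DepMatrix) {
  for (const auto &Row : DepMatrix) {
    if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=')
      return false;
    if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=')
      return false;
  }
  return !DepMatrix.empty();
}

// New logic: each loop is judged over all rows of the matrix.
std::optional<bool> newIsProfitable(unsigned InnerLoopId, unsigned OuterLoopId,
                                    const CharMatrix &DepMatrix) {
  // Moving a non-vectorizable outer loop inward would not help.
  if (!canVectorize(DepMatrix, OuterLoopId))
    return false;
  // The outer loop is vectorizable and the inner one is not: interchange.
  if (!canVectorize(DepMatrix, InnerLoopId))
    return true;
  // Both loops are vectorizable: no decision from this heuristic.
  return std::nullopt;
}
```

Take a matrix with the rows `{'=', '='}` and `{'=', '<'}`, where column 0 is the outer loop and column 1 the inner loop. The old logic stops at the first row and reports "not profitable". The new logic sees that the inner loop carries a dependence in the second row while the outer loop does not, and reports "profitable".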
[llvm-branch-commits] [llvm] llvm-reduce: Fix losing call metadata in operands-to-args (PR #133422)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/133422 >From 1c18bf5fe4ccec532eaef3677e40e976dd2d460c Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 28 Mar 2025 18:01:39 +0700 Subject: [PATCH] llvm-reduce: Fix using call metadata in operands-to-args --- .../tools/llvm-reduce/operands-to-args-preserve-fmf.ll | 7 +-- llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp | 2 ++ 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll index b4b19ca28dbb5..fc31a08353b8f 100644 --- a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll +++ b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll @@ -12,9 +12,12 @@ define float @callee(float %a) { ; INTERESTING: load float ; REDUCED-LABEL: define float @caller(ptr %ptr, float %val, float %callee.ret1) { -; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.00e+00) +; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.00e+00), !fpmath !0 define float @caller(ptr %ptr) { %val = load float, ptr %ptr - %callee.ret = call nnan nsz float @callee(float %val) + %callee.ret = call nnan nsz float @callee(float %val), !fpmath !0 ret float %callee.ret } + +; REDUCED: !0 = !{float 2.00e+00} +!0 = !{float 2.0} diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp index e7ad52eb65a5d..33f6463be6581 100644 --- a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp +++ b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp @@ -111,6 +111,8 @@ static void replaceFunctionCalls(Function *OldF, Function *NewF) { if (auto *FPOp = dyn_cast(NewCI)) NewCI->setFastMathFlags(CI->getFastMathFlags()); +NewCI->copyMetadata(*CI); + // Do the replacement for this use. 
if (!CI->use_empty()) CI->replaceAllUsesWith(NewCI); ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang][OpenMP] Map simple `do concurrent` loops to OpenMP host constructs (PR #127633)
https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/127633 >From f6a61dc9d383f19fa1cf38173829f2a732a4d544 Mon Sep 17 00:00:00 2001 From: ergawy Date: Tue, 18 Feb 2025 02:50:46 -0600 Subject: [PATCH 1/3] [flang][OpenMP] Map simple `do concurrent` loops to OpenMP host constructs Upstreams one more part of the ROCm `do concurrent` to OpenMP mapping pass. This PR adds support for converting simple loops to the equivalent OpenMP constructs on the host: `omp parallel do`. Towards that end, we have to collect more information about loop nests, for which we add new utils in the `looputils` namespace. --- flang/docs/DoConcurrentConversionToOpenMP.md | 47 .../OpenMP/DoConcurrentConversion.cpp | 211 +- .../Transforms/DoConcurrent/basic_host.f90| 14 +- .../Transforms/DoConcurrent/basic_host.mlir | 62 + .../DoConcurrent/non_const_bounds.f90 | 45 .../DoConcurrent/not_perfectly_nested.f90 | 45 6 files changed, 405 insertions(+), 19 deletions(-) create mode 100644 flang/test/Transforms/DoConcurrent/basic_host.mlir create mode 100644 flang/test/Transforms/DoConcurrent/non_const_bounds.f90 create mode 100644 flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 diff --git a/flang/docs/DoConcurrentConversionToOpenMP.md b/flang/docs/DoConcurrentConversionToOpenMP.md index 7b49af742f242..19611615ee9d6 100644 --- a/flang/docs/DoConcurrentConversionToOpenMP.md +++ b/flang/docs/DoConcurrentConversionToOpenMP.md @@ -126,6 +126,53 @@ see the "Data environment" section below. See `flang/test/Transforms/DoConcurrent/loop_nest_test.f90` for more examples of what is and is not detected as a perfect loop nest. +### Single-range loops + +Given the following loop: +```fortran + do concurrent(i=1:n) +a(i) = i * i + end do +``` + + Mapping to `host` + +Mapping this loop to the `host`, generates MLIR operations of the following +structure: + +``` +%4 = fir.address_of(@_QFEa) ... +%6:2 = hlfir.declare %4 ... + +omp.parallel { + // Allocate private copy for `i`. 
+ // TODO Use delayed privatization. + %19 = fir.alloca i32 {bindc_name = "i"} + %20:2 = hlfir.declare %19 {uniq_name = "_QFEi"} ... + + omp.wsloop { +omp.loop_nest (%arg0) : index = (%21) to (%22) inclusive step (%c1_2) { + %23 = fir.convert %arg0 : (index) -> i32 + // Use the privatized version of `i`. + fir.store %23 to %20#1 : !fir.ref + ... + + // Use "shared" SSA value of `a`. + %42 = hlfir.designate %6#0 + hlfir.assign %35 to %42 + ... + omp.yield +} +omp.terminator + } + omp.terminator +} +``` + + Mapping to `device` + + +
[llvm-branch-commits] [llvm] [SDAG] Introduce inbounds flag for pointer arithmetic (PR #131862)
ritter-x2a wrote:

> Maybe we could consider adding "ISD::PTRADD"? Lowers to ISD::ADD by default,
> but targets that want to do weird things with pointer arithmetic could do
> them.

That would be helpful. We'd still need an inbounds flag for ISD::PTRADD, but it would certainly be easier to make use of. I'll look into that.

> One other concern, which applies to basically any formulation of this: Since
> SelectionDAG doesn't have a distinct pointer type, you can't tell whether the
> pointer operand was produced by an inttoptr. So in some cases, you have an
> operation marked "inbounds", but it's ambiguous which object it's actually
> inbounds to. This isn't really a problem at the moment because we do IR-level
> transforms that remove inttoptr anyway, but if we ever do resolve the
> IR-level issues, we should have some idea for how we propagate the fix to
> SelectionDAG.

I can see that's a problem if you'd want to infer that an operation is inbounds, or if you'd want to prove the absence (or presence) of poison/UB. But how is that a problem for generating code? If there is an inbounds flag on a (hypothetical) ISD::PTRADD, we can assume that the operation is inbounds with respect to whatever the address operand is pointing to, no matter if it's the result of integer operations, right?

https://github.com/llvm/llvm-project/pull/131862
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 7f18a2f - Revert "[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VP…"
Author: Simon Pilgrim Date: 2025-04-03T16:00:07+01:00 New Revision: 7f18a2fa9567050a245f3992963752a74cdff884 URL: https://github.com/llvm/llvm-project/commit/7f18a2fa9567050a245f3992963752a74cdff884 DIFF: https://github.com/llvm/llvm-project/commit/7f18a2fa9567050a245f3992963752a74cdff884.diff LOG: Revert "[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VP…" This reverts commit bf516098fb7c7d428cae03296b92766467f76c9e. Added: Modified: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast_from_memory.ll llvm/test/CodeGen/X86/shuffle-vs-trunc-128.ll llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-5.ll llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-5.ll llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-6.ll llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll llvm/test/CodeGen/X86/zero_extend_vector_inreg_of_broadcast.ll llvm/test/CodeGen/X86/zero_extend_vector_inreg_of_broadcast_from_memory.ll Removed: diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index d1be19539b642..546a2d22fa58e 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -43827,69 +43827,6 @@ bool X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode( } break; } -case X86ISD::VPERMV: { - SmallVector Mask; - SmallVector Ops; - if ((VT.is256BitVector() || Subtarget.hasVLX()) && - getTargetShuffleMask(Op, /*AllowSentinelZero=*/false, Ops, Mask)) { -// For lane-crossing shuffles, only split in half in case we're still -// referencing higher elements. 
-unsigned HalfElts = NumElts / 2; -unsigned HalfSize = SizeInBits / 2; -Mask.resize(HalfElts); -if (all_of(Mask, - [&](int M) { return isUndefOrInRange(M, 0, HalfElts); })) { - MVT HalfVT = VT.getSimpleVT().getHalfNumVectorElementsVT(); - SDLoc DL(Op); - SDValue Ext; - SDValue M = - extractSubVector(Op.getOperand(0), 0, TLO.DAG, DL, HalfSize); - SDValue V = - extractSubVector(Op.getOperand(1), 0, TLO.DAG, DL, HalfSize); - // For 128-bit v2X64/v4X32 instructions, use VPERMILPD/VPERMILPS. - if (VT.is512BitVector() || VT.getScalarSizeInBits() <= 16) -Ext = TLO.DAG.getNode(Opc, DL, HalfVT, M, V); - else -Ext = TLO.DAG.getNode(X86ISD::VPERMILPV, DL, HalfVT, V, M); - SDValue Insert = widenSubVector(Ext, /*ZeroNewElements=*/false, - Subtarget, TLO.DAG, DL, SizeInBits); - return TLO.CombineTo(Op, Insert); -} - } - break; -} -case X86ISD::VPERMV3: { - SmallVector Mask; - SmallVector Ops; - if (Subtarget.hasVLX() && - getTargetShuffleMask(Op, /*AllowSentinelZero=*/false, Ops, Mask)) { -// For lane-crossing shuffles, only split in half in case we're still -// referencing higher elements. -unsigned HalfElts = NumElts / 2; -unsigned HalfSize = SizeInBits / 2; -Mask.resize(HalfElts); -if (all_of(Mask, [&](int M) { - return isUndefOrInRange(M, 0, HalfElts) || - isUndefOrInRange(M, NumElts, NumElts + HalfElts); -})) { - // Adjust mask elements for 2nd operand to point to half width. - for (int &M : Mask) -M = M <= NumElts ? 
M : (M - HalfElts); - MVT HalfVT = VT.getSimpleVT().getHalfNumVectorElementsVT(); - MVT HalfIntVT = HalfVT.changeVectorElementTypeToInteger(); - SDLoc DL(Op); - SDValue Ext = TLO.DAG.getNode( - Opc, DL, HalfVT, - extractSubVector(Op.getOperand(0), 0, TLO.DAG, DL, HalfSize), - getConstVector(Mask, HalfIntVT, TLO.DAG, DL, /*IsMask=*/true), - extractSubVector(Op.getOperand(2), 0, TLO.DAG, DL, HalfSize)); - SDValue Insert = widenSubVector(Ext, /*ZeroNewElements=*/false, - Subtarget, TLO.DAG, DL, SizeInBits); - return TLO.CombineTo(Op, Insert); -} - } - break; -} case X86ISD::VPERM2X128: { // Simplify VPERM2F128/VPERM2I128 to extract_subvector. SDLoc DL(Op); diff --git a/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll b/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll index b075d48627b18..6f4e7abda8b00 100644 --- a/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll +++ b/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll @@ -749,10 +749,10 @@ define void @vec128_i16_widen_to_i32_factor2_broadcast_to_v4i
[llvm-branch-commits] [llvm] [CodeGen][StaticDataSplitter]Support constant pool partitioning (PR #129781)
@@ -2769,6 +2769,23 @@ namespace { } // end anonymous namespace +StringRef AsmPrinter::getConstantSectionSuffix(const Constant *C) const { + SmallString<8> SectionNameSuffix; + if (TM.Options.EnableStaticDataPartitioning) { +if (C && SDPI && PSI) { + auto Count = SDPI->getConstantProfileCount(C); + if (Count) { +if (PSI->isHotCount(*Count)) { + SectionNameSuffix.append("hot"); +} else if (PSI->isColdCount(*Count) && !SDPI->hasUnknownCount(C)) { + SectionNameSuffix.append("unlikely"); +} + } +} + } + return SectionNameSuffix.str(); mingmingl-llvm wrote: thanks for the catch! done. https://github.com/llvm/llvm-project/pull/129781 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
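For readers following the review thread, the suffix selection quoted above can be sketched outside of LLVM. This is only a minimal sketch under stated assumptions: `sectionSuffix`, `HotThreshold` and `ColdThreshold` are hypothetical stand-ins for the `PSI->isHotCount` / `PSI->isColdCount` queries, and `hasUnknownCount` is modeled as a plain flag. None of these names are part of the AsmPrinter API.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical thresholds standing in for the ProfileSummaryInfo queries.
constexpr uint64_t HotThreshold = 1000;
constexpr uint64_t ColdThreshold = 10;

// Returns "hot", "unlikely", or an empty suffix, following the same
// decision order as the quoted hunk: no partitioning or no profile count
// means no suffix; hot wins; cold only counts when the count is known.
std::string sectionSuffix(bool PartitioningEnabled,
                          std::optional<uint64_t> Count,
                          bool HasUnknownCount) {
  if (!PartitioningEnabled || !Count)
    return "";
  if (*Count >= HotThreshold)
    return "hot";
  if (*Count <= ColdThreshold && !HasUnknownCount)
    return "unlikely";
  return "";
}
```

The sketch returns `std::string` by value, so no view into a function-local buffer escapes the function.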
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: analyze functions without CFG information (PR #133461)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/133461 >From 43c034a7ccd057eb4e1c29daaa5f3ff882ae685a Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 19 Mar 2025 18:58:32 +0300 Subject: [PATCH 1/4] [BOLT] Gadget scanner: analyze functions without CFG information Support simple analysis of the functions for which BOLT is unable to reconstruct the CFG. This patch is inspired by the approach implemented by Kristof Beyls in the original prototype of gadget scanner, but a CFG-unaware counterpart of the data-flow analysis is implemented instead of separate version of gadget detector, as multiple gadget kinds are detected now. --- bolt/include/bolt/Core/BinaryFunction.h | 13 + bolt/include/bolt/Passes/PAuthGadgetScanner.h | 24 + bolt/lib/Passes/PAuthGadgetScanner.cpp| 266 +--- .../AArch64/gs-pacret-autiasp.s | 15 + .../binary-analysis/AArch64/gs-pauth-calls.s | 594 ++ 5 files changed, 835 insertions(+), 77 deletions(-) diff --git a/bolt/include/bolt/Core/BinaryFunction.h b/bolt/include/bolt/Core/BinaryFunction.h index d3d11f8c5fb73..5cb2cc95af695 100644 --- a/bolt/include/bolt/Core/BinaryFunction.h +++ b/bolt/include/bolt/Core/BinaryFunction.h @@ -799,6 +799,19 @@ class BinaryFunction { return iterator_range(cie_begin(), cie_end()); } + /// Iterate over instructions (only if CFG is unavailable or not built yet). + iterator_range instrs() { +assert(!hasCFG() && "Iterate over basic blocks instead"); +return make_range(Instructions.begin(), Instructions.end()); + } + iterator_range instrs() const { +assert(!hasCFG() && "Iterate over basic blocks instead"); +return make_range(Instructions.begin(), Instructions.end()); + } + + /// Returns whether there are any labels at Offset. + bool hasLabelAt(unsigned Offset) const { return Labels.count(Offset) != 0; } + /// Iterate over all jump tables associated with this function. 
iterator_range::const_iterator> jumpTables() const { diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index 622e6721dea55..aa44f8c565639 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -67,6 +67,14 @@ struct MCInstInBFReference { uint64_t Offset; MCInstInBFReference(BinaryFunction *BF, uint64_t Offset) : BF(BF), Offset(Offset) {} + + static MCInstInBFReference get(const MCInst *Inst, BinaryFunction &BF) { +for (auto &I : BF.instrs()) + if (Inst == &I.second) +return MCInstInBFReference(&BF, I.first); +return {}; + } + MCInstInBFReference() : BF(nullptr), Offset(0) {} bool operator==(const MCInstInBFReference &RHS) const { return BF == RHS.BF && Offset == RHS.Offset; @@ -106,6 +114,12 @@ struct MCInstReference { MCInstReference(BinaryFunction *BF, uint32_t Offset) : MCInstReference(MCInstInBFReference(BF, Offset)) {} + static MCInstReference get(const MCInst *Inst, BinaryFunction &BF) { +if (BF.hasCFG()) + return MCInstInBBReference::get(Inst, BF); +return MCInstInBFReference::get(Inst, BF); + } + bool operator<(const MCInstReference &RHS) const { if (ParentKind != RHS.ParentKind) return ParentKind < RHS.ParentKind; @@ -140,6 +154,16 @@ struct MCInstReference { llvm_unreachable(""); } + operator bool() const { +switch (ParentKind) { +case BasicBlockParent: + return U.BBRef.BB != nullptr; +case FunctionParent: + return U.BFRef.BF != nullptr; +} +llvm_unreachable(""); + } + uint64_t getAddress() const { switch (ParentKind) { case BasicBlockParent: diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index df9e87bd4e999..f5d224675d749 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -124,6 +124,27 @@ class TrackedRegisters { } }; +// Without CFG, we reset gadget scanning state when encountering an +// unconditional branch. 
Note that BC.MIB->isUnconditionalBranch neither +// considers indirect branches nor annotated tail calls as unconditional. +static bool isStateTrackingBoundary(const BinaryContext &BC, +const MCInst &Inst) { + const MCInstrDesc &Desc = BC.MII->get(Inst.getOpcode()); + // Adapted from llvm::MCInstrDesc::isUnconditionalBranch(). + return Desc.isBranch() && Desc.isBarrier(); +} + +template static void iterateOverInstrs(BinaryFunction &BF, T Fn) { + if (BF.hasCFG()) { +for (BinaryBasicBlock &BB : BF) + for (int64_t I = 0, E = BB.size(); I < E; ++I) +Fn(MCInstInBBReference(&BB, I)); + } else { +for (auto I : BF.instrs()) + Fn(MCInstInBFReference(&BF, I.first)); + } +} + // The security property that is checked is: // When a register is used as the address to jump to in a return instruction, // that register mus
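The CFG-unaware mode described in this patch can be illustrated with a toy linear scan. The following is a minimal sketch, not BOLT code: `ToyInst` and `trustedBefore` are hypothetical, registers are modeled as bit positions below 32, and "resetting state" is assumed to mean falling back to "nothing is trusted". Only the shape of the `isStateTrackingBoundary` predicate (branch and barrier) is taken from the diff.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified instruction record (not BOLT's MCInst).
struct ToyInst {
  bool IsBranch = false;  // transfers control
  bool IsBarrier = false; // control never falls through
  int MakesTrusted = -1;  // register index that becomes trusted, or -1
  int Clobbers = -1;      // register index that gets overwritten, or -1
};

// Same shape as the predicate in the patch: an unconditional branch is a
// branch that is also a barrier.
bool isStateTrackingBoundary(const ToyInst &I) {
  return I.IsBranch && I.IsBarrier;
}

// Walks the instructions in address order and returns, for each one, the
// bitmask of trusted registers that holds before it executes.
std::vector<uint32_t> trustedBefore(const std::vector<ToyInst> &Insts) {
  std::vector<uint32_t> Result;
  uint32_t Trusted = 0; // conservative start: nothing is trusted
  for (const ToyInst &I : Insts) {
    Result.push_back(Trusted);
    if (I.Clobbers >= 0)
      Trusted &= ~(1u << I.Clobbers);
    if (I.MakesTrusted >= 0)
      Trusted |= 1u << I.MakesTrusted;
    // Whatever follows an unconditional branch is reached from elsewhere,
    // so nothing learned so far may be assumed there.
    if (isStateTrackingBoundary(I))
      Trusted = 0;
  }
  return Result;
}
```

A conditional branch (branch without barrier) leaves the tracked state alone in this sketch, because execution may fall through to the next instruction.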
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts and sext-inreg (PR #132385)
https://github.com/petar-avramovic created https://github.com/llvm/llvm-project/pull/132385 Uniform S16 shifts have to be extended to S32 using appropriate Extend before lowering to S32 instruction. Uniform packed V2S16 are lowered to SGPR S32 instructions, other option is to use VALU packed V2S16 and ReadAnyLane. For uniform S32 and S64 and divergent S16, S32, S64 and V2S16 there are instructions available. >From 9a0eaa14fddc00648a09f2880cc16207dfa4e1de Mon Sep 17 00:00:00 2001 From: Petar Avramovic Date: Fri, 21 Mar 2025 13:12:11 +0100 Subject: [PATCH] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts and sext-inreg Uniform S16 shifts have to be extended to S32 using appropriate Extend before lowering to S32 instruction. Uniform packed V2S16 are lowered to SGPR S32 instructions, other option is to use VALU packed V2S16 and ReadAnyLane. For uniform S32 and S64 and divergent S16, S32, S64 and V2S16 there are instructions available. --- .../Target/AMDGPU/AMDGPURegBankLegalize.cpp | 3 +- .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp| 104 ++ .../AMDGPU/AMDGPURegBankLegalizeHelper.h | 5 + .../AMDGPU/AMDGPURegBankLegalizeRules.cpp | 45 - .../AMDGPU/AMDGPURegBankLegalizeRules.h | 11 ++ llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll | 10 +- llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll | 187 +- .../AMDGPU/GlobalISel/regbankselect-ashr.mir | 6 +- .../AMDGPU/GlobalISel/regbankselect-lshr.mir | 17 +- .../GlobalISel/regbankselect-sext-inreg.mir | 24 +-- .../AMDGPU/GlobalISel/regbankselect-shl.mir | 6 +- .../CodeGen/AMDGPU/GlobalISel/sext_inreg.ll | 34 ++-- llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll| 10 +- 13 files changed, 311 insertions(+), 151 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp index 44f1b5419abb9..4fd776bec9492 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp @@ -23,6 +23,7 @@ #include "GCNSubtarget.h" #include 
"llvm/CodeGen/GlobalISel/CSEInfo.h" #include "llvm/CodeGen/GlobalISel/CSEMIRBuilder.h" +#include "llvm/CodeGen/GlobalISel/Utils.h" #include "llvm/CodeGen/MachineFunctionPass.h" #include "llvm/CodeGen/MachineUniformityAnalysis.h" #include "llvm/CodeGen/TargetPassConfig.h" @@ -306,7 +307,7 @@ bool AMDGPURegBankLegalize::runOnMachineFunction(MachineFunction &MF) { // Opcodes that support pretty much all combinations of reg banks and LLTs // (except S1). There is no point in writing rules for them. if (Opc == AMDGPU::G_BUILD_VECTOR || Opc == AMDGPU::G_UNMERGE_VALUES || -Opc == AMDGPU::G_MERGE_VALUES) { +Opc == AMDGPU::G_MERGE_VALUES || Opc == G_BITCAST) { RBLHelper.applyMappingTrivial(*MI); continue; } diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp index 0f5f3545ac8eb..59f16315bbd72 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp @@ -14,13 +14,16 @@ #include "AMDGPURegBankLegalizeHelper.h" #include "AMDGPUGlobalISelUtils.h" #include "AMDGPUInstrInfo.h" +#include "AMDGPURegBankLegalizeRules.h" #include "AMDGPURegisterBankInfo.h" #include "GCNSubtarget.h" #include "MCTargetDesc/AMDGPUMCTargetDesc.h" #include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h" +#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h" #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h" #include "llvm/CodeGen/MachineUniformityAnalysis.h" #include "llvm/IR/IntrinsicsAMDGPU.h" +#include "llvm/Support/ErrorHandling.h" #define DEBUG_TYPE "amdgpu-regbanklegalize" @@ -130,6 +133,28 @@ void RegBankLegalizeHelper::widenLoad(MachineInstr &MI, LLT WideTy, MI.eraseFromParent(); } +std::pair RegBankLegalizeHelper::unpackZExt(Register Reg) { + auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg); + auto Mask = B.buildConstant(SgprRB_S32, 0x); + auto Lo = B.buildAnd(SgprRB_S32, PackedS32, Mask); + auto Hi = B.buildLShr(SgprRB_S32, PackedS32, 
B.buildConstant(SgprRB_S32, 16)); + return {Lo.getReg(0), Hi.getReg(0)}; +} + +std::pair RegBankLegalizeHelper::unpackSExt(Register Reg) { + auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg); + auto Lo = B.buildSExtInReg(SgprRB_S32, PackedS32, 16); + auto Hi = B.buildAShr(SgprRB_S32, PackedS32, B.buildConstant(SgprRB_S32, 16)); + return {Lo.getReg(0), Hi.getReg(0)}; +} + +std::pair RegBankLegalizeHelper::unpackAExt(Register Reg) { + auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg); + auto Lo = PackedS32; + auto Hi = B.buildLShr(SgprRB_S32, PackedS32, B.buildConstant(SgprRB_S32, 16)); + return {Lo.getReg(0), Hi.getReg(0)}; +} + void RegBankLegalizeHelper::lower(MachineInstr &MI, const RegBankLLTMapping &Mapping, SmallSet &WaterfallSgp
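The three unpack helpers in this patch can be illustrated with plain 32-bit integers. This is a minimal sketch outside of GlobalISel, assuming a packed V2S16 value is simply a `uint32_t` with the first element in the low half. The mask in `unpackZExt` is just what a low-half zero-extension needs (the constant in the quoted diff is not legible here), and `lshrV2S16` is a hypothetical illustration of how a uniform packed shift could be expressed with scalar 32-bit operations.

```cpp
#include <cstdint>
#include <utility>

// Zero-extend both 16-bit halves to 32 bits.
std::pair<uint32_t, uint32_t> unpackZExt(uint32_t Packed) {
  return {Packed & 0xFFFFu, Packed >> 16};
}

// Sign-extend one 16-bit value; xor/subtract avoids implementation-defined
// narrowing conversions.
static int32_t signExtend16(uint32_t Half) {
  return static_cast<int32_t>((Half & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

// Sign-extend both 16-bit halves to 32 bits.
std::pair<int32_t, int32_t> unpackSExt(uint32_t Packed) {
  return {signExtend16(Packed), signExtend16(Packed >> 16)};
}

// Any-extend: the upper 16 bits of the low element are left as they are.
std::pair<uint32_t, uint32_t> unpackAExt(uint32_t Packed) {
  return {Packed, Packed >> 16};
}

// Hypothetical use: a per-element logical shift right on packed halves,
// done with 32-bit scalar operations. Both shift amounts must be below 16.
uint32_t lshrV2S16(uint32_t Val, uint32_t Amt) {
  auto [ValLo, ValHi] = unpackZExt(Val);
  auto [AmtLo, AmtHi] = unpackZExt(Amt);
  uint32_t Lo = ValLo >> AmtLo;
  uint32_t Hi = ValHi >> AmtHi;
  return (Lo & 0xFFFFu) | (Hi << 16);
}
```

Zero-extension matters for the logical right shift because stale upper bits would otherwise be shifted into the low element; a left shift could use the any-extend form, since the repacking mask discards the upper bits anyway. The structured bindings require C++17.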
[llvm-branch-commits] [llvm] InlineFunction: Split inlining into predicate and apply functions (PR #134213)
llvmbot wrote: @llvm/pr-subscribers-llvm-transforms Author: Matt Arsenault (arsenm) Changes This is to support a new inline function reduction in llvm-reduce, which should pre-filter callsites that are not eligible for inlining. This code was mostly structured as a match and apply, with a few exceptions. The ugliest piece is for propagating and verifying compatible getGC and personalities. Also, the EHPad and the convergence token to use are now collected and cached in InlineFunctionInfo. I was initially confused by the split between the checks performed here and isInlineViable, so this change better documents how this system is supposed to work. It turns out this split does make sense, in that isInlineViable checks if inlining is possible based on the callee content, while the ultimate inline depends on the callsite context. I think more renames of these functions would help, and isInlineViable should probably move out of InlineCost to be with these transforms. --- Full diff: https://github.com/llvm/llvm-project/pull/134213.diff 3 Files Affected: - (modified) llvm/include/llvm/Analysis/InlineCost.h (+5-1) - (modified) llvm/include/llvm/Transforms/Utils/Cloning.h (+28) - (modified) llvm/lib/Transforms/Utils/InlineFunction.cpp (+81-42) ``diff diff --git a/llvm/include/llvm/Analysis/InlineCost.h b/llvm/include/llvm/Analysis/InlineCost.h index 90ee75773957a..ec59d54954e16 100644 --- a/llvm/include/llvm/Analysis/InlineCost.h +++ b/llvm/include/llvm/Analysis/InlineCost.h @@ -334,7 +334,11 @@ std::optional getInliningCostFeatures( ProfileSummaryInfo *PSI = nullptr, OptimizationRemarkEmitter *ORE = nullptr); -/// Minimal filter to detect invalid constructs for inlining. +/// Check if it is mechanically possible to inline the function \p Callee, based +/// on the contents of the function. +/// +/// See also \p CanInlineCallSite as an additional precondition necessary to +/// perform a valid inline in a particular use context. 
InlineResult isInlineViable(Function &Callee); // This pass is used to annotate instructions during the inline process for diff --git a/llvm/include/llvm/Transforms/Utils/Cloning.h b/llvm/include/llvm/Transforms/Utils/Cloning.h index ec1a1d5faa7e9..201e6ba2b491f 100644 --- a/llvm/include/llvm/Transforms/Utils/Cloning.h +++ b/llvm/include/llvm/Transforms/Utils/Cloning.h @@ -263,6 +263,9 @@ class InlineFunctionInfo { /// `InlinedCalls` above is used. SmallVector InlinedCallSites; + Value *ConvergenceControlToken = nullptr; + Instruction *CallSiteEHPad = nullptr; + /// Update profile for callee as well as cloned version. We need to do this /// for regular inlining, but not for inlining from sample profile loader. bool UpdateProfile; @@ -271,9 +274,34 @@ class InlineFunctionInfo { StaticAllocas.clear(); InlinedCalls.clear(); InlinedCallSites.clear(); +ConvergenceControlToken = nullptr; +CallSiteEHPad = nullptr; } }; +/// Check if it is legal to perform inlining of the function called by \p CB +/// into the caller at this particular use, and sets fields in \p IFI. +/// +/// This does not consider whether it is possible for the function callee itself +/// to be inlined; for that see isInlineViable. +InlineResult CanInlineCallSite(const CallBase &CB, InlineFunctionInfo &IFI); + +/// This should generally not be used, use InlineFunction instead. +/// +/// Perform mechanical inlining of \p CB into the caller. +/// +/// This does not perform any legality or profitability checks for the +/// inlining. This assumes that CanInlineCallSite was already called, populated +/// \p IFI, and returned InlineResult::success. +/// +/// Also assumes that isInlineViable returned InlineResult::success for the +/// called function. 
+void InlineFunctionImpl(CallBase &CB, InlineFunctionInfo &IFI, +bool MergeAttributes = false, +AAResults *CalleeAAR = nullptr, +bool InsertLifetime = true, +Function *ForwardVarArgsTo = nullptr); + /// This function inlines the called function into the basic /// block of the caller. This returns false if it is not possible to inline /// this call. The program is still in a well defined state if this occurs diff --git a/llvm/lib/Transforms/Utils/InlineFunction.cpp b/llvm/lib/Transforms/Utils/InlineFunction.cpp index 131fbe654c11c..7236cc0131eb9 100644 --- a/llvm/lib/Transforms/Utils/InlineFunction.cpp +++ b/llvm/lib/Transforms/Utils/InlineFunction.cpp @@ -2446,19 +2446,8 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI, return Ret; } -/// This function inlines the called function into the basic block of the -/// caller. This returns false if it is not possible to inline this call. -/// The program is still in a well defined state if this occurs though. -/// -/// Note that this only does one level of inlining. For example, if the -/// instruction 'call B' is inlined, and 'B' calls 'C',
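The predicate/apply split described in this PR can be sketched with a toy model. This is a minimal illustration under stated assumptions, not LLVM code: `ToyCallSite`, `ToyInlineInfo` and `ToyResult` are hypothetical types, and the only legality rule modeled is that caller and callee must not name different GC strategies, with the caller adopting the callee's strategy when it has none.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct ToyResult {
  bool Ok;
  std::string Reason;
};

// Hypothetical call site: just the GC names of caller and callee.
struct ToyCallSite {
  std::string CallerGC;
  std::string CalleeGC;
  bool Inlined = false;
};

// Facts computed by the predicate and consumed by the apply step.
struct ToyInlineInfo {
  std::string GCToSet; // GC name the caller should adopt, if any
};

// Predicate: checks legality and caches what the apply step needs.
// It never modifies the call site.
ToyResult canInlineCallSite(const ToyCallSite &CS, ToyInlineInfo &Info) {
  Info = ToyInlineInfo{};
  if (!CS.CalleeGC.empty() && !CS.CallerGC.empty() &&
      CS.CalleeGC != CS.CallerGC)
    return {false, "incompatible GC"};
  if (CS.CallerGC.empty())
    Info.GCToSet = CS.CalleeGC;
  return {true, ""};
}

// Apply: performs the mutation without re-checking legality. It assumes
// the predicate succeeded and filled in Info.
void inlineImpl(ToyCallSite &CS, const ToyInlineInfo &Info) {
  if (!Info.GCToSet.empty())
    CS.CallerGC = Info.GCToSet;
  CS.Inlined = true;
}

// The combined entry point keeps the old "check, then transform" behavior.
ToyResult inlineCallSite(ToyCallSite &CS) {
  ToyInlineInfo Info;
  ToyResult R = canInlineCallSite(CS, Info);
  if (R.Ok)
    inlineImpl(CS, Info);
  return R;
}

// A reducer-style pre-filter can count candidates without mutating anything.
std::size_t countEligible(const std::vector<ToyCallSite> &Sites) {
  std::size_t N = 0;
  for (const ToyCallSite &CS : Sites) {
    ToyInlineInfo Scratch;
    if (canInlineCallSite(CS, Scratch).Ok)
      ++N;
  }
  return N;
}
```

Because the predicate takes the call site by const reference, a tool can enumerate all opportunities up front and only later decide which ones to apply.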
[llvm-branch-commits] [llvm] llvm-reduce: Change function return types if function is not called (PR #134035)
llvmbot wrote: @llvm/pr-subscribers-llvm-ir Author: Matt Arsenault (arsenm) Changes Extend the early return on value reduction to mutate the function return type if the function has no call uses. This could be generalized to rewrite cases where all callsites are used, but it turns out that complicates the visitation order given we try to compute all opportunities up front. This is enough to cleanup the common case where we end up with one function with a return of an uninteresting constant. --- Full diff: https://github.com/llvm/llvm-project/pull/134035.diff 2 Files Affected: - (added) llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll (+95) - (modified) llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp (+4-3) ``diff diff --git a/llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll b/llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll new file mode 100644 index 0..9ddbbe3def44f --- /dev/null +++ b/llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll @@ -0,0 +1,95 @@ +; Test that llvm-reduce can move intermediate values by inserting +; early returns when the function already has a different return type +; +; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=instructions-to-return --test FileCheck --test-arg --check-prefix=INTERESTING --test-arg %s --test-arg --input-file %s -o %t +; RUN: FileCheck --check-prefix=RESULT %s < %t + + +@gv = global i32 0, align 4 +@ptr_array = global [2 x ptr] [ptr @inst_to_return_has_different_type_but_no_func_call_use, + ptr @multiple_callsites_wrong_return_type] + +; Should rewrite this return from i64 to i32 since the function has no +; uses. 
+; INTERESTING-LABEL: @inst_to_return_has_different_type_but_no_func_call_use( +; RESULT-LABEL: define i32 @inst_to_return_has_different_type_but_no_func_call_use(ptr %arg) { +; RESULT-NEXT: %load = load i32, ptr %arg, align 4 +; RESULT-NEXT: ret i32 %load +define i64 @inst_to_return_has_different_type_but_no_func_call_use(ptr %arg) { + %load = load i32, ptr %arg + store i32 %load, ptr @gv + ret i64 0 +} + +; INTERESTING-LABEL: @callsite_different_type_unused_0( +; RESULT-LABEL: define i64 @inst_to_return_has_different_type_but_call_result_unused( +; RESULT-NEXT: %load = load i32, ptr %arg +; RESULT-NEXT: store i32 %load, ptr @gv +; RESULT-NEXT: ret i64 0 +define void @callsite_different_type_unused_0(ptr %arg) { + %unused0 = call i64 @inst_to_return_has_different_type_but_call_result_unused(ptr %arg) + %unused1 = call i64 @inst_to_return_has_different_type_but_call_result_unused(ptr null) + ret void +} + +; TODO: Could rewrite this return from i64 to i32 since the callsite is unused. +; INTERESTING-LABEL: @inst_to_return_has_different_type_but_call_result_unused( +; RESULT-LABEL: define i64 @inst_to_return_has_different_type_but_call_result_unused( +; RESULT: ret i64 0 +define i64 @inst_to_return_has_different_type_but_call_result_unused(ptr %arg) { + %load = load i32, ptr %arg + store i32 %load, ptr @gv + ret i64 0 +} + +; INTERESTING-LABEL: @multiple_callsites_wrong_return_type( +; RESULT-LABEL: define i64 @multiple_callsites_wrong_return_type( +; RESULT: ret i64 0 +define i64 @multiple_callsites_wrong_return_type(ptr %arg) { + %load = load i32, ptr %arg + store i32 %load, ptr @gv + ret i64 0 +} + +; INTERESTING-LABEL: @unused_with_wrong_return_types( +; RESULT-LABEL: define i64 @unused_with_wrong_return_types( +; RESULT-NEXT: %unused0 = call i64 @multiple_callsites_wrong_return_type(ptr %arg) +; RESULT-NEXT: ret i64 %unused0 +define void @unused_with_wrong_return_types(ptr %arg) { + %unused0 = call i64 @multiple_callsites_wrong_return_type(ptr %arg) + %unused1 
= call i32 @multiple_callsites_wrong_return_type(ptr %arg) + %unused2 = call ptr @multiple_callsites_wrong_return_type(ptr %arg) + ret void +} + +; INTERESTING-LABEL: @multiple_returns_wrong_return_type( +; INTERESTING: %load0 = load i32, + +; RESULT-LABEL: define i32 @multiple_returns_wrong_return_type( +; RESULT: ret i32 +; RESULT: ret i32 +; RESULT: ret i32 +define i32 @multiple_returns_wrong_return_type(ptr %arg, i1 %cond, i32 %arg2) { +entry: + br i1 %cond, label %bb0, label %bb1 + +bb0: + %load0 = load i32, ptr %arg + store i32 %load0, ptr @gv + ret i32 234 + +bb1: + ret i32 %arg2 + +bb2: + ret i32 34 +} + +; INTERESTING-LABEL: @call_multiple_returns_wrong_return_type( +; RESULT-LABEL: define <2 x i32> @call_multiple_returns_wrong_return_type( +; RESULT-NEXT: %unused = call <2 x i32> @multiple_returns_wrong_return_type( +; RESULT-NEXT: ret <2 x i32> %unused +define void @call_multiple_returns_wrong_return_type(ptr %arg, i1 %cond, i32 %arg2) { + %unused = call <2 x i32> @multiple_returns_wrong_return_type(ptr %arg, i1 %cond, i32 %arg2) + ret void +} diff --git a/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp b/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp ind
[llvm-branch-commits] [flang] [flang][OpenMP] Handle "loop-local values" in `do concurrent` nests (PR #127635)
https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/127635 From 6321731e6e1cf412ed002571b9140d56ac5b76c6 Mon Sep 17 00:00:00 2001 From: ergawy Date: Tue, 18 Feb 2025 06:40:19 -0600 Subject: [PATCH] [flang][OpenMP] Handle "loop-local values" in `do concurrent` nests Extends `do concurrent` mapping to handle "loop-local values". A loop-local value is one that is used exclusively inside the loop but allocated outside of it. This usually corresponds to temporary values that are used inside the loop body, for example for initializing other variables. After collecting these values, the pass localizes them to the loop nest by moving their allocations. --- flang/docs/DoConcurrentConversionToOpenMP.md | 51 ++ .../OpenMP/DoConcurrentConversion.cpp | 68 ++- .../DoConcurrent/locally_destroyed_temp.f90 | 62 + 3 files changed, 180 insertions(+), 1 deletion(-) create mode 100644 flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 diff --git a/flang/docs/DoConcurrentConversionToOpenMP.md b/flang/docs/DoConcurrentConversionToOpenMP.md index ecb4428d7d3ba..76c54f5bbf587 100644 --- a/flang/docs/DoConcurrentConversionToOpenMP.md +++ b/flang/docs/DoConcurrentConversionToOpenMP.md @@ -202,6 +202,57 @@ variables: `i` and `j`. These are locally allocated inside the parallel/target OpenMP region similar to what the single-range example in previous section shows. +### Data environment + +By default, variables that are used inside a `do concurrent` loop nest are +either treated as `shared` in case of mapping to `host`, or mapped into the +`target` region using a `map` clause in case of mapping to `device`. The only +exceptions to this are: + 1. the loop's iteration variable(s) (IV) of **perfect** loop nests. In that + case, for each IV, we allocate a local copy as shown by the mapping + examples above. + 1. any values that are from allocations outside the loop nest and used + exclusively inside of it. 
In such cases, a local privatized + copy is created in the OpenMP region to prevent multiple teams of threads + from accessing and destroying the same memory block, which causes runtime + issues. For an example of such cases, see + `flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90`. + +Implicit mapping detection (for mapping to the target device) is still quite +limited and work to make it smarter is underway for both OpenMP in general +and `do concurrent` mapping. + + Non-perfectly-nested loops' IVs + +For non-perfectly-nested loops, the IVs are still treated as `shared` or +`map` entries as pointed out above. This **might not** be consistent with what +the Fortran specification tells us. In particular, taking the following +snippets from the spec (version 2023) into account: + +> § 3.35 +> -- +> construct entity +> entity whose identifier has the scope of a construct + +> § 19.4 +> -- +> A variable that appears as an index-name in a FORALL or DO CONCURRENT +> construct [...] is a construct entity. A variable that has LOCAL or +> LOCAL_INIT locality in a DO CONCURRENT construct is a construct entity. +> [...] +> The name of a variable that appears as an index-name in a DO CONCURRENT +> construct, FORALL statement, or FORALL construct has a scope of the statement +> or construct. A variable that has LOCAL or LOCAL_INIT locality in a DO +> CONCURRENT construct has the scope of that construct. + +From the above quotes, it seems there is an equivalence between the IV of a `do +concurrent` loop and a variable with a `LOCAL` locality specifier (equivalent +to OpenMP's `private` clause). Which means that we should probably +localize/privatize a `do concurrent` loop's IV even if it is not perfectly +nested in the nest we are parallelizing. For now, however, we **do not** do +that as pointed out previously. In the near future, we propose a middle-ground +solution (see the Next steps section for more details). +
[llvm-branch-commits] [compiler-rt] [compiler-rt][Darwin][x86] Fix instrprof-darwin-exports test (#131425) (PR #132500)
https://github.com/j-hui created https://github.com/llvm/llvm-project/pull/132500 ld64 issues a warning about section alignment which was counted as an unexpected exported symbol and the test failed. Fixed by disabling all linker warnings using -Wl,-w. cherry-picked from commit 94426df66a8d7c2321f9e197e5ef9636b0d5ce70 From b03be06b732890f7e9fb445d9d71aec33408ea90 Mon Sep 17 00:00:00 2001 From: David Tellenbach Date: Mon, 17 Mar 2025 17:23:58 -0700 Subject: [PATCH] [compiler-rt][Darwin][x86] Fix instrprof-darwin-exports test (#131425) ld64 issues a warning about section alignment which was counted as an unexpected exported symbol and the test failed. Fixed by disabling all linker warnings using -Wl,-w. --- compiler-rt/test/profile/instrprof-darwin-exports.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/compiler-rt/test/profile/instrprof-darwin-exports.c b/compiler-rt/test/profile/instrprof-darwin-exports.c index 079d5d28ed24d..1a2ac8c813272 100644 --- a/compiler-rt/test/profile/instrprof-darwin-exports.c +++ b/compiler-rt/test/profile/instrprof-darwin-exports.c @@ -7,13 +7,13 @@ // just "_main" produces no warnings or errors. // // RUN: echo "_main" > %t.exports -// RUN: %clang_pgogen -Werror -Wl,-exported_symbols_list,%t.exports -o %t %s 2>&1 | tee %t.log -// RUN: %clang_profgen -Werror -fcoverage-mapping -Wl,-exported_symbols_list,%t.exports -o %t %s 2>&1 | tee -a %t.log +// RUN: %clang_pgogen -Werror -Wl,-exported_symbols_list,%t.exports -Wl,-w -o %t %s 2>&1 | tee %t.log +// RUN: %clang_profgen -Werror -fcoverage-mapping -Wl,-exported_symbols_list,%t.exports -Wl,-w -o %t %s 2>&1 | tee -a %t.log // RUN: cat %t.log | count 0 // 2) Ditto (1), but for GCOV. 
// -// RUN: %clang -Werror -Wl,-exported_symbols_list,%t.exports --coverage -o %t.gcov %s | tee -a %t.gcov.log +// RUN: %clang -Werror -Wl,-exported_symbols_list,%t.exports -Wl,-w --coverage -o %t.gcov %s | tee -a %t.gcov.log // RUN: cat %t.gcov.log | count 0 // 3) The default set of weak external symbols should match the set of symbols ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] llvm-reduce: Fix losing call metadata in operands-to-args (PR #133422)
https://github.com/shiltian approved this pull request. https://github.com/llvm/llvm-project/pull/133422
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/131899 >From 317a2d79f2b810be89f11fcf7afaa6f92c245e61 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 18 Mar 2025 21:32:11 +0300 Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect non-protected indirect calls --- bolt/include/bolt/Core/MCPlusBuilder.h| 10 + bolt/lib/Passes/PAuthGadgetScanner.cpp| 33 +- .../Target/AArch64/AArch64MCPlusBuilder.cpp | 42 ++ .../binary-analysis/AArch64/gs-pauth-calls.s | 676 ++ 4 files changed, 757 insertions(+), 4 deletions(-) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-calls.s diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 76ea2489e7038..b3d54ccd5955d 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -577,6 +577,16 @@ class MCPlusBuilder { return getNoRegister(); } + /// Returns the register used as call destination, or no-register, if not + /// an indirect call. Sets IsAuthenticatedInternally if the instruction + /// accepts signed pointer as its operand and authenticates it internally. 
+ virtual MCPhysReg + getRegUsedAsCallDest(const MCInst &Inst, + bool &IsAuthenticatedInternally) const { +llvm_unreachable("not implemented"); +return getNoRegister(); + } + virtual bool isTerminator(const MCInst &Inst) const; virtual bool isNoop(const MCInst &Inst) const { diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 93a452b224233..5b3bfb487d33b 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -382,11 +382,11 @@ class PacRetAnalysis public: std::vector - getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF, - const ArrayRef UsedDirtyRegs) const { + getLastClobberingInsts(const MCInst &Inst, BinaryFunction &BF, + const ArrayRef UsedDirtyRegs) { if (RegsToTrackInstsFor.empty()) return {}; -auto MaybeState = getStateAt(Ret); +auto MaybeState = getStateBefore(Inst); if (!MaybeState) llvm_unreachable("Expected State to be present"); const State &S = *MaybeState; @@ -434,6 +434,29 @@ static std::shared_ptr tryCheckReturn(const BinaryContext &BC, return std::make_shared(RetKind, Inst, RetReg); } +static std::shared_ptr tryCheckCall(const BinaryContext &BC, +const MCInstReference &Inst, +const State &S) { + static const GadgetKind CallKind("non-protected call found"); + if (!BC.MIB->isCall(Inst) && !BC.MIB->isBranch(Inst)) +return nullptr; + + bool IsAuthenticated = false; + MCPhysReg DestReg = BC.MIB->getRegUsedAsCallDest(Inst, IsAuthenticated); + if (IsAuthenticated || DestReg == BC.MIB->getNoRegister()) +return nullptr; + + LLVM_DEBUG({ +traceInst(BC, "Found call inst", Inst); +traceReg(BC, "Call destination reg", DestReg); +traceRegMask(BC, "SafeToDerefRegs", S.SafeToDerefRegs); + }); + if (S.SafeToDerefRegs[DestReg]) +return nullptr; + + return std::make_shared(CallKind, Inst, DestReg); +} + FunctionAnalysisResult Analysis::computeDfState(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) { @@ -450,10 +473,12 @@ 
Analysis::computeDfState(BinaryFunction &BF, for (BinaryBasicBlock &BB : BF) { for (int64_t I = 0, E = BB.size(); I < E; ++I) { MCInstReference Inst(&BB, I); - const State &S = *PRA.getStateAt(Inst); + const State &S = *PRA.getStateBefore(Inst); if (auto Report = tryCheckReturn(BC, Inst, S)) Result.Diagnostics.push_back(Report); + if (auto Report = tryCheckCall(BC, Inst, S)) +Result.Diagnostics.push_back(Report); } } diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp index d238a1df5c7d7..9ce1514639f95 100644 --- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp +++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp @@ -277,6 +277,48 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { } } + MCPhysReg + getRegUsedAsCallDest(const MCInst &Inst, + bool &IsAuthenticatedInternally) const override { +assert(isCall(Inst) || isBranch(Inst)); +IsAuthenticatedInternally = false; + +switch (Inst.getOpcode()) { +case AArch64::B: +case AArch64::BL: + assert(Inst.getOperand(0).isExpr()); + return getNoRegister(); +case AArch64::Bcc: +case AArch64::CBNZW: +case AArch64::CBNZX: +case AArch64::CBZW: +case AArch64::CBZX: + assert(Inst.getOperand(1).isExpr()); + return getNoRegister(); +case AArch64::TBNZW: +case AArch64::TBNZX: +case AArch64::TBZW: +case AArch64::TBZX: + assert(Ins
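The decision made by `tryCheckCall` in this patch can be summarized with a small standalone model. This is only a sketch under stated assumptions: `ToyCall`, `ToyReport` and the 32-register bitset are hypothetical simplifications, and the two fields of `ToyCall` stand in for what `getRegUsedAsCallDest` reports about an instruction.

```cpp
#include <bitset>
#include <optional>
#include <string>

constexpr int NoRegister = -1;

// Hypothetical summary of a call or branch instruction.
struct ToyCall {
  int DestReg = NoRegister;             // NoRegister for direct calls
  bool AuthenticatedInternally = false; // instruction checks the pointer itself
};

struct ToyReport {
  std::string Kind;
  int Reg;
};

// Reports an indirect call whose destination register is not known to be
// safe to dereference. Direct calls and self-authenticating calls are fine.
std::optional<ToyReport> tryCheckCall(const ToyCall &Call,
                                      const std::bitset<32> &SafeToDerefRegs) {
  if (Call.AuthenticatedInternally || Call.DestReg == NoRegister)
    return std::nullopt;
  if (SafeToDerefRegs.test(Call.DestReg))
    return std::nullopt;
  return ToyReport{"non-protected call found", Call.DestReg};
}
```

The order of checks follows the quoted hunk: the cheap structural tests come first, and the data-flow state is consulted only for genuinely indirect, non-authenticating calls.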
[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Add infastructure to parse parameters (PR #133800)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Finn Plummer (inbelic) Changes - defines `ParamType` as a way to represent a reference to some parameter in a root signature - defines `ParseParam` and `ParseParams` as an infastructure to define how the parameters of a given struct should be parsed in an orderless manner - implements parsing of two param types: `UInt32` and `Register` to demonstrate the parsing implementation and allow for unit testing Part two of implementing: https://github.com/llvm/llvm-project/issues/126569 --- Full diff: https://github.com/llvm/llvm-project/pull/133800.diff 5 Files Affected: - (modified) clang/include/clang/Basic/DiagnosticParseKinds.td (+4-1) - (modified) clang/include/clang/Parse/ParseHLSLRootSignature.h (+40) - (modified) clang/lib/Parse/ParseHLSLRootSignature.cpp (+151-14) - (modified) clang/unittests/Parse/ParseHLSLRootSignatureTest.cpp (+142-4) - (modified) llvm/include/llvm/Frontend/HLSL/HLSLRootSignature.h (+15) ``diff diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td index 2582e1e5ef0f6..ab12159ba5ae1 100644 --- a/clang/include/clang/Basic/DiagnosticParseKinds.td +++ b/clang/include/clang/Basic/DiagnosticParseKinds.td @@ -1830,8 +1830,11 @@ def err_hlsl_virtual_function def err_hlsl_virtual_inheritance : Error<"virtual inheritance is unsupported in HLSL">; -// HLSL Root Siganture diagnostic messages +// HLSL Root Signature Parser Diagnostics def err_hlsl_unexpected_end_of_params : Error<"expected %0 to denote end of parameters, or, another valid parameter of %1">; +def err_hlsl_rootsig_repeat_param : Error<"specified the same parameter '%0' multiple times">; +def err_hlsl_rootsig_missing_param : Error<"did not specify mandatory parameter '%0'">; +def err_hlsl_number_literal_overflow : Error<"integer literal is too large to be represented as a 32-bit %select{signed |}0 integer type">; } // end of Parser diagnostics diff --git 
a/clang/include/clang/Parse/ParseHLSLRootSignature.h b/clang/include/clang/Parse/ParseHLSLRootSignature.h index 43b41315b88b5..02e99e83875db 100644 --- a/clang/include/clang/Parse/ParseHLSLRootSignature.h +++ b/clang/include/clang/Parse/ParseHLSLRootSignature.h @@ -69,6 +69,46 @@ class RootSignatureParser { bool parseDescriptorTable(); bool parseDescriptorTableClause(); + /// Each unique ParamType will have a custom parse method defined that we can + /// use to invoke the parameters. + /// + /// This function will switch on the ParamType using std::visit and dispatch + /// onto the corresponding parse method + bool parseParam(llvm::hlsl::rootsig::ParamType Ref); + + /// Parameter arguments (eg. `bReg`, `space`, ...) can be specified in any + /// order, exactly once, and only a subset are mandatory. This function acts + /// as the infastructure to do so in a declarative way. + /// + /// For the example: + /// SmallDenseMap Params = { + ///TokenKind::bReg, &Clause.Register, + ///TokenKind::kw_space, &Clause.Space + /// }; + /// SmallDenseSet Mandatory = { + ///TokenKind::kw_numDescriptors + /// }; + /// + /// We can read it is as: + /// + /// when 'b0' is encountered, invoke the parse method for the type + /// of &Clause.Register (Register *) and update the parameter + /// when 'space' is encountered, invoke a parse method for the type + /// of &Clause.Space (uint32_t *) and update the parameter + /// + /// and 'bReg' must be specified + bool parseParams( + llvm::SmallDenseMap &Params, + llvm::SmallDenseSet &Mandatory); + + /// Parameter parse methods corresponding to a ParamType + bool parseUIntParam(uint32_t *X); + bool parseRegister(llvm::hlsl::rootsig::Register *Reg); + + /// Use NumericLiteralParser to convert CurToken.NumSpelling into a unsigned + /// 32-bit integer + bool handleUIntLiteral(uint32_t *X); + /// Invoke the Lexer to consume a token and update CurToken with the result void consumeNextToken() { CurToken = Lexer.ConsumeToken(); } diff --git 
a/clang/lib/Parse/ParseHLSLRootSignature.cpp b/clang/lib/Parse/ParseHLSLRootSignature.cpp index 33caca5fa1c82..62d29baea49d3 100644 --- a/clang/lib/Parse/ParseHLSLRootSignature.cpp +++ b/clang/lib/Parse/ParseHLSLRootSignature.cpp @@ -8,6 +8,8 @@ #include "clang/Parse/ParseHLSLRootSignature.h" +#include "clang/Lex/LiteralSupport.h" + #include "llvm/Support/raw_ostream.h" using namespace llvm::hlsl::rootsig; @@ -39,12 +41,11 @@ bool RootSignatureParser::parse() { break; } - if (!tryConsumeExpectedToken(TokenKind::end_of_stream)) { -getDiags().Report(CurToken.TokLoc, diag::err_hlsl_unexpected_end_of_params) -<< /*expected=*/TokenKind::end_of_stream -<< /*param of=*/TokenKind::kw_RootSignature; + if (consumeExpectedToken(TokenKind::end_of_stream, + diag::err_hlsl_unexpected_end_of_params, + /*param of=
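The doc comment on `parseParams` above describes parameters that may appear in any order, at most once each, with a mandatory subset. A minimal standalone sketch of that idea follows. It does not use the patch's types: `ParseResult`, `Token`, and `parseParamsSketch` are hypothetical names, `std::map`/`std::set` stand in for the LLVM containers, and every parameter is assumed to be a plain `uint32_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names are from the patch.
enum class ParseResult { Ok, UnknownParam, RepeatedParam, MissingParam };

// A pre-lexed "keyword = value" pair.
using Token = std::pair<std::string, uint32_t>;

// Accept parameters in any order, reject repeats, and check that every
// mandatory keyword was seen once the token stream is exhausted.
ParseResult parseParamsSketch(const std::vector<Token> &Tokens,
                              const std::map<std::string, uint32_t *> &Params,
                              const std::set<std::string> &Mandatory) {
  std::set<std::string> Seen;
  for (const Token &Tok : Tokens) {
    auto It = Params.find(Tok.first);
    if (It == Params.end())
      return ParseResult::UnknownParam;
    // insert() reports whether the keyword was new.
    if (!Seen.insert(Tok.first).second)
      return ParseResult::RepeatedParam;
    assert(It->second && "destination pointer must be non-null");
    *It->second = Tok.second;
  }
  for (const std::string &Name : Mandatory)
    if (!Seen.count(Name))
      return ParseResult::MissingParam;
  return ParseResult::Ok;
}
```

The two failure cases loosely mirror the new `err_hlsl_rootsig_repeat_param` and `err_hlsl_rootsig_missing_param` diagnostics; the real parser reports through the diagnostics engine rather than an enum.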
[llvm-branch-commits] [llvm] [DAG][AArch64] Handle truncated buildvectors to allow and(subvector(anyext)) fold. (PR #133915)
https://github.com/davemgreen updated https://github.com/llvm/llvm-project/pull/133915 >From 35f44f31a41e485c7098a66bff99c4dfc424bb8d Mon Sep 17 00:00:00 2001 From: David Green Date: Tue, 1 Apr 2025 15:15:08 +0100 Subject: [PATCH 1/2] [DAG][AArch64] Handle truncated buildvectors to allow and(subvector(anyext)) fold. This fold was not handling the extended BUILDVECTORs that we see when i8/i16 are not legal types. Using isConstOrConstSplat(N1, false, true) allows it to match truncated constants. The other changes are to make sure that truncated values in N1C are treated correctly, the fold we are mostly interested in is ``` if (N0.getOpcode() == ISD::EXTRACT_SUBVECTOR && N0.hasOneUse() && N1C && ISD::isExtOpcode(N0.getOperand(0).getOpcode())) { ``` --- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 6 +- .../aarch64-neon-vector-insert-uaddlv.ll | 12 +-- llvm/test/CodeGen/AArch64/bitcast-extend.ll | 4 +- llvm/test/CodeGen/AArch64/ctlz.ll | 3 +- llvm/test/CodeGen/AArch64/ctpop.ll| 3 +- llvm/test/CodeGen/AArch64/itofp.ll| 90 +++ .../AArch64/vec3-loads-ext-trunc-stores.ll| 23 ++--- llvm/test/CodeGen/AArch64/vector-fcvt.ll | 36 +++- 8 files changed, 63 insertions(+), 114 deletions(-) diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp index dc5c5f38e3bd8..4bb52e9075297 100644 --- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp @@ -7166,7 +7166,8 @@ SDValue DAGCombiner::visitAND(SDNode *N) { // if (and x, c) is known to be zero, return 0 unsigned BitWidth = VT.getScalarSizeInBits(); - ConstantSDNode *N1C = isConstOrConstSplat(N1); + ConstantSDNode *N1C = + isConstOrConstSplat(N1, /*AllowUndef*/ false, /*AllowTrunc*/ true); if (N1C && DAG.MaskedValueIsZero(SDValue(N, 0), APInt::getAllOnes(BitWidth))) return DAG.getConstant(0, DL, VT); @@ -7205,7 +7206,8 @@ SDValue DAGCombiner::visitAND(SDNode *N) { return DAG.getNode(ISD::ZERO_EXTEND, DL, VT, N0Op0); // fold (and 
(any_ext V), c) -> (zero_ext (and (trunc V), c)) if profitable. -if (N1C->getAPIntValue().countLeadingZeros() >= (BitWidth - SrcBitWidth) && +APInt N1APInt = N1C->getAPIntValue().trunc(VT.getScalarSizeInBits()); +if (N1APInt.countLeadingZeros() >= (BitWidth - SrcBitWidth) && TLI.isTruncateFree(VT, SrcVT) && TLI.isZExtFree(SrcVT, VT) && TLI.isTypeDesirableForOp(ISD::AND, SrcVT) && TLI.isNarrowingProfitable(N, VT, SrcVT)) diff --git a/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll b/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll index 412f39f8adc1b..f37767291ca14 100644 --- a/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll +++ b/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll @@ -282,8 +282,7 @@ define void @insert_vec_v16i8_uaddlv_from_v8i8(ptr %0) { ; CHECK-NEXT:uaddlv.8b h1, v0 ; CHECK-NEXT:stp q0, q0, [x0, #32] ; CHECK-NEXT:mov.b v2[0], v1[0] -; CHECK-NEXT:zip1.8b v2, v2, v2 -; CHECK-NEXT:bic.4h v2, #255, lsl #8 +; CHECK-NEXT:ushll.8h v2, v2, #0 ; CHECK-NEXT:ushll.4s v2, v2, #0 ; CHECK-NEXT:ucvtf.4s v2, v2 ; CHECK-NEXT:stp q2, q0, [x0] @@ -305,8 +304,7 @@ define void @insert_vec_v8i8_uaddlv_from_v8i8(ptr %0) { ; CHECK-NEXT:stp xzr, xzr, [x0, #16] ; CHECK-NEXT:uaddlv.8b h1, v0 ; CHECK-NEXT:mov.b v0[0], v1[0] -; CHECK-NEXT:zip1.8b v0, v0, v0 -; CHECK-NEXT:bic.4h v0, #255, lsl #8 +; CHECK-NEXT:ushll.8h v0, v0, #0 ; CHECK-NEXT:ushll.4s v0, v0, #0 ; CHECK-NEXT:ucvtf.4s v0, v0 ; CHECK-NEXT:str q0, [x0] @@ -436,8 +434,7 @@ define void @insert_vec_v8i8_uaddlv_from_v4i32(ptr %0) { ; CHECK-NEXT:stp xzr, xzr, [x0, #16] ; CHECK-NEXT:uaddlv.4s d0, v0 ; CHECK-NEXT:mov.b v1[0], v0[0] -; CHECK-NEXT:zip1.8b v1, v1, v1 -; CHECK-NEXT:bic.4h v1, #255, lsl #8 +; CHECK-NEXT:ushll.8h v1, v1, #0 ; CHECK-NEXT:ushll.4s v1, v1, #0 ; CHECK-NEXT:ucvtf.4s v1, v1 ; CHECK-NEXT:str q1, [x0] @@ -461,8 +458,7 @@ define void @insert_vec_v16i8_uaddlv_from_v4i32(ptr %0) { ; CHECK-NEXT:uaddlv.4s d0, v0 ; CHECK-NEXT:stp q2, q2, [x0, #32] ; 
CHECK-NEXT:mov.b v1[0], v0[0] -; CHECK-NEXT:zip1.8b v1, v1, v1 -; CHECK-NEXT:bic.4h v1, #255, lsl #8 +; CHECK-NEXT:ushll.8h v1, v1, #0 ; CHECK-NEXT:ushll.4s v1, v1, #0 ; CHECK-NEXT:ucvtf.4s v1, v1 ; CHECK-NEXT:stp q1, q2, [x0] diff --git a/llvm/test/CodeGen/AArch64/bitcast-extend.ll b/llvm/test/CodeGen/AArch64/bitcast-extend.ll index 5dc335900a798..08a7493d0ba7f 100644 --- a/llvm/test/CodeGen/AArch64/bitcast-extend.ll +++ b/llvm/test/CodeGen/AArch64/bitcast-extend.ll @@ -6,8 +6,8 @@ define <4 x i16> @z_i32_v4i16(i32 %x) { ; CHECK-SD-LABEL: z_i32_v4i16: ; CHECK-SD: // %bb.0: ; CHECK-SD-NEXT:fmov s0, w0 -; CHECK-SD-NEXT:zip1 v0.8b, v0
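The reason the DAGCombiner change truncates the constant before counting leading zeros can be illustrated without LLVM types. This is only a sketch under assumptions: plain `uint64_t` stands in for `APInt`, and `truncToBits`, `countLeadingZerosIn`, and `maskFitsInSource` are hypothetical helpers, not LLVM APIs. The point is that a constant matched with truncation allowed may be stored wider than the vector element, so the count has to be taken at the element width.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low `Bits` bits of V (1 <= Bits <= 64).
uint64_t truncToBits(uint64_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64);
  return Bits == 64 ? V : (V & ((uint64_t{1} << Bits) - 1));
}

// Number of leading zero bits of V when viewed as a `Bits`-wide value.
unsigned countLeadingZerosIn(uint64_t V, unsigned Bits) {
  V = truncToBits(V, Bits);
  unsigned N = 0;
  for (unsigned I = Bits; I-- > 0;) {
    if (V & (uint64_t{1} << I))
      break;
    ++N;
  }
  return N;
}

// Analogue of the patched condition: the mask, viewed at the element width
// BitWidth, must have at least (BitWidth - SrcBitWidth) leading zeros.
bool maskFitsInSource(uint64_t Mask, unsigned BitWidth, unsigned SrcBitWidth) {
  assert(SrcBitWidth <= BitWidth);
  return countLeadingZerosIn(Mask, BitWidth) >= BitWidth - SrcBitWidth;
}
```

For example, the value `0xFF` has 24 leading zeros when counted at 32 bits but none when counted at 8 bits, so counting at the storage width of an untruncated constant could make the check pass when it should not.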
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -0,0 +1,73 @@
+; RUN: llc <%s --mtriple s390x-ibm-zos --filetype=obj -o - | \
+; RUN: od -Ax -tx1 -v | FileCheck --ignore-case %s
+; REQUIRES: systemz-registered-target

redstar wrote:

Removed.

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpDirectiveSpecification in standalone directives (PR #131163)
https://github.com/Leporacanthicus approved this pull request. LGTM, thanks for the work! https://github.com/llvm/llvm-project/pull/131163
[llvm-branch-commits] [clang] [clang-tools-extra] [clang] support pack expansions for trailing requires clauses (PR #133190)
https://github.com/mizvekov updated https://github.com/llvm/llvm-project/pull/133190 >From 65a4c47a81e9e294f5d3c8f1afbe1f9036ac8e4b Mon Sep 17 00:00:00 2001 From: Matheus Izvekov Date: Wed, 26 Mar 2025 18:38:34 -0300 Subject: [PATCH] [clang] support pack expansions for trailing requires clauses This fixes a crash when evaluating constraints from trailing requires clauses, when these are part of a generic lambda which is expanded. --- .../refactor/tweaks/ExtractVariable.cpp | 6 +-- clang/docs/ReleaseNotes.rst | 2 + clang/include/clang/AST/ASTNodeTraverser.h| 4 +- clang/include/clang/AST/Decl.h| 35 +++-- clang/include/clang/AST/DeclCXX.h | 20 clang/include/clang/AST/ExprCXX.h | 2 +- clang/include/clang/AST/RecursiveASTVisitor.h | 9 ++-- clang/include/clang/Sema/Sema.h | 14 ++--- clang/lib/AST/ASTContext.cpp | 7 ++- clang/lib/AST/ASTImporter.cpp | 5 +- clang/lib/AST/Decl.cpp| 16 +++--- clang/lib/AST/DeclCXX.cpp | 33 +++- clang/lib/AST/DeclPrinter.cpp | 10 ++-- clang/lib/AST/DeclTemplate.cpp| 4 +- clang/lib/AST/ExprCXX.cpp | 2 +- clang/lib/AST/ItaniumMangle.cpp | 4 +- clang/lib/ASTMatchers/ASTMatchFinder.cpp | 3 +- clang/lib/Index/IndexDecl.cpp | 4 +- clang/lib/Sema/SemaConcept.cpp| 6 +-- clang/lib/Sema/SemaDecl.cpp | 22 clang/lib/Sema/SemaDeclCXX.cpp| 4 +- clang/lib/Sema/SemaFunctionEffects.cpp| 2 +- clang/lib/Sema/SemaLambda.cpp | 18 --- clang/lib/Sema/SemaOverload.cpp | 12 +++-- clang/lib/Sema/SemaTemplateDeductionGuide.cpp | 51 --- .../lib/Sema/SemaTemplateInstantiateDecl.cpp | 4 +- clang/lib/Sema/TreeTransform.h| 7 ++- clang/lib/Serialization/ASTReaderDecl.cpp | 3 +- clang/lib/Serialization/ASTWriterDecl.cpp | 5 +- .../SemaCXX/fold_lambda_with_variadics.cpp| 9 clang/tools/libclang/CIndex.cpp | 2 +- 31 files changed, 191 insertions(+), 134 deletions(-) diff --git a/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp b/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp index d84e501b87ce7..90dac3b76c648 100644 --- 
a/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp +++ b/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp @@ -100,9 +100,9 @@ computeReferencedDecls(const clang::Expr *Expr) { TraverseLambdaCapture(LExpr, &Capture, Initializer); } - if (clang::Expr *const RequiresClause = - LExpr->getTrailingRequiresClause()) { -TraverseStmt(RequiresClause); + if (const clang::Expr *RequiresClause = + LExpr->getTrailingRequiresClause().ConstraintExpr) { +TraverseStmt(const_cast(RequiresClause)); } for (auto *const TemplateParam : LExpr->getExplicitTemplateParameters()) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index c4e82678949ff..f1066139c8514 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -373,6 +373,8 @@ Bug Fixes to C++ Support - Improved fix for an issue with pack expansions of type constraints, where this now also works if the constraint has non-type or template template parameters. (#GH131798) +- Fix crash when evaluating trailing requires clause of generic lambdas which are part of + a pack expansion. - Fixes matching of nested template template parameters. (#GH130362) - Correctly diagnoses template template paramters which have a pack parameter not in the last position. 
diff --git a/clang/include/clang/AST/ASTNodeTraverser.h b/clang/include/clang/AST/ASTNodeTraverser.h index f086d8134a64b..7bb435146f752 100644 --- a/clang/include/clang/AST/ASTNodeTraverser.h +++ b/clang/include/clang/AST/ASTNodeTraverser.h @@ -538,8 +538,8 @@ class ASTNodeTraverser for (const auto *Parameter : D->parameters()) Visit(Parameter); -if (const Expr *TRC = D->getTrailingRequiresClause()) - Visit(TRC); +if (const AssociatedConstraint &TRC = D->getTrailingRequiresClause()) + Visit(TRC.ConstraintExpr); if (Traversal == TK_IgnoreUnlessSpelledInSource && D->isDefaulted()) return; diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h index 9e7e93d98c9d1..adf3634d205bc 100644 --- a/clang/include/clang/AST/Decl.h +++ b/clang/include/clang/AST/Decl.h @@ -81,13 +81,17 @@ enum class ImplicitParamKind; // Holds a constraint expression along with a pack expansion index, if // expanded. struct AssociatedConstraint { - const Expr *ConstraintExpr; - int ArgumentPackSubstitutionIndex; + const Expr *ConstraintExpr = nullptr; + int ArgumentPackSubstitutionIndex = -1; + + constex
[llvm-branch-commits] [clang] [clang] Template Specialization Resugaring - Template Type Alias (PR #132442)
https://github.com/mizvekov updated https://github.com/llvm/llvm-project/pull/132442 >From 9d5d42820a4998e0e3eb74f7301aa34dca55b890 Mon Sep 17 00:00:00 2001 From: Matheus Izvekov Date: Mon, 30 May 2022 01:46:31 +0200 Subject: [PATCH] [clang] Template Specialization Resugaring - Template Type Alias This implements an additional user of the resugaring transform: the pattern of template type aliases. For more details and discussion see: https://discourse.llvm.org/t/rfc-improving-diagnostics-with-template-specialization-resugaring/64294 Differential Revision: https://reviews.llvm.org/D137199 --- clang/include/clang/Sema/Sema.h | 3 +- clang/lib/Sema/SemaCXXScopeSpec.cpp | 3 +- clang/lib/Sema/SemaCoroutine.cpp | 4 +- clang/lib/Sema/SemaDeclCXX.cpp| 6 ++- clang/lib/Sema/SemaTemplate.cpp | 43 +++ .../lib/Sema/SemaTemplateInstantiateDecl.cpp | 3 +- clang/lib/Sema/TreeTransform.h| 3 +- clang/test/AST/ast-dump-template-decls.cpp| 4 +- clang/test/Sema/Resugar/resugar-types.cpp | 6 +-- 9 files changed, 44 insertions(+), 31 deletions(-) diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h index 945ff5e2c2ca6..42a7bf75c3bfc 100644 --- a/clang/include/clang/Sema/Sema.h +++ b/clang/include/clang/Sema/Sema.h @@ -11509,7 +11509,8 @@ class Sema final : public SemaBase { void NoteAllFoundTemplates(TemplateName Name); - QualType CheckTemplateIdType(TemplateName Template, + QualType CheckTemplateIdType(const NestedNameSpecifier *NNS, + TemplateName Template, SourceLocation TemplateLoc, TemplateArgumentListInfo &TemplateArgs); diff --git a/clang/lib/Sema/SemaCXXScopeSpec.cpp b/clang/lib/Sema/SemaCXXScopeSpec.cpp index 1085639dcb355..1c7dff35bb8af 100644 --- a/clang/lib/Sema/SemaCXXScopeSpec.cpp +++ b/clang/lib/Sema/SemaCXXScopeSpec.cpp @@ -907,7 +907,8 @@ bool Sema::ActOnCXXNestedNameSpecifier(Scope *S, // We were able to resolve the template name to an actual template. // Build an appropriate nested-name-specifier. 
- QualType T = CheckTemplateIdType(Template, TemplateNameLoc, TemplateArgs); + QualType T = CheckTemplateIdType(SS.getScopeRep(), Template, TemplateNameLoc, + TemplateArgs); if (T.isNull()) return true; diff --git a/clang/lib/Sema/SemaCoroutine.cpp b/clang/lib/Sema/SemaCoroutine.cpp index 75364a3b2c8b5..8dffbca7463dd 100644 --- a/clang/lib/Sema/SemaCoroutine.cpp +++ b/clang/lib/Sema/SemaCoroutine.cpp @@ -90,7 +90,7 @@ static QualType lookupPromiseType(Sema &S, const FunctionDecl *FD, // Build the template-id. QualType CoroTrait = - S.CheckTemplateIdType(TemplateName(CoroTraits), KwLoc, Args); + S.CheckTemplateIdType(nullptr, TemplateName(CoroTraits), KwLoc, Args); if (CoroTrait.isNull()) return QualType(); if (S.RequireCompleteType(KwLoc, CoroTrait, @@ -169,7 +169,7 @@ static QualType lookupCoroutineHandleType(Sema &S, QualType PromiseType, // Build the template-id. QualType CoroHandleType = - S.CheckTemplateIdType(TemplateName(CoroHandle), Loc, Args); + S.CheckTemplateIdType(nullptr, TemplateName(CoroHandle), Loc, Args); if (CoroHandleType.isNull()) return QualType(); if (S.RequireCompleteType(Loc, CoroHandleType, diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp index 928bf47285490..8a9ad3271ec26 100644 --- a/clang/lib/Sema/SemaDeclCXX.cpp +++ b/clang/lib/Sema/SemaDeclCXX.cpp @@ -1140,7 +1140,8 @@ static bool lookupStdTypeTraitMember(Sema &S, LookupResult &TraitMemberLookup, } // Build the template-id. 
- QualType TraitTy = S.CheckTemplateIdType(TemplateName(TraitTD), Loc, Args); + QualType TraitTy = + S.CheckTemplateIdType(nullptr, TemplateName(TraitTD), Loc, Args); if (TraitTy.isNull()) return true; if (!S.isCompleteType(Loc, TraitTy)) { @@ -12163,7 +12164,8 @@ QualType Sema::BuildStdInitializerList(QualType Element, SourceLocation Loc) { Context.getTrivialTypeSourceInfo(Element, Loc))); - QualType T = CheckTemplateIdType(TemplateName(StdInitializerList), Loc, Args); + QualType T = + CheckTemplateIdType(nullptr, TemplateName(StdInitializerList), Loc, Args); if (T.isNull()) return QualType(); diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp index 5652b4548895a..673551bd97f3e 100644 --- a/clang/lib/Sema/SemaTemplate.cpp +++ b/clang/lib/Sema/SemaTemplate.cpp @@ -3827,7 +3827,8 @@ void Sema::NoteAllFoundTemplates(TemplateName Name) { } } -static QualType builtinCommonTypeImpl(Sema &S, TemplateName BaseTemplate, +static QualType builtinCommonTypeImpl(Sema &S, const NestedNameSpecifier *NNS, +
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
@@ -80,6 +80,21 @@ enum class RuleTy {
 ForVectorization,
 };

+/// Store the information about if corresponding direction vector was negated

kasuga-fj wrote:

> But I now guess that the complication here is the unique entries in the
> dependency matrix, is that right?

Yes. (But holding two boolean values is a bit redundant. What is actually needed are three states. If both of them are false, it is an illegal state.)

> I am wondering if it isn't easier to keep all the entries and don't make them
> unique?

I think it would be simpler. Also, there is no need to stop making entries unique altogether. If duplicate direction vectors are allowed, I think the simplest implementation would be to keep pairs of a direction vector and a boolean value indicating whether the corresponding vector is negated. However, I'm not sure how effective it is to make direction vectors unique. In the worst case, holding pairs of a vector and a boolean value instead of a single vector doubles the number of entries. Is this allowed?

https://github.com/llvm/llvm-project/pull/133672
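The pair-based representation floated in this comment could look roughly like the sketch below. It is an illustration only, not code from the PR: `DirEntry` and `uniqueEntries` are hypothetical names, and a direction vector is assumed to be representable as a string such as `"<="`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical entry: (direction vector, was it negated?).
using DirEntry = std::pair<std::string, bool>;

// Deduplicate on the whole pair while preserving first-seen order. The same
// vector may survive twice, once per flag value, which is the worst-case
// doubling mentioned in the comment.
std::vector<DirEntry> uniqueEntries(const std::vector<DirEntry> &In) {
  assert(In.size() < In.max_size());
  std::set<DirEntry> Seen;
  std::vector<DirEntry> Out;
  for (const DirEntry &E : In)
    if (Seen.insert(E).second)
      Out.push_back(E);
  return Out;
}
```

Deduplicating on the pair rather than on the vector alone keeps the negation information exact, at the cost of up to twice as many entries.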
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -0,0 +1,148 @@
+//===- MCGOFFSymbolMapper.h - Maps MC section/symbol to GOFF symbols --===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Maps a section or a symbol to the GOFF symbols it is composed of, and their
+// attributes.
+//
+//===--===//
+
+#ifndef LLVM_MC_MCGOFFSYMBOLMAPPER_H
+#define LLVM_MC_MCGOFFSYMBOLMAPPER_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/GOFF.h"
+#include "llvm/Support/Alignment.h"
+#include
+#include
+
+namespace llvm {
+class MCAssembler;
+class MCContext;
+class MCSectionGOFF;
+
+// An "External Symbol Definition" in the GOFF file has a type, and depending on
+// the type a different subset of the fields is used.
+//
+// Unlike other formats, a 2 dimensional structure is used to define the
+// location of data. For example, the equivalent of the ELF .text section is
+// made up of a Section Definition (SD) and a class (Element Definition; ED).
+// The name of the SD symbol depends on the application, while the class has the
+// predefined name C_CODE64.

redstar wrote:

It's AMODE not ILP :-)

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [llvm] release/20.x: [ARM] Speedups for CombineBaseUpdate. (#129725) (PR #130035)
tstellar wrote: @DanielKristofKiss ping https://github.com/llvm/llvm-project/pull/130035
[llvm-branch-commits] [clang] [RISCV] Integrate RISCV target in baremetal toolchain object and deprecate RISCVToolchain object (PR #121831)
https://github.com/quic-garvgupt edited https://github.com/llvm/llvm-project/pull/121831
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts and sext-inreg (PR #132385)
https://github.com/petar-avramovic ready_for_review https://github.com/llvm/llvm-project/pull/132385
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for select (PR #132384)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: Petar Avramovic (petar-avramovic) Changes Uniform condition S1 is AnyExtended to S32 and high bits are cleaned using AND with 1. Divergent S1 uses VCC. Using B32/B64 rules to cover scalars vector and pointer types. Divergent B64 is split to S32. --- Patch is 145.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132384.diff 4 Files Affected: - (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp (+18-1) - (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp (+6-2) - (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h (+1) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir (+624-1277) ``diff diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp index 7301cba9e8ed3..0f5f3545ac8eb 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp @@ -243,6 +243,22 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI, MI.eraseFromParent(); break; } + case SplitTo32Sel: { +Register Dst = MI.getOperand(0).getReg(); +LLT Ty = MRI.getType(Dst) == V4S16 ? 
V2S16 : S32; +auto Op2 = B.buildUnmerge({VgprRB, Ty}, MI.getOperand(2).getReg()); +auto Op3 = B.buildUnmerge({VgprRB, Ty}, MI.getOperand(3).getReg()); +Register Cond = MI.getOperand(1).getReg(); +auto Flags = MI.getFlags(); +auto ResLo = +B.buildSelect({VgprRB, Ty}, Cond, Op2.getReg(0), Op3.getReg(0), Flags); +auto ResHi = +B.buildSelect({VgprRB, Ty}, Cond, Op2.getReg(1), Op3.getReg(1), Flags); + +B.buildMergeLikeInstr(Dst, {ResLo, ResHi}); +MI.eraseFromParent(); +break; + } case Div_BFE: { Register Dst = MI.getOperand(0).getReg(); assert(MRI.getType(Dst) == LLT::scalar(64)); @@ -453,7 +469,8 @@ LLT RegBankLegalizeHelper::getBTyFromID(RegBankLLTMappingApplyID ID, LLT Ty) { case UniInVgprB64: if (Ty == LLT::scalar(64) || Ty == LLT::fixed_vector(2, 32) || Ty == LLT::fixed_vector(4, 16) || Ty == LLT::pointer(0, 64) || -Ty == LLT::pointer(1, 64) || Ty == LLT::pointer(4, 64)) +Ty == LLT::pointer(1, 64) || Ty == LLT::pointer(4, 64) || +Ty == LLT::pointer(999, 64)) return Ty; return LLT(); case SgprB96: diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp index b4ef4ecc3fe28..96b0a7d634f7e 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp @@ -198,7 +198,7 @@ UniformityLLTOpPredicateID LLTToBId(LLT Ty) { return B32; if (Ty == LLT::scalar(64) || Ty == LLT::fixed_vector(2, 32) || Ty == LLT::fixed_vector(4, 16) || Ty == LLT::pointer(1, 64) || - Ty == LLT::pointer(4, 64)) + Ty == LLT::pointer(4, 64) || Ty == LLT::pointer(999, 64)) return B64; if (Ty == LLT::fixed_vector(3, 32)) return B96; @@ -485,8 +485,12 @@ RegBankLegalizeRules::RegBankLegalizeRules(const GCNSubtarget &_ST, addRulesForGOpcs({G_BR}).Any({{_}, {{}, {None}}}); addRulesForGOpcs({G_SELECT}, StandardB) + .Any({{DivS16}, {{Vgpr16}, {Vcc, Vgpr16, Vgpr16}}}) + .Any({{UniS16}, {{Sgpr16}, {Sgpr32AExtBoolInReg, Sgpr16, Sgpr16}}}) .Div(B32, {{VgprB32}, {Vcc, VgprB32, 
VgprB32}}) - .Uni(B32, {{SgprB32}, {Sgpr32AExtBoolInReg, SgprB32, SgprB32}}); + .Uni(B32, {{SgprB32}, {Sgpr32AExtBoolInReg, SgprB32, SgprB32}}) + .Div(B64, {{VgprB64}, {Vcc, VgprB64, VgprB64}, SplitTo32Sel}) + .Uni(B64, {{SgprB64}, {Sgpr32AExtBoolInReg, SgprB64, SgprB64}}); addRulesForGOpcs({G_ANYEXT}) .Any({{UniS16, S1}, {{None}, {None}}}) // should be combined away diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h index cdf70d99d4a9e..058e58c1a94ce 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h @@ -177,6 +177,7 @@ enum LoweringMethodID { Div_BFE, VgprToVccCopy, SplitTo32, + SplitTo32Sel, Ext32To64, UniCstExt, SplitLoad, diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir index 810724dab685d..762f7b9500367 100644 --- a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir +++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir @@ -1,6 +1,5 @@ # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py -# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=regbankselect -global-isel %s -verify-machineinstrs -o - -regbankselect-fast | FileCheck -check-prefix=FAST %s -# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=regbankselect -global-isel %s -verify-machineinstrs -o - -regban
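The `SplitTo32Sel` lowering above unmerges both 64-bit select operands into two 32-bit pieces, selects each piece with the same condition, and merges the results. The same arithmetic can be sketched on plain integers. This is an assumption-laden illustration: `select64ViaHalves` is a hypothetical name, and ordinary `uint64_t` values stand in for virtual registers.

```cpp
#include <cassert>
#include <cstdint>

// Compute (Cond ? A : B) on 64-bit values using only 32-bit selects,
// following the unmerge / select / select / merge shape of the lowering.
uint64_t select64ViaHalves(bool Cond, uint64_t A, uint64_t B) {
  // "Unmerge": split each operand into low and high 32-bit halves.
  uint32_t ALo = static_cast<uint32_t>(A);
  uint32_t AHi = static_cast<uint32_t>(A >> 32);
  uint32_t BLo = static_cast<uint32_t>(B);
  uint32_t BHi = static_cast<uint32_t>(B >> 32);

  // Two independent 32-bit selects sharing one condition.
  uint32_t ResLo = Cond ? ALo : BLo;
  uint32_t ResHi = Cond ? AHi : BHi;

  // "Merge": reassemble the 64-bit result.
  uint64_t Res = (static_cast<uint64_t>(ResHi) << 32) | ResLo;
  assert(Res == (Cond ? A : B));
  return Res;
}
```

Because both halves use the same condition, the merged value is always exactly one of the two inputs; in the patch the condition is the divergent `Vcc` operand, and the piece type is `V2S16` instead of `S32` when the destination is `V4S16`.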
[llvm-branch-commits] [clang] release/20.x: [modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214) (PR #134232)
https://github.com/dmpolukhin created https://github.com/llvm/llvm-project/pull/134232 Fix for regression https://github.com/llvm/llvm-project/issues/130917, changes in https://github.com/llvm/llvm-project/pull/111992 were too broad. This change reduces scope of previous fix. Added `ExternalASTSource::wasThisDeclarationADefinition` to detect cases when FunctionDecl lost body due to declaration merges. >From 73ed00f5ef37fc19495bee13d0366fe093c5ac10 Mon Sep 17 00:00:00 2001 From: Dmitry Polukhin <34227995+dmpoluk...@users.noreply.github.com> Date: Thu, 3 Apr 2025 08:27:13 +0100 Subject: [PATCH 1/2] [modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214) Fix for regression #130917, changes in #111992 were too broad. This change reduces scope of previous fix. Added `ExternalASTSource::wasThisDeclarationADefinition` to detect cases when FunctionDecl lost body due to declaration merges. --- clang/include/clang/AST/ExternalASTSource.h | 4 ++ .../clang/Sema/MultiplexExternalSemaSource.h | 2 + clang/include/clang/Serialization/ASTReader.h | 6 +++ clang/lib/AST/ExternalASTSource.cpp | 4 ++ .../lib/Sema/MultiplexExternalSemaSource.cpp | 8 .../lib/Sema/SemaTemplateInstantiateDecl.cpp | 12 +++--- clang/lib/Serialization/ASTReader.cpp | 4 ++ clang/lib/Serialization/ASTReaderDecl.cpp | 3 ++ .../friend-default-parameters-modules.cpp | 39 +++ .../SemaCXX/friend-default-parameters.cpp | 21 ++ 10 files changed, 98 insertions(+), 5 deletions(-) create mode 100644 clang/test/SemaCXX/friend-default-parameters-modules.cpp create mode 100644 clang/test/SemaCXX/friend-default-parameters.cpp diff --git a/clang/include/clang/AST/ExternalASTSource.h b/clang/include/clang/AST/ExternalASTSource.h index 42aed56d42e07..f45e3af7602c1 100644 --- a/clang/include/clang/AST/ExternalASTSource.h +++ b/clang/include/clang/AST/ExternalASTSource.h @@ -191,6 +191,10 @@ class ExternalASTSource : public RefCountedBase { virtual ExtKind 
hasExternalDefinitions(const Decl *D); + /// True if this function declaration was a definition before in its own + /// module. + virtual bool wasThisDeclarationADefinition(const FunctionDecl *FD); + /// Finds all declarations lexically contained within the given /// DeclContext, after applying an optional filter predicate. /// diff --git a/clang/include/clang/Sema/MultiplexExternalSemaSource.h b/clang/include/clang/Sema/MultiplexExternalSemaSource.h index 921bebe3a44af..391c2177d75ec 100644 --- a/clang/include/clang/Sema/MultiplexExternalSemaSource.h +++ b/clang/include/clang/Sema/MultiplexExternalSemaSource.h @@ -92,6 +92,8 @@ class MultiplexExternalSemaSource : public ExternalSemaSource { ExtKind hasExternalDefinitions(const Decl *D) override; + bool wasThisDeclarationADefinition(const FunctionDecl *FD) override; + /// Find all declarations with the given name in the /// given context. bool FindExternalVisibleDeclsByName(const DeclContext *DC, diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h index 47301419c76c6..23c98282f228f 100644 --- a/clang/include/clang/Serialization/ASTReader.h +++ b/clang/include/clang/Serialization/ASTReader.h @@ -1392,6 +1392,10 @@ class ASTReader llvm::DenseMap DefinitionSource; + /// Friend functions that were defined but might have had their bodies + /// removed. + llvm::DenseSet ThisDeclarationWasADefinitionSet; + bool shouldDisableValidationForFile(const serialization::ModuleFile &M) const; /// Reads a statement from the specified cursor. @@ -2375,6 +2379,8 @@ class ASTReader ExtKind hasExternalDefinitions(const Decl *D) override; + bool wasThisDeclarationADefinition(const FunctionDecl *FD) override; + /// Retrieve a selector from the given module with its local ID /// number. 
Selector getLocalSelector(ModuleFile &M, unsigned LocalID); diff --git a/clang/lib/AST/ExternalASTSource.cpp b/clang/lib/AST/ExternalASTSource.cpp index e2451f294741d..3e865cb7679b5 100644 --- a/clang/lib/AST/ExternalASTSource.cpp +++ b/clang/lib/AST/ExternalASTSource.cpp @@ -38,6 +38,10 @@ ExternalASTSource::hasExternalDefinitions(const Decl *D) { return EK_ReplyHazy; } +bool ExternalASTSource::wasThisDeclarationADefinition(const FunctionDecl *FD) { + return false; +} + void ExternalASTSource::FindFileRegionDecls(FileID File, unsigned Offset, unsigned Length, SmallVectorImpl &Decls) {} diff --git a/clang/lib/Sema/MultiplexExternalSemaSource.cpp b/clang/lib/Sema/MultiplexExternalSemaSource.cpp index 6d945300c386c..fbfb242598c24 100644 --- a/clang/lib/Sema/MultiplexExternalSemaSource.cpp +++ b/clang/lib/Sema/MultiplexExternalSemaSource.cpp @@ -107,6 +107,14 @@ MultiplexExternalSemaSource::hasExternalDefinitions(const Decl
[llvm-branch-commits] [clang] [clang][HeuristicResolver] Default argument heuristic for template parameters (PR #131074)
https://github.com/HighCommander4 updated https://github.com/llvm/llvm-project/pull/131074 >From 556926d2644160405958a5d01963714f97ab522e Mon Sep 17 00:00:00 2001 From: Nathan Ridge Date: Thu, 13 Mar 2025 01:23:03 -0400 Subject: [PATCH] [clang][HeuristicResolver] Default argument heuristic for template parameters --- clang/lib/Sema/HeuristicResolver.cpp | 17 ++ .../unittests/Sema/HeuristicResolverTest.cpp | 34 +++ 2 files changed, 51 insertions(+) diff --git a/clang/lib/Sema/HeuristicResolver.cpp b/clang/lib/Sema/HeuristicResolver.cpp index d377379c627db..7c88a3097a044 100644 --- a/clang/lib/Sema/HeuristicResolver.cpp +++ b/clang/lib/Sema/HeuristicResolver.cpp @@ -11,7 +11,9 @@ #include "clang/AST/CXXInheritance.h" #include "clang/AST/DeclTemplate.h" #include "clang/AST/ExprCXX.h" +#include "clang/AST/TemplateBase.h" #include "clang/AST/Type.h" +#include "llvm/Support/Casting.h" namespace clang { @@ -122,6 +124,7 @@ TemplateName getReferencedTemplateName(const Type *T) { // resolves it to a CXXRecordDecl in which we can try name lookup. TagDecl *HeuristicResolverImpl::resolveTypeToTagDecl(QualType QT) { const Type *T = QT.getTypePtrOrNull(); + if (!T) return nullptr; @@ -245,6 +248,20 @@ QualType HeuristicResolverImpl::simplifyType(QualType Type, const Expr *E, } } } +if (const auto *TTPT = dyn_cast_if_present(T.Type)) { + // We can't do much useful with a template parameter (e.g. we cannot look + // up member names inside it). However, if the template parameter has a + // default argument, as a heuristic we can replace T with the default + // argument type. 
+ if (const auto *TTPD = TTPT->getDecl()) { +if (TTPD->hasDefaultArgument()) { + const auto &DefaultArg = TTPD->getDefaultArgument().getArgument(); + if (DefaultArg.getKind() == TemplateArgument::Type) { +return {DefaultArg.getAsType()}; + } +} + } +} return T; }; // As an additional protection against infinite loops, bound the number of diff --git a/clang/unittests/Sema/HeuristicResolverTest.cpp b/clang/unittests/Sema/HeuristicResolverTest.cpp index c7cfe7917c532..f7eb4b23c2ab0 100644 --- a/clang/unittests/Sema/HeuristicResolverTest.cpp +++ b/clang/unittests/Sema/HeuristicResolverTest.cpp @@ -410,6 +410,40 @@ TEST(HeuristicResolver, MemberExpr_HangIssue126536) { cxxDependentScopeMemberExpr(hasMemberName("foo")).bind("input")); } +TEST(HeuristicResolver, MemberExpr_DefaultTemplateArgument) { + std::string Code = R"cpp( +struct Default { + void foo(); +}; +template +void bar(T t) { + t.foo(); +} + )cpp"; + // Test resolution of "foo" in "t.foo()". + expectResolution( + Code, &HeuristicResolver::resolveMemberExpr, + cxxDependentScopeMemberExpr(hasMemberName("foo")).bind("input"), + cxxMethodDecl(hasName("foo")).bind("output")); +} + +TEST(HeuristicResolver, MemberExpr_DefaultTemplateArgument_Recursive) { + std::string Code = R"cpp( +struct Default { + void foo(); +}; +template +void bar(T t) { + t.foo(); +} + )cpp"; + // Test resolution of "foo" in "t.foo()". + expectResolution( + Code, &HeuristicResolver::resolveMemberExpr, + cxxDependentScopeMemberExpr(hasMemberName("foo")).bind("input"), + cxxMethodDecl(hasName("foo")).bind("output")); +} + TEST(HeuristicResolver, DeclRefExpr_StaticMethod) { std::string Code = R"cpp( template ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
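For readers who want to see the kind of code this heuristic targets, here is a minimal standalone sketch. The template parameter lists in the test strings above did not survive archiving, so the `template <typename T = Default>` form is an assumption, and the `Called` counter and `callBar` helper are hypothetical names added only so the example can be observed from outside.

```cpp
// Hypothetical counter so the call to foo() is observable.
static int Called = 0;

struct Default {
  void foo() { ++Called; }
};

// Assumed shape of the test input: T carries a default argument. Inside the
// template, `t.foo()` is a dependent member call, so nothing is known about
// T's members. A heuristic can still guess that T is `Default` and look up
// `foo` there, which is what the patch describes.
template <typename T = Default>
void bar(T t) {
  t.foo();
}

// Hypothetical helper: here T is deduced as Default from the argument, which
// is the same type the heuristic would guess without any call site.
inline void callBar() { bar(Default{}); }
```

The point of the sketch is only that the default argument is a reasonable guess for `T` when no instantiation is available; it says nothing about how `HeuristicResolver` is implemented beyond what the diff shows.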
[llvm-branch-commits] [clang] release/20.x: [clang-format] Allow `Language: Cpp` for C files (#133033) (PR #133216)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/133216

>From c1c4d7191d7078216b9c8793e46fff84a8c7a02d Mon Sep 17 00:00:00 2001
From: Owen Pan
Date: Thu, 27 Mar 2025 01:00:02 -0700
Subject: [PATCH] [clang-format] Allow `Language: Cpp` for C files (#133033)

Fix #132832

(cherry picked from commit 05fb8408de23c3ccb6125b6886742177755bd757)
---
 clang/lib/Format/Format.cpp                | 18 ++
 clang/unittests/Format/ConfigParseTest.cpp | 20
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 0bb8545884442..768e655f65ce7 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -2114,10 +2114,14 @@ std::error_code parseConfiguration(llvm::MemoryBufferRef Config,
   FormatStyle::FormatStyleSet StyleSet;
   bool LanguageFound = false;
   for (const FormatStyle &Style : llvm::reverse(Styles)) {
-    if (Style.Language != FormatStyle::LK_None)
+    const auto Lang = Style.Language;
+    if (Lang != FormatStyle::LK_None)
       StyleSet.Add(Style);
-    if (Style.Language == Language)
+    if (Lang == Language ||
+        // For backward compatibility.
+        (Lang == FormatStyle::LK_Cpp && Language == FormatStyle::LK_C)) {
       LanguageFound = true;
+    }
   }
   if (!LanguageFound) {
     if (Styles.empty() || Styles[0].Language != FormatStyle::LK_None)
@@ -2157,8 +2161,14 @@ FormatStyle::FormatStyleSet::Get(FormatStyle::LanguageKind Language) const {
   if (!Styles)
     return std::nullopt;
   auto It = Styles->find(Language);
-  if (It == Styles->end())
-    return std::nullopt;
+  if (It == Styles->end()) {
+    if (Language != FormatStyle::LK_C)
+      return std::nullopt;
+    // For backward compatibility.
+    It = Styles->find(FormatStyle::LK_Cpp);
+    if (It == Styles->end())
+      return std::nullopt;
+  }
   FormatStyle Style = It->second;
   Style.StyleSet = *this;
   return Style;
diff --git a/clang/unittests/Format/ConfigParseTest.cpp b/clang/unittests/Format/ConfigParseTest.cpp
index 10788449a1a1d..fcf07e660ddb6 100644
--- a/clang/unittests/Format/ConfigParseTest.cpp
+++ b/clang/unittests/Format/ConfigParseTest.cpp
@@ -1214,6 +1214,26 @@ TEST(ConfigParseTest, ParsesConfigurationWithLanguages) {
               IndentWidth, 56u);
 }
 
+TEST(ConfigParseTest, AllowCppForC) {
+  FormatStyle Style = {};
+  Style.Language = FormatStyle::LK_C;
+  EXPECT_EQ(parseConfiguration("Language: Cpp", &Style), ParseError::Success);
+
+  CHECK_PARSE("---\n"
+              "IndentWidth: 4\n"
+              "---\n"
+              "Language: Cpp\n"
+              "IndentWidth: 8\n",
+              IndentWidth, 8u);
+
+  EXPECT_EQ(parseConfiguration("---\n"
+                               "Language: ObjC\n"
+                               "---\n"
+                               "Language: Cpp\n",
+                               &Style),
+            ParseError::Success);
+}
+
 TEST(ConfigParseTest, UsesLanguageForBasedOnStyle) {
   FormatStyle Style = {};
   Style.Language = FormatStyle::LK_JavaScript;
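The fallback in `FormatStyleSet::Get` can be sketched in isolation as follows. This is a simplified model rather than the clang-format code: `LanguageKind`, `Style`, `StyleMap`, and `getStyle` are hypothetical stand-ins, and the real `FormatStyle` carries far more state.

```cpp
#include <map>
#include <optional>

// Hypothetical stand-ins for FormatStyle::LanguageKind and FormatStyle.
enum class LanguageKind { None, C, Cpp, ObjC };

struct Style {
  LanguageKind Language = LanguageKind::None;
  unsigned IndentWidth = 2;
};

using StyleMap = std::map<LanguageKind, Style>;

// Return the style configured for Language. If there is none and the request
// is for C, fall back to the Cpp entry (the backward-compatibility case).
// Every other missing language yields std::nullopt.
std::optional<Style> getStyle(const StyleMap &Styles, LanguageKind Language) {
  auto It = Styles.find(Language);
  if (It == Styles.end()) {
    if (Language != LanguageKind::C)
      return std::nullopt;
    It = Styles.find(LanguageKind::Cpp);
    if (It == Styles.end())
      return std::nullopt;
  }
  return It->second;
}
```

With only a `Cpp` entry in the map, a lookup for `C` returns the `Cpp` style, while a lookup for `ObjC` still fails. An explicit `C` entry, when present, takes precedence over the fallback.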
[llvm-branch-commits] [clang] [llvm] release/20.x: [Hexagon] Set the default compilation target to V68 (#125239) (PR #128597)
iajbar wrote:

> > Given 20.1.1 was just released, is the plan still to get this one into 20.x? (Just asking to know whether we should make a corresponding change in Zig.)
>
> @quic-akaryaki @iajbar - can we get this/these changes in?

Yes, please.

https://github.com/llvm/llvm-project/pull/128597
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
redstar wrote: I implemented the suggestion from @uweigand. The GOFF attributes are set directly at the `MCSectionGOFF`, and the `GOFFSymbolMapper` is gone. I still need to update a couple of tests, since now the section names have changed. https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [clang] c1c4d71 - [clang-format] Allow `Language: Cpp` for C files (#133033)
Author: Owen Pan
Date: 2025-03-28T23:14:52-07:00
New Revision: c1c4d7191d7078216b9c8793e46fff84a8c7a02d

URL: https://github.com/llvm/llvm-project/commit/c1c4d7191d7078216b9c8793e46fff84a8c7a02d
DIFF: https://github.com/llvm/llvm-project/commit/c1c4d7191d7078216b9c8793e46fff84a8c7a02d.diff

LOG: [clang-format] Allow `Language: Cpp` for C files (#133033)

Fix #132832

(cherry picked from commit 05fb8408de23c3ccb6125b6886742177755bd757)

Added:

Modified:
    clang/lib/Format/Format.cpp
    clang/unittests/Format/ConfigParseTest.cpp

Removed:

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 0bb8545884442..768e655f65ce7 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -2114,10 +2114,14 @@ std::error_code parseConfiguration(llvm::MemoryBufferRef Config,
   FormatStyle::FormatStyleSet StyleSet;
   bool LanguageFound = false;
   for (const FormatStyle &Style : llvm::reverse(Styles)) {
-    if (Style.Language != FormatStyle::LK_None)
+    const auto Lang = Style.Language;
+    if (Lang != FormatStyle::LK_None)
       StyleSet.Add(Style);
-    if (Style.Language == Language)
+    if (Lang == Language ||
+        // For backward compatibility.
+        (Lang == FormatStyle::LK_Cpp && Language == FormatStyle::LK_C)) {
       LanguageFound = true;
+    }
   }
   if (!LanguageFound) {
     if (Styles.empty() || Styles[0].Language != FormatStyle::LK_None)
@@ -2157,8 +2161,14 @@ FormatStyle::FormatStyleSet::Get(FormatStyle::LanguageKind Language) const {
   if (!Styles)
     return std::nullopt;
   auto It = Styles->find(Language);
-  if (It == Styles->end())
-    return std::nullopt;
+  if (It == Styles->end()) {
+    if (Language != FormatStyle::LK_C)
+      return std::nullopt;
+    // For backward compatibility.
+    It = Styles->find(FormatStyle::LK_Cpp);
+    if (It == Styles->end())
+      return std::nullopt;
+  }
   FormatStyle Style = It->second;
   Style.StyleSet = *this;
   return Style;
diff --git a/clang/unittests/Format/ConfigParseTest.cpp b/clang/unittests/Format/ConfigParseTest.cpp
index 10788449a1a1d..fcf07e660ddb6 100644
--- a/clang/unittests/Format/ConfigParseTest.cpp
+++ b/clang/unittests/Format/ConfigParseTest.cpp
@@ -1214,6 +1214,26 @@ TEST(ConfigParseTest, ParsesConfigurationWithLanguages) {
               IndentWidth, 56u);
 }
 
+TEST(ConfigParseTest, AllowCppForC) {
+  FormatStyle Style = {};
+  Style.Language = FormatStyle::LK_C;
+  EXPECT_EQ(parseConfiguration("Language: Cpp", &Style), ParseError::Success);
+
+  CHECK_PARSE("---\n"
+              "IndentWidth: 4\n"
+              "---\n"
+              "Language: Cpp\n"
+              "IndentWidth: 8\n",
+              IndentWidth, 8u);
+
+  EXPECT_EQ(parseConfiguration("---\n"
+                               "Language: ObjC\n"
+                               "---\n"
+                               "Language: Cpp\n",
+                               &Style),
+            ParseError::Success);
+}
+
 TEST(ConfigParseTest, UsesLanguageForBasedOnStyle) {
   FormatStyle Style = {};
   Style.Language = FormatStyle::LK_JavaScript;
[llvm-branch-commits] [clang] release/20.x: [modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214) (PR #134232)
llvmbot wrote: @llvm/pr-subscribers-clang-modules Author: Dmitry Polukhin (dmpolukhin) Changes Fix for regression https://github.com/llvm/llvm-project/issues/130917, changes in https://github.com/llvm/llvm-project/pull/111992 were too broad. This change reduces scope of previous fix. Added `ExternalASTSource::wasThisDeclarationADefinition` to detect cases when FunctionDecl lost body due to declaration merges. --- Full diff: https://github.com/llvm/llvm-project/pull/134232.diff 11 Files Affected: - (modified) clang/docs/ReleaseNotes.rst (+1) - (modified) clang/include/clang/AST/ExternalASTSource.h (+4) - (modified) clang/include/clang/Sema/MultiplexExternalSemaSource.h (+2) - (modified) clang/include/clang/Serialization/ASTReader.h (+6) - (modified) clang/lib/AST/ExternalASTSource.cpp (+4) - (modified) clang/lib/Sema/MultiplexExternalSemaSource.cpp (+8) - (modified) clang/lib/Sema/SemaTemplateInstantiateDecl.cpp (+7-5) - (modified) clang/lib/Serialization/ASTReader.cpp (+4) - (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+3) - (added) clang/test/SemaCXX/friend-default-parameters-modules.cpp (+39) - (added) clang/test/SemaCXX/friend-default-parameters.cpp (+21) ``diff diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index f4befc242f28b..e57fa9786e6f2 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1065,6 +1065,7 @@ Bug Fixes to C++ Support - Fixed an incorrect pointer access when checking access-control on concepts. (#GH131530) - Fixed various alias CTAD bugs involving variadic template arguments. (#GH123591), (#GH127539), (#GH129077), (#GH129620), and (#GH129998). +- Fixed the false compilation error "redefinition of default argument" for friend functions with default parameters. 
(#GH130917) Bug Fixes to AST Handling ^ diff --git a/clang/include/clang/AST/ExternalASTSource.h b/clang/include/clang/AST/ExternalASTSource.h index 42aed56d42e07..f45e3af7602c1 100644 --- a/clang/include/clang/AST/ExternalASTSource.h +++ b/clang/include/clang/AST/ExternalASTSource.h @@ -191,6 +191,10 @@ class ExternalASTSource : public RefCountedBase { virtual ExtKind hasExternalDefinitions(const Decl *D); + /// True if this function declaration was a definition before in its own + /// module. + virtual bool wasThisDeclarationADefinition(const FunctionDecl *FD); + /// Finds all declarations lexically contained within the given /// DeclContext, after applying an optional filter predicate. /// diff --git a/clang/include/clang/Sema/MultiplexExternalSemaSource.h b/clang/include/clang/Sema/MultiplexExternalSemaSource.h index 921bebe3a44af..391c2177d75ec 100644 --- a/clang/include/clang/Sema/MultiplexExternalSemaSource.h +++ b/clang/include/clang/Sema/MultiplexExternalSemaSource.h @@ -92,6 +92,8 @@ class MultiplexExternalSemaSource : public ExternalSemaSource { ExtKind hasExternalDefinitions(const Decl *D) override; + bool wasThisDeclarationADefinition(const FunctionDecl *FD) override; + /// Find all declarations with the given name in the /// given context. bool FindExternalVisibleDeclsByName(const DeclContext *DC, diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h index 47301419c76c6..23c98282f228f 100644 --- a/clang/include/clang/Serialization/ASTReader.h +++ b/clang/include/clang/Serialization/ASTReader.h @@ -1392,6 +1392,10 @@ class ASTReader llvm::DenseMap DefinitionSource; + /// Friend functions that were defined but might have had their bodies + /// removed. + llvm::DenseSet ThisDeclarationWasADefinitionSet; + bool shouldDisableValidationForFile(const serialization::ModuleFile &M) const; /// Reads a statement from the specified cursor. 
@@ -2375,6 +2379,8 @@ class ASTReader ExtKind hasExternalDefinitions(const Decl *D) override; + bool wasThisDeclarationADefinition(const FunctionDecl *FD) override; + /// Retrieve a selector from the given module with its local ID /// number. Selector getLocalSelector(ModuleFile &M, unsigned LocalID); diff --git a/clang/lib/AST/ExternalASTSource.cpp b/clang/lib/AST/ExternalASTSource.cpp index e2451f294741d..3e865cb7679b5 100644 --- a/clang/lib/AST/ExternalASTSource.cpp +++ b/clang/lib/AST/ExternalASTSource.cpp @@ -38,6 +38,10 @@ ExternalASTSource::hasExternalDefinitions(const Decl *D) { return EK_ReplyHazy; } +bool ExternalASTSource::wasThisDeclarationADefinition(const FunctionDecl *FD) { + return false; +} + void ExternalASTSource::FindFileRegionDecls(FileID File, unsigned Offset, unsigned Length, SmallVectorImpl &Decls) {} diff --git a/clang/lib/Sema/MultiplexExternalSemaSource.cpp b/clang/lib/Sema/MultiplexExternalSemaSource.cpp index 6d945300c386c..fbfb242598c24 100644 --- a/clang/lib/Sema/MultiplexExternalSemaSource.cpp +++ b/clang
[llvm-branch-commits] [BOLT][NFC] Pre-disasm metadata rewriters (PR #132113)
https://github.com/aaupov ready_for_review https://github.com/llvm/llvm-project/pull/132113
[llvm-branch-commits] [llvm] [AMDGPU] Support image_bvh8_intersect_ray instruction and intrinsic. (PR #130041)
@@ -1509,18 +1509,18 @@ multiclass MIMG_Gather : MIMG_Gather;
-class MIMG_IntersectRay_Helper {
-  int num_addrs = !if(Is64, !if(IsA16, 9, 12), !if(IsA16, 8, 11));
+class MIMG_IntersectRay_Helper {
+  int num_addrs = !if(isBVH8, 11, !if(Is64, !if(IsA16, 9, 12), !if(IsA16, 8, 11)));
   RegisterClass RegClass = MIMGAddrSize.RegClass;
   int VAddrDwords = !srl(RegClass.Size, 5);
   int GFX11PlusNSAAddrs = !if(IsA16, 4, 5);
   RegisterClass node_ptr_type = !if(Is64, VReg_64, VGPR_32);
   list GFX11PlusAddrTypes =
-    !if(isDual, [VReg_64, VReg_64, VReg_96, VReg_96, VReg_64],
-        !if(IsA16,
-            [node_ptr_type, VGPR_32, VReg_96, VReg_96],
-            [node_ptr_type, VGPR_32, VReg_96, VReg_96, VReg_96]));
+    !cond(!eq(isBVH8, 1) : [node_ptr_type, VReg_64, VReg_96, VReg_96, VGPR_32],
+          !eq(isDual, 1) : [node_ptr_type, VReg_64, VReg_96, VReg_96, VReg_64],
+          !eq(IsA16, 0) : [node_ptr_type, VGPR_32, VReg_96, VReg_96, VReg_96],
+          !eq(IsA16, 1) : [node_ptr_type, VGPR_32, VReg_96, VReg_96]);

mbrkusanin wrote:

```suggestion
    !cond(isBVH8 : [node_ptr_type, VReg_64, VReg_96, VReg_96, VGPR_32],
          isDual : [node_ptr_type, VReg_64, VReg_96, VReg_96, VReg_64],
          IsA16 : [node_ptr_type, VGPR_32, VReg_96, VReg_96],
          true : [node_ptr_type, VGPR_32, VReg_96, VReg_96, VReg_96]);
```

!eq(X, 1) is redundant here, and last two options can be swapped

https://github.com/llvm/llvm-project/pull/130041
[llvm-branch-commits] [llvm] [GlobalISel] Combine redundant sext_inreg (PR #131624)
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/131624 >From 3f3c67934d0c9ea34c11cbd24becc24541baf567 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Mon, 17 Mar 2025 13:54:59 +0100 Subject: [PATCH 1/2] [GlobalISel] Combine redundant sext_inreg --- .../llvm/CodeGen/GlobalISel/CombinerHelper.h | 3 + .../include/llvm/Target/GlobalISel/Combine.td | 9 +- .../GlobalISel/CombinerHelperCasts.cpp| 27 +++ .../combine-redundant-sext-inreg.mir | 164 ++ .../combine-sext-trunc-sextinreg.mir | 87 ++ .../CodeGen/AMDGPU/GlobalISel/llvm.abs.ll | 5 - 6 files changed, 289 insertions(+), 6 deletions(-) create mode 100644 llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir create mode 100644 llvm/test/CodeGen/AMDGPU/GlobalISel/combine-sext-trunc-sextinreg.mir diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h index 9b78342c8fc39..5778377d125a8 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h @@ -994,6 +994,9 @@ class CombinerHelper { // overflow sub bool matchSuboCarryOut(const MachineInstr &MI, BuildFnTy &MatchInfo) const; + // (sext_inreg (sext_inreg x, K0), K1) + void applyRedundantSextInReg(MachineInstr &Root, MachineInstr &Other) const; + private: /// Checks for legality of an indexed variant of \p LdSt. 
bool isIndexedLoadStoreLegal(GLoadStore &LdSt) const; diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td b/llvm/include/llvm/Target/GlobalISel/Combine.td index 660b03080f92e..6a0ff683a4647 100644 --- a/llvm/include/llvm/Target/GlobalISel/Combine.td +++ b/llvm/include/llvm/Target/GlobalISel/Combine.td @@ -1849,6 +1849,12 @@ def anyext_of_anyext : ext_of_ext_opcodes; def anyext_of_zext : ext_of_ext_opcodes; def anyext_of_sext : ext_of_ext_opcodes; +def sext_inreg_of_sext_inreg : GICombineRule< + (defs root:$dst), + (match (G_SEXT_INREG $x, $src, $a):$other, + (G_SEXT_INREG $dst, $x, $b):$root), + (apply [{ Helper.applyRedundantSextInReg(*${root}, *${other}); }])>; + // Push cast through build vector. class buildvector_of_opcode : GICombineRule < (defs root:$root, build_fn_matchinfo:$matchinfo), @@ -1896,7 +1902,8 @@ def cast_of_cast_combines: GICombineGroup<[ sext_of_anyext, anyext_of_anyext, anyext_of_zext, - anyext_of_sext + anyext_of_sext, + sext_inreg_of_sext_inreg, ]>; def cast_combines: GICombineGroup<[ diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp index 576fd5fd81703..883a62c308232 100644 --- a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp +++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp @@ -378,3 +378,30 @@ bool CombinerHelper::matchCastOfInteger(const MachineInstr &CastMI, return false; } } + +void CombinerHelper::applyRedundantSextInReg(MachineInstr &Root, + MachineInstr &Other) const { + assert(Root.getOpcode() == TargetOpcode::G_SEXT_INREG && + Other.getOpcode() == TargetOpcode::G_SEXT_INREG); + + unsigned RootWidth = Root.getOperand(2).getImm(); + unsigned OtherWidth = Other.getOperand(2).getImm(); + + Register Dst = Root.getOperand(0).getReg(); + Register OtherDst = Other.getOperand(0).getReg(); + Register Src = Other.getOperand(1).getReg(); + + if (RootWidth >= OtherWidth) { +// The root sext_inreg is entirely redundant because the other one +// is 
narrower. +Observer.changingAllUsesOfReg(MRI, Dst); +MRI.replaceRegWith(Dst, OtherDst); +Observer.finishedChangingAllUsesOfReg(); + } else { +// RootWidth < OtherWidth, rewrite this G_SEXT_INREG with the source of the +// other G_SEXT_INREG. +Builder.buildSExtInReg(Dst, Src, RootWidth); + } + + Root.eraseFromParent(); +} diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir new file mode 100644 index 0..566ee8e6c338d --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir @@ -0,0 +1,164 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -run-pass=amdgpu-regbank-combiner -verify-machineinstrs %s -o - | FileCheck %s + +--- +name: inreg8_inreg16 +tracksRegLiveness: true +body: | + bb.0: +liveins: $vgpr0 +; CHECK-LABEL: name: inreg8_inreg16 +; CHECK: liveins: $vgpr0 +; CHECK-NEXT: {{ $}} +; CHECK-NEXT: %copy:_(s32) = COPY $vgpr0 +; CHECK-NEXT: %inreg:_(s32) = G_SEXT_INREG %copy, 8 +; CHECK-NEXT: $vgpr0 = COPY %inreg(s32) +%copy:_(s32) = COPY $vgpr0 +%inreg:_(s32) = G_SEXT_INREG %copy, 8 +%inreg1:_(s32) = G_SEXT_INREG %inreg, 16 +$vgpr0 = COPY %inreg1 +... + +
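The reasoning behind the combine can be checked with a small model: two stacked sign-extensions in a register behave like a single one using the narrower width. The sketch below models `G_SEXT_INREG` on a 32-bit value. `sextInReg`, `nested`, and `combined` are hypothetical names for illustration and are not part of `CombinerHelper`.

```cpp
#include <algorithm>
#include <cstdint>

// Sign-extend the low Width bits of X to 32 bits (1 <= Width <= 32),
// modelling G_SEXT_INREG on an s32 value.
constexpr int32_t sextInReg(int32_t X, unsigned Width) {
  const uint32_t Mask = Width >= 32 ? ~0u : ((1u << Width) - 1u);
  const uint32_t Sign = 1u << (Width - 1);
  const uint32_t Low = static_cast<uint32_t>(X) & Mask;
  // Flipping and then subtracting the sign bit propagates it upwards.
  return static_cast<int32_t>((Low ^ Sign) - Sign);
}

// Nested form: (sext_inreg (sext_inreg x, InnerWidth), OuterWidth).
constexpr int32_t nested(int32_t X, unsigned InnerWidth, unsigned OuterWidth) {
  return sextInReg(sextInReg(X, InnerWidth), OuterWidth);
}

// Combined form: one sext_inreg with the narrower of the two widths.
constexpr int32_t combined(int32_t X, unsigned InnerWidth,
                           unsigned OuterWidth) {
  return sextInReg(X, std::min(InnerWidth, OuterWidth));
}

// Outer width >= inner width: the outer extension changes nothing.
static_assert(nested(0x1234ABCD, 8, 16) == combined(0x1234ABCD, 8, 16),
              "outer sext_inreg is redundant");
// Outer width < inner width: only the outer width matters.
static_assert(nested(0x1234ABCD, 16, 8) == combined(0x1234ABCD, 16, 8),
              "inner sext_inreg can be bypassed");
```

The two `static_assert` cases correspond to the two branches of `applyRedundantSextInReg`: when the root width is at least the other width the root is replaced by the other instruction's result, and otherwise the root is rebuilt on the other instruction's source.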
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor analysis of RET instructions (PR #131897)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/131897 >From 136dc3d8728a3511bd524d416059c289f0118100 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 17 Mar 2025 19:28:25 +0300 Subject: [PATCH 1/2] [BOLT] Gadget scanner: refactor analysis of RET instructions In preparation for implementing detection of more gadget kinds, refactor checking for non-protected return instructions. --- bolt/include/bolt/Passes/PAuthGadgetScanner.h | 23 ++- bolt/lib/Passes/PAuthGadgetScanner.cpp| 138 ++ 2 files changed, 95 insertions(+), 66 deletions(-) diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index 2d8109f8ca43b..f102f1080e2e8 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -199,19 +199,34 @@ struct Report { virtual void generateReport(raw_ostream &OS, const BinaryContext &BC) const = 0; + virtual const ArrayRef getAffectedRegisters() const { return {}; } + virtual void + setOverwritingInstrs(const std::vector &Instrs) {} + void printBasicInfo(raw_ostream &OS, const BinaryContext &BC, StringRef IssueKind) const; }; struct GadgetReport : public Report { const GadgetKind &Kind; + SmallVector AffectedRegisters; std::vector OverwritingInstrs; GadgetReport(const GadgetKind &Kind, MCInstReference Location, - std::vector OverwritingInstrs) - : Report(Location), Kind(Kind), OverwritingInstrs(OverwritingInstrs) {} + const BitVector &AffectedRegisters) + : Report(Location), Kind(Kind), +AffectedRegisters(AffectedRegisters.set_bits()) {} void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; + + const ArrayRef getAffectedRegisters() const override { +return AffectedRegisters; + } + + void + setOverwritingInstrs(const std::vector &Instrs) override { +OverwritingInstrs = Instrs; + } }; /// Report with a free-form message attached. 
@@ -224,7 +239,6 @@ struct GenericReport : public Report { }; struct FunctionAnalysisResult { - SmallSet RegistersAffected; std::vector> Diagnostics; }; @@ -232,8 +246,7 @@ class Analysis : public BinaryFunctionPass { void runOnFunction(BinaryFunction &Function, MCPlusBuilder::AllocatorIdTy AllocatorId); FunctionAnalysisResult - computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF, - MCPlusBuilder::AllocatorIdTy AllocatorId); + computeDfState(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId); std::map AnalysisResults; std::mutex AnalysisResultsMutex; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index f71866cd07548..14236e85e9c7b 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -353,7 +353,7 @@ class PacRetAnalysis public: std::vector getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF, - const BitVector &UsedDirtyRegs) const { + const ArrayRef UsedDirtyRegs) const { if (RegsToTrackInstsFor.empty()) return {}; auto MaybeState = getStateAt(Ret); @@ -362,7 +362,7 @@ class PacRetAnalysis const State &S = *MaybeState; // Due to aliasing registers, multiple registers may have been tracked. 
std::set LastWritingInsts; -for (MCPhysReg TrackedReg : UsedDirtyRegs.set_bits()) { +for (MCPhysReg TrackedReg : UsedDirtyRegs) { for (const MCInst *Inst : lastWritingInsts(S, TrackedReg)) LastWritingInsts.insert(Inst); } @@ -376,57 +376,81 @@ class PacRetAnalysis } }; +static std::shared_ptr tryCheckReturn(const BinaryContext &BC, + const MCInstReference &Inst, + const State &S) { + static const GadgetKind RetKind("non-protected ret found"); + if (!BC.MIB->isReturn(Inst)) +return nullptr; + + ErrorOr MaybeRetReg = BC.MIB->getRegUsedAsRetDest(Inst); + if (MaybeRetReg.getError()) { +return std::make_shared( +Inst, "Warning: pac-ret analysis could not analyze this return " + "instruction"); + } + MCPhysReg RetReg = *MaybeRetReg; + LLVM_DEBUG({ +traceInst(BC, "Found RET inst", Inst); +traceReg(BC, "RetReg", RetReg); +traceReg(BC, "Authenticated reg", BC.MIB->getAuthenticatedReg(Inst)); + }); + if (BC.MIB->isAuthenticationOfReg(Inst, RetReg)) +return nullptr; + BitVector UsedDirtyRegs = S.NonAutClobRegs; + LLVM_DEBUG({ traceRegMask(BC, "NonAutClobRegs at Ret", UsedDirtyRegs); }); + UsedDirtyRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true); + LLVM_DEBUG({ traceRegMask(BC, "Intersection with RetReg", UsedDirtyRegs); }); + if (!UsedDirtyRegs.any()) +return nullptr; + + return std::make_shared(RetKind,
[llvm-branch-commits] [clang] [llvm] [AMDGPU][Attributor] Rework update of `AAAMDWavesPerEU` (PR #123995)
@@ -1336,6 +1311,59 @@ static void addPreloadKernArgHint(Function &F, TargetMachine &TM) {
   }
 }
 
+static void checkWavesPerEU(Module &M, TargetMachine &TM) {
+  for (Function &F : M) {
+    const GCNSubtarget &ST = TM.getSubtarget(F);
+
+    auto FlatWgrpSizeAttr =
+        AMDGPU::getIntegerPairAttribute(F, "amdgpu-flat-work-group-size");
+    auto WavesPerEUAttr = AMDGPU::getIntegerPairAttribute(
+        F, "amdgpu-waves-per-eu", /*OnlyFirstRequired=*/true);
+
+    unsigned MinWavesPerEU = ST.getMinWavesPerEU();
+    unsigned MaxWavesPerEU = ST.getMaxWavesPerEU();
+
+    unsigned MinFlatWgrpSize = 1U;
+    unsigned MaxFlatWgrpSize = 1024U;
+    if (FlatWgrpSizeAttr.has_value()) {
+      MinFlatWgrpSize = FlatWgrpSizeAttr->first;
+      MaxFlatWgrpSize = *(FlatWgrpSizeAttr->second);
+    }

arsenm wrote:

```suggestion
    if (FlatWgrpSizeAttr)
      std::tie(MinFlatWgrpSize, MaxFlatWgrpSize) = *FlatWgrpSizeAttr;
```

https://github.com/llvm/llvm-project/pull/123995
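The `std::tie` idiom in the suggestion can be tried outside LLVM with a short sketch. It assumes the attribute value is a `std::optional<std::pair<unsigned, unsigned>>`. The quoted code dereferences `second`, so the real pair type may differ, and the sketch sidesteps that by using plain `unsigned` members. `flatWorkGroupSizeRange` is a hypothetical helper name.

```cpp
#include <optional>
#include <tuple>
#include <utility>

// Hypothetical helper: start from the defaults (1 and 1024) and overwrite
// both bounds at once when an attribute value is present.
std::pair<unsigned, unsigned> flatWorkGroupSizeRange(
    const std::optional<std::pair<unsigned, unsigned>> &Attr) {
  unsigned Min = 1U;
  unsigned Max = 1024U;
  if (Attr)
    std::tie(Min, Max) = *Attr; // assigns first to Min and second to Max
  return {Min, Max};
}
```

Compared with assigning `first` and `second` separately, the single `std::tie` assignment keeps the two bounds visibly paired and removes the braces around the `if` body.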
[llvm-branch-commits] [CI] Exclude docs directories from triggering rebuilds (PR #133185)
https://github.com/boomanaiden154 updated https://github.com/llvm/llvm-project/pull/133185
[llvm-branch-commits] [libcxx] release/20.x: [libcxx] [test] Fix restoring LLVM_DIR and Clang_DIR (#132838) (PR #133153)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/133153
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -169,6 +169,91 @@ enum SubsectionKind : uint8_t {
   SK_PPA1 = 2,
   SK_PPA2 = 4,
 };
+
+// The standard System/390 convention is to name the high-order (leftmost) bit
+// in a byte as bit zero. The Flags type helps to set bits in byte according
+// to this numeration order.
+class Flags {
+  uint8_t Val;
+
+  constexpr static uint8_t bits(uint8_t BitIndex, uint8_t Length, uint8_t Value,
+                                uint8_t OldValue) {
+    uint8_t Pos = 8 - BitIndex - Length;
+    uint8_t Mask = ((1 << Length) - 1) << Pos;
+    Value = Value << Pos;
+    return (OldValue & ~Mask) | Value;
+  }
+
+public:
+  constexpr Flags() : Val(0) {}
+  constexpr Flags(uint8_t BitIndex, uint8_t Length, uint8_t Value)
+      : Val(bits(BitIndex, Length, Value, 0)) {}
+
+  template
+  constexpr void set(uint8_t BitIndex, uint8_t Length, T NewValue) {
+    Val = bits(BitIndex, Length, static_cast(NewValue), Val);
+  }
+
+  template
+  constexpr T get(uint8_t BitIndex, uint8_t Length) const {
+    return static_cast((Val >> (8 - BitIndex - Length)) &
+                       ((1 << Length) - 1));
+  }
+
+  constexpr operator uint8_t() const { return Val; }
+};
+
+// Structure for the flag field of a symbol. See
+// https://www.ibm.com/docs/en/zos/3.1.0?topic=formats-external-symbol-definition-record,
+// offset 41, for the definition.
+struct SymbolFlags {
+  Flags SymFlags;
+
+#define GOFF_SYMBOL_FLAG(NAME, TYPE, BITINDEX, LENGTH) \
+  void set##NAME(TYPE Val) { SymFlags.set(BITINDEX, LENGTH, Val); } \
+  TYPE get##NAME() const { return SymFlags.get(BITINDEX, LENGTH); }
+
+  GOFF_SYMBOL_FLAG(FillBytePresence, bool, 0, 1)
+  GOFF_SYMBOL_FLAG(Mangled, bool, 1, 1)
+  GOFF_SYMBOL_FLAG(Renameable, bool, 2, 1)
+  GOFF_SYMBOL_FLAG(RemovableClass, bool, 3, 1)
+  GOFF_SYMBOL_FLAG(ReservedQwords, ESDReserveQwords, 5, 3)
+
+#undef GOFF_SYMBOL_FLAG
+
+constexpr operator uint8_t() const { return static_cast(SymFlags); }

redstar wrote:

Changed.

https://github.com/llvm/llvm-project/pull/133799
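The bit numbering used by `Flags` (bit 0 is the leftmost bit of the byte) is easy to get backwards, so here is a standalone sketch of the same arithmetic with compile-time checks. `setBits` and `getBits` are hypothetical free functions, not the names in the patch. Unlike the quoted `bits` helper, the sketch also masks `Value`, so a value wider than `Length` cannot spill into neighbouring fields.

```cpp
#include <cstdint>

// Store a Length-bit Value at BitIndex, where bit 0 is the most significant
// bit of the byte. Requires BitIndex + Length <= 8.
constexpr uint8_t setBits(uint8_t BitIndex, uint8_t Length, uint8_t Value,
                          uint8_t OldValue) {
  const unsigned Pos = 8u - BitIndex - Length; // distance from the LSB
  const unsigned Mask = ((1u << Length) - 1u) << Pos;
  return static_cast<uint8_t>((OldValue & ~Mask) |
                              ((static_cast<unsigned>(Value) << Pos) & Mask));
}

// Read the Length-bit field starting at BitIndex (same numbering).
constexpr uint8_t getBits(uint8_t BitIndex, uint8_t Length, uint8_t Byte) {
  return static_cast<uint8_t>((Byte >> (8u - BitIndex - Length)) &
                              ((1u << Length) - 1u));
}

static_assert(setBits(0, 1, 1, 0) == 0x80, "bit 0 is the leftmost bit");
static_assert(setBits(5, 3, 0b101, 0) == 0x05, "bits 5..7 are the low bits");
static_assert(getBits(5, 3, setBits(5, 3, 0b101, 0x80)) == 0b101,
              "a field reads back unchanged");
```

Under this numbering a one-bit flag at index 0, such as `FillBytePresence` in the quoted macro list, lands in `0x80`, and the three-bit `ReservedQwords` field at index 5 occupies the low three bits.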
[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Pre-commit test for fixing tls-le symbol type (PR #132361)
https://github.com/SixWeining approved this pull request. https://github.com/llvm/llvm-project/pull/132361
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for select (PR #132384)
https://github.com/petar-avramovic ready_for_review https://github.com/llvm/llvm-project/pull/132384
[llvm-branch-commits] [llvm] release/20.x: [X86] When expanding LCMPXCHG16B_SAVE_RBX, substitute RBX in base (#134109) (PR #134331)
aaronpuchert wrote: You might have to (formally) approve the changes. https://github.com/llvm/llvm-project/pull/134331
[llvm-branch-commits] [llvm] llvm-reduce: Try to preserve instruction metadata as argument attributes (PR #133557)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/133557
[llvm-branch-commits] [llvm] [ctxprof] Support for "move" semantics for the contextual root (PR #134192)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/134192 >From f9b3bfa82d671dc4f67001762e28bb57ea154ebf Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Wed, 2 Apr 2025 18:39:14 -0700 Subject: [PATCH] [ctxprof] Support for "move" semantics for the contextual root --- .../Transforms/Utils/FunctionImportUtils.h| 25 llvm/lib/Transforms/IPO/FunctionImport.cpp| 18 .../Transforms/Utils/FunctionImportUtils.cpp | 29 ++- .../ThinLTO/X86/ctxprof-separate-module.ll| 22 -- 4 files changed, 70 insertions(+), 24 deletions(-) diff --git a/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h b/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h index 6d83b615d5f13..28ba20bc18cf9 100644 --- a/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h +++ b/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h @@ -97,29 +97,14 @@ class FunctionImportGlobalProcessing { /// linkage for a required promotion of a local to global scope. GlobalValue::LinkageTypes getLinkage(const GlobalValue *SGV, bool DoPromote); + /// The symbols with these names are moved to a different module and should be + /// promoted to external linkage where they are defined. + DenseSet SymbolsToMove; + public: FunctionImportGlobalProcessing(Module &M, const ModuleSummaryIndex &Index, SetVector *GlobalsToImport, - bool ClearDSOLocalOnDeclarations) - : M(M), ImportIndex(Index), GlobalsToImport(GlobalsToImport), -ClearDSOLocalOnDeclarations(ClearDSOLocalOnDeclarations) { -// If we have a ModuleSummaryIndex but no function to import, -// then this is the primary module being compiled in a ThinLTO -// backend compilation, and we need to see if it has functions that -// may be exported to another backend compilation. -if (!GlobalsToImport) - HasExportedFunctions = ImportIndex.hasExportedFunctions(M); - -#ifndef NDEBUG -SmallVector Vec; -// First collect those in the llvm.used set. 
-collectUsedGlobalVariables(M, Vec, /*CompilerUsed=*/false); -// Next collect those in the llvm.compiler.used set. -collectUsedGlobalVariables(M, Vec, /*CompilerUsed=*/true); -Used = {llvm::from_range, Vec}; -#endif - } - + bool ClearDSOLocalOnDeclarations); void run(); }; diff --git a/llvm/lib/Transforms/IPO/FunctionImport.cpp b/llvm/lib/Transforms/IPO/FunctionImport.cpp index 3d9fb7b12b5d5..50100a63cf407 100644 --- a/llvm/lib/Transforms/IPO/FunctionImport.cpp +++ b/llvm/lib/Transforms/IPO/FunctionImport.cpp @@ -182,6 +182,15 @@ static cl::opt CtxprofMoveRootsToOwnModule( "their own module."), cl::Hidden, cl::init(false)); +cl::list MoveSymbolGUID( +"thinlto-move-symbols", +cl::desc( +"Move the symbols with the given name. This will delete these symbols " +"wherever they are originally defined, and make sure their " +"linkage is External where they are imported. It is meant to be " +"used with the name of contextual profiling roots."), +cl::Hidden); + namespace llvm { extern cl::opt EnableMemProfContextDisambiguation; } @@ -1858,6 +1867,15 @@ Expected FunctionImporter::importFunctions( LLVM_DEBUG(dbgs() << "Starting import for Module " << DestModule.getModuleIdentifier() << "\n"); unsigned ImportedCount = 0, ImportedGVCount = 0; + // Before carrying out any imports, see if this module defines functions in + // MoveSymbolGUID. If it does, delete them here (but leave the declaration). + // The function will be imported elsewhere, as extenal linkage, and the + // destination doesn't yet have its definition. 
+ DenseSet MoveSymbolGUIDSet; + MoveSymbolGUIDSet.insert_range(MoveSymbolGUID); + for (auto &F : DestModule) +if (!F.isDeclaration() && MoveSymbolGUIDSet.contains(F.getGUID())) + F.deleteBody(); IRMover Mover(DestModule); diff --git a/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp b/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp index ae1af943bc11c..81e461e28df17 100644 --- a/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp +++ b/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp @@ -24,6 +24,31 @@ static cl::opt UseSourceFilenameForPromotedLocals( "This requires that the source filename has a unique name / " "path to avoid name collisions.")); +extern cl::list MoveSymbolGUID; + +FunctionImportGlobalProcessing::FunctionImportGlobalProcessing( +Module &M, const ModuleSummaryIndex &Index, +SetVector *GlobalsToImport, bool ClearDSOLocalOnDeclarations) +: M(M), ImportIndex(Index), GlobalsToImport(GlobalsToImport), + ClearDSOLocalOnDeclarations(ClearDSOLocalOnDeclarations) { + // If we have a ModuleSummaryIndex but no function to import, + // then this is the primary module being compiled in a ThinLTO + // backen
[llvm-branch-commits] [compiler-rt] [llvm] [ctxprof] Track unhandled call targets (PR #131417)
@@ -265,7 +275,16 @@ Error llvm::createCtxProfFromYAML(StringRef Profile, raw_ostream &Out) {
   if (!TopList)
     return createStringError(
         "Unexpected error converting internal structure to ctx profile");
-  Writer.writeContextual(*TopList, DC.TotalRootEntryCount);
+
+  ctx_profile::ContextNode *FirstUnhandled = nullptr;
+  for (const auto &U : DC.Unhandled) {
+    SerializableCtxRepresentation Unhandled;
+    Unhandled.Guid = U.first;
+    Unhandled.Counters.insert(Unhandled.Counters.begin(), U.second.begin(),

mtrofin wrote:

What do you mean? This copies the counter values from `U.second` to `Unhandled`. `insert` with a first/last iterator pair will, as far as I know, allocate space for the difference (last - first) up front (with caveats about the difference being computable, but the source is also a vector). Also, why would I reverse?

There *is* a more efficient alternative, namely a `createNode` that takes the guid and the counters separately, but it is not quite worth it for the YAML converter, which is a test utility.

https://github.com/llvm/llvm-project/pull/131417
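The copy discussed in this comment can be sketched in isolation. This is a minimal example that assumes both sides are `std::vector<uint64_t>` (the element type is an assumption) and uses a hypothetical helper, `copyCounters`, that is not part of the patch:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: copies Src into a fresh vector with a range insert,
// mirroring the Counters.insert(begin, first, last) call in the quoted hunk.
std::vector<uint64_t> copyCounters(const std::vector<uint64_t> &Src) {
  std::vector<uint64_t> Dst;
  // Vector iterators are random-access, so the distance last - first is
  // known up front and typical implementations size the storage once.
  Dst.insert(Dst.begin(), Src.begin(), Src.end());
  return Dst;
}
```

Inserting a range at `begin()` of an empty vector keeps the source order, so no reversal is involved.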
[llvm-branch-commits] [lldb] release/20.x: [lldb] Use correct path for lldb-server executable (#131519) (PR #134072)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/134072

Backport 945c494e2c3c078e26ff521ef3e9455e0ff764ac

Requested by: @DavidSpickett

From c8c12d84c18a6ebd10151c2e354002a8b6642af3 Mon Sep 17 00:00:00 2001
From: Yuval Deutscher
Date: Mon, 31 Mar 2025 18:20:40 +0300
Subject: [PATCH] [lldb] Use correct path for lldb-server executable (#131519)

Hey,

This solves an issue where running lldb-server-20 with a non-absolute path (for example, when it's installed into `/usr/bin` and the user runs it as `lldb-server-20 ...` and not `/usr/bin/lldb-server-20 ...`) fails with `error: spawn_process failed: execve failed: No such file or directory`. The underlying issue is that when run that way, it attempts to execute a binary named `lldb-server-20` from its current directory. This is also a mild security hazard because lldb-server is often being run as root in the directory /tmp, meaning that an unprivileged user can create the file /tmp/lldb-server-20 and lldb-server will execute it as root. (although, well, it's a debugging server we're talking about, so that may not be a real concern)

I haven't previously contributed to this project; if you want me to change anything in the code please don't hesitate to let me know.
(cherry picked from commit 945c494e2c3c078e26ff521ef3e9455e0ff764ac)
---
 lldb/tools/lldb-server/lldb-platform.cpp | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lldb/tools/lldb-server/lldb-platform.cpp b/lldb/tools/lldb-server/lldb-platform.cpp
index 880b45b989b9c..51174a0f443c3 100644
--- a/lldb/tools/lldb-server/lldb-platform.cpp
+++ b/lldb/tools/lldb-server/lldb-platform.cpp
@@ -31,6 +31,7 @@
 #include "Plugins/Process/gdb-remote/ProcessGDBRemoteLog.h"
 #include "lldb/Host/ConnectionFileDescriptor.h"
 #include "lldb/Host/HostGetOpt.h"
+#include "lldb/Host/HostInfo.h"
 #include "lldb/Host/MainLoop.h"
 #include "lldb/Host/OptionParser.h"
 #include "lldb/Host/Socket.h"
@@ -256,8 +257,9 @@ static void client_handle(GDBRemoteCommunicationServerPlatform &platform,
   printf("Disconnected.\n");
 }

-static Status spawn_process(const char *progname, const Socket *conn_socket,
-                            uint16_t gdb_port, const lldb_private::Args &args,
+static Status spawn_process(const char *progname, const FileSpec &prog,
+                            const Socket *conn_socket, uint16_t gdb_port,
+                            const lldb_private::Args &args,
                             const std::string &log_file,
                             const StringRef log_channels, MainLoop &main_loop) {
   Status error;
@@ -267,9 +269,10 @@ static Status spawn_process(const char *progname, const Socket *conn_socket,

   ProcessLaunchInfo launch_info;

-  FileSpec self_spec(progname, FileSpec::Style::native);
-  launch_info.SetExecutableFile(self_spec, true);
+  launch_info.SetExecutableFile(prog, false);
+  launch_info.SetArg0(progname);
   Args &self_args = launch_info.GetArguments();
+  self_args.AppendArgument(progname);
   self_args.AppendArgument(llvm::StringRef("platform"));
   self_args.AppendArgument(llvm::StringRef("--child-platform-fd"));
   self_args.AppendArgument(llvm::to_string(shared_socket.GetSendableFD()));
@@ -551,9 +554,10 @@ int main_platform(int argc, char *argv[]) {
            log_channels, &main_loop,
            &platform_handles](std::unique_ptr sock_up) {
         printf("Connection established.\n");
-        Status error = spawn_process(progname, sock_up.get(),
-                                     gdbserver_port, inferior_arguments,
-                                     log_file, log_channels, main_loop);
+        Status error = spawn_process(
+            progname, HostInfo::GetProgramFileSpec(), sock_up.get(),
+            gdbserver_port, inferior_arguments, log_file, log_channels,
+            main_loop);
         if (error.Fail()) {
           Log *log = GetLog(LLDBLog::Platform);
           LLDB_LOGF(log, "spawn_process failed: %s", error.AsCString());
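To see why a bare program name is fragile here: `execve` does not search `PATH`, so a name without a directory component is looked up relative to the current working directory. The sketch below only illustrates that lookup with standard C++17 `std::filesystem`; `resolveLikeExecve` is a hypothetical helper and is not lldb code.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns the path that a PATH-less lookup of Name
// refers to, i.e. Name resolved against the current working directory.
// Absolute inputs are returned unchanged.
fs::path resolveLikeExecve(const std::string &Name) {
  return fs::absolute(fs::path(Name));
}
```

Run from `/tmp`, `resolveLikeExecve("lldb-server-20")` names `/tmp/lldb-server-20`, which matches the failure mode described in the commit message; passing the already-resolved program path avoids depending on the working directory.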
[llvm-branch-commits] [llvm] llvm-reduce: Fix losing fast math flags in operands-to-args (PR #133421)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/133421

From a59ef1fe4845b29caf23bf27a2ad1343bc94d188 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Fri, 28 Mar 2025 18:00:05 +0700
Subject: [PATCH] llvm-reduce: Fix losing fast math flags in operands-to-args

---
 .../operands-to-args-preserve-fmf.ll | 20 +++
 .../deltas/ReduceOperandsToArgs.cpp | 4
 2 files changed, 24 insertions(+)
 create mode 100644 llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll

diff --git a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
new file mode 100644
index 0..b4b19ca28dbb5
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
@@ -0,0 +1,20 @@
+; RUN: llvm-reduce %s -o %t --abort-on-invalid-reduction --delta-passes=operands-to-args --test FileCheck --test-arg %s --test-arg --check-prefix=INTERESTING --test-arg --input-file
+; RUN: FileCheck %s --input-file %t --check-prefix=REDUCED
+
+; INTERESTING-LABEL: define float @callee(
+; INTERESTING: fadd float
+define float @callee(float %a) {
+  %x = fadd float %a, 1.0
+  ret float %x
+}
+
+; INTERESTING-LABEL: define float @caller(
+; INTERESTING: load float
+
+; REDUCED-LABEL: define float @caller(ptr %ptr, float %val, float %callee.ret1) {
+; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.00e+00)
+define float @caller(ptr %ptr) {
+  %val = load float, ptr %ptr
+  %callee.ret = call nnan nsz float @callee(float %val)
+  ret float %callee.ret
+}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
index 037ff15fae0f6..e7ad52eb65a5d 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
@@ -14,6 +14,7 @@
 #include "llvm/IR/InstIterator.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/Operator.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Cloning.h"

@@ -107,6 +108,9 @@ static void replaceFunctionCalls(Function *OldF, Function *NewF) {
     NewCI->setCallingConv(NewF->getCallingConv());
     NewCI->setAttributes(CI->getAttributes());

+    if (auto *FPOp = dyn_cast(NewCI))
+      NewCI->setFastMathFlags(CI->getFastMathFlags());
+
     // Do the replacement for this use.
     if (!CI->use_empty())
       CI->replaceAllUsesWith(NewCI);
[llvm-branch-commits] [compiler-rt] [ctxprof][nfc] Move 2 implementation functions up in `CtxInstrProfiling.cpp` (PR #133146)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/133146 >From 5579f73a4ad3d8205608eecde962257077578685 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Wed, 26 Mar 2025 10:10:43 -0700 Subject: [PATCH] [ctxprof][nfc] Move 2 implementation functions up in `CtxInstrProfiling.cpp` --- .../lib/ctx_profile/CtxInstrProfiling.cpp | 66 +-- 1 file changed, 33 insertions(+), 33 deletions(-) diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp index b0e63a8861d86..da291e0bbabdd 100644 --- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp +++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp @@ -244,6 +244,39 @@ ContextNode *getFlatProfile(FunctionData &Data, GUID Guid, return Data.FlatCtx; } +// This should be called once for a Root. Allocate the first arena, set up the +// first context. +void setupContext(ContextRoot *Root, GUID Guid, uint32_t NumCounters, + uint32_t NumCallsites) { + __sanitizer::GenericScopedLock<__sanitizer::SpinMutex> Lock( + &AllContextsMutex); + // Re-check - we got here without having had taken a lock. 
+ if (Root->FirstMemBlock) +return; + const auto Needed = ContextNode::getAllocSize(NumCounters, NumCallsites); + auto *M = Arena::allocateNewArena(getArenaAllocSize(Needed)); + Root->FirstMemBlock = M; + Root->CurrentMem = M; + Root->FirstNode = allocContextNode(M->tryBumpAllocate(Needed), Guid, + NumCounters, NumCallsites); + AllContextRoots.PushBack(Root); +} + +ContextRoot *FunctionData::getOrAllocateContextRoot() { + auto *Root = CtxRoot; + if (Root) +return Root; + __sanitizer::GenericScopedLock<__sanitizer::StaticSpinMutex> L(&Mutex); + Root = CtxRoot; + if (!Root) { +Root = new (__sanitizer::InternalAlloc(sizeof(ContextRoot))) ContextRoot(); +CtxRoot = Root; + } + + assert(Root); + return Root; +} + ContextNode *getUnhandledContext(FunctionData &Data, GUID Guid, uint32_t NumCounters) { @@ -333,39 +366,6 @@ ContextNode *__llvm_ctx_profile_get_context(FunctionData *Data, void *Callee, return Ret; } -// This should be called once for a Root. Allocate the first arena, set up the -// first context. -void setupContext(ContextRoot *Root, GUID Guid, uint32_t NumCounters, - uint32_t NumCallsites) { - __sanitizer::GenericScopedLock<__sanitizer::SpinMutex> Lock( - &AllContextsMutex); - // Re-check - we got here without having had taken a lock. 
- if (Root->FirstMemBlock) -return; - const auto Needed = ContextNode::getAllocSize(NumCounters, NumCallsites); - auto *M = Arena::allocateNewArena(getArenaAllocSize(Needed)); - Root->FirstMemBlock = M; - Root->CurrentMem = M; - Root->FirstNode = allocContextNode(M->tryBumpAllocate(Needed), Guid, - NumCounters, NumCallsites); - AllContextRoots.PushBack(Root); -} - -ContextRoot *FunctionData::getOrAllocateContextRoot() { - auto *Root = CtxRoot; - if (Root) -return Root; - __sanitizer::GenericScopedLock<__sanitizer::StaticSpinMutex> L(&Mutex); - Root = CtxRoot; - if (!Root) { -Root = new (__sanitizer::InternalAlloc(sizeof(ContextRoot))) ContextRoot(); -CtxRoot = Root; - } - - assert(Root); - return Root; -} - ContextNode *__llvm_ctx_profile_start_context( FunctionData *FData, GUID Guid, uint32_t Counters, uint32_t Callsites) SANITIZER_NO_THREAD_SAFETY_ANALYSIS { ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
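The "re-check" comments in the two moved functions describe a double-checked pattern: test without the lock, take the lock, test again, then allocate. A generic sketch with standard library types follows. It uses `std::atomic` and `std::mutex` rather than the sanitizer primitives in the patch, and `Root`, `Holder`, and `getOrAllocate` are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <mutex>

struct Root {
  int Id = 0;
};

// Hypothetical owner of a lazily created Root.
class Holder {
  std::atomic<Root *> Ptr{nullptr};
  std::mutex M;

public:
  // Returns the single Root, allocating it on first use.
  Root *getOrAllocate() {
    // Fast path: skip the lock if the root already exists.
    Root *R = Ptr.load(std::memory_order_acquire);
    if (R)
      return R;
    std::lock_guard<std::mutex> Lock(M);
    // Re-check: another thread may have allocated while we waited.
    R = Ptr.load(std::memory_order_relaxed);
    if (!R) {
      R = new Root();
      Ptr.store(R, std::memory_order_release);
    }
    return R;
  }

  ~Holder() { delete Ptr.load(std::memory_order_relaxed); }
};
```

The second check under the lock is what makes the unlocked first check safe: at most one thread allocates, and later callers return the same pointer.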
[llvm-branch-commits] compiler-rt: Introduce runtime functions for emulated PAC. (PR #133530)
@@ -0,0 +1,7343 @@
+/*
+ * xxHash - Extremely Fast Hash algorithm
+ * Header File
+ * Copyright (C) 2012-2023 Yann Collet
+ *
+ * BSD 2-Clause License (https://www.opensource.org/licenses/bsd-license.php)

kbeyls wrote:

This is a license different from Apache-2.0 WITH LLVM-exception. Therefore, the process described at https://llvm.org/docs/DeveloperPolicy.html#copyright-license-and-patents should be followed to check whether this is acceptable in this specific case.

That being said, xxhash is already present under [llvm/lib/Support/xxhash.cpp](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/lib/Support/xxhash.cpp#L163), as you pointed out in the [RFC](https://discourse.llvm.org/t/rfc-emulated-pac/85557). Making sure we don't have multiple copies of non-Apache-2.0 WITH LLVM-exception code would be preferable. I'll tag @beanz, as he had ideas about how to better structure vendored third party code in LLVM.

I'm not sure if moving non-Apache-2.0 WITH LLVM-exception licensed code to a run-time library (for the first time?) triggers new concerns.

I'll note that other hashing algorithms, such as [Blake3](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/include/llvm/Support/BLAKE3.h#L1) and [SipHash](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/lib/Support/SipHash.cpp#L1), which are available under the Apache-2.0 WITH LLVM-exception, are also already present in the LLVM Support library. Would one of these preferably-licensed hashing algorithms be a good fit for the hashing functionality needed for this use case?

https://github.com/llvm/llvm-project/pull/133530