[llvm-branch-commits] [compiler-rt] release/20.x: XFAIL malloc_zone.cpp for darwin/lsan (#131234) (PR #133006)

2025-04-05 Thread Mariusz Borsa via llvm-branch-commits

wrotki wrote:

Closing this one as it's a bit messy. Opened a new, cleaned-up PR:
https://github.com/llvm/llvm-project/pull/133832/files

https://github.com/llvm/llvm-project/pull/133006
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Clang][CodeGen] Promote in complex compound divassign (PR #131453)

2025-04-05 Thread Mészáros Gergely via llvm-branch-commits

https://github.com/Maetveis updated 
https://github.com/llvm/llvm-project/pull/131453

From 9d50aa09e1f06ec145715896173750414ec75c0d Mon Sep 17 00:00:00 2001
From: Gergely Meszaros 
Date: Sat, 15 Mar 2025 12:53:32 +0100
Subject: [PATCH] [Clang][CodeGen] Promote in complex compound divassign

When `-fcomplex-arithmetic=promoted` is set, complex divassign (`/=`) should
promote to a wider type the same way division without assignment does.
Prior to this change, Smith's algorithm was used for divassign.

Fixes: https://github.com/llvm/llvm-project/issues/131129
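
As a rough illustration of the two strategies named above (not the code Clang
emits), the following sketch contrasts Smith's algorithm with division in a
promoted type. The function names are hypothetical, and widening `float` to
`double` is an assumption chosen to match the `fpext float ... to double`
lines in the updated test.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Smith's algorithm: scale by the ratio of the divisor's components so the
// intermediates stay in range without needing a wider type.
std::complex<float> divSmith(std::complex<float> a, std::complex<float> b) {
  float ar = a.real(), ai = a.imag(), br = b.real(), bi = b.imag();
  if (std::fabs(br) >= std::fabs(bi)) {
    float r = bi / br;
    float d = br + r * bi;
    return {(ar + ai * r) / d, (ai - ar * r) / d};
  }
  float r = br / bi;
  float d = bi + r * br;
  return {(ar * r + ai) / d, (ai * r - ar) / d};
}

// Promoted division: widen to double, apply the textbook formula, truncate.
std::complex<float> divPromoted(std::complex<float> a, std::complex<float> b) {
  double ar = a.real(), ai = a.imag(), br = b.real(), bi = b.imag();
  double d = br * br + bi * bi;
  assert(d != 0.0 && "this sketch does not model division by zero");
  return {static_cast<float>((ar * br + ai * bi) / d),
          static_cast<float>((ai * br - ar * bi) / d)};
}

// Hypothetical stand-in for `a /= b` once the compound form also promotes.
void divAssignPromoted(std::complex<float> &a, std::complex<float> b) {
  a = divPromoted(a, b);
}
```

Both functions agree on well-scaled inputs; the promoted form relies on the
wider type, rather than on rescaling, to avoid overflow in the intermediates.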
---
 clang/lib/CodeGen/CGExprComplex.cpp   |  13 +-
 clang/test/CodeGen/cx-complex-range.c | 534 ++
 2 files changed, 221 insertions(+), 326 deletions(-)

diff --git a/clang/lib/CodeGen/CGExprComplex.cpp b/clang/lib/CodeGen/CGExprComplex.cpp
index 34f40feac7958..a7c8b96da6853 100644
--- a/clang/lib/CodeGen/CGExprComplex.cpp
+++ b/clang/lib/CodeGen/CGExprComplex.cpp
@@ -1214,13 +1214,16 @@ EmitCompoundAssignLValue(const CompoundAssignOperator *E,
   OpInfo.FPFeatures = E->getFPFeaturesInEffect(CGF.getLangOpts());
   CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, OpInfo.FPFeatures);
 
+  const bool IsComplexDivisor = E->getOpcode() == BO_DivAssign &&
+E->getRHS()->getType()->isAnyComplexType();
+
   // Load the RHS and LHS operands.
   // __block variables need to have the rhs evaluated first, plus this should
   // improve codegen a little.
   QualType PromotionTypeCR;
-  PromotionTypeCR = getPromotionType(E->getStoredFPFeaturesOrDefault(),
- E->getComputationResultType(),
- /*IsComplexDivisor=*/false);
+  PromotionTypeCR =
+  getPromotionType(E->getStoredFPFeaturesOrDefault(),
+   E->getComputationResultType(), IsComplexDivisor);
   if (PromotionTypeCR.isNull())
 PromotionTypeCR = E->getComputationResultType();
   OpInfo.Ty = PromotionTypeCR;
@@ -1228,7 +1231,7 @@ EmitCompoundAssignLValue(const CompoundAssignOperator *E,
   OpInfo.Ty->castAs()->getElementType();
   QualType PromotionTypeRHS =
   getPromotionType(E->getStoredFPFeaturesOrDefault(),
-   E->getRHS()->getType(), /*IsComplexDivisor=*/false);
+   E->getRHS()->getType(), IsComplexDivisor);
 
   // The RHS should have been converted to the computation type.
   if (E->getRHS()->getType()->isRealFloatingType()) {
@@ -1258,7 +1261,7 @@ EmitCompoundAssignLValue(const CompoundAssignOperator *E,
   SourceLocation Loc = E->getExprLoc();
   QualType PromotionTypeLHS =
   getPromotionType(E->getStoredFPFeaturesOrDefault(),
-   E->getComputationLHSType(), /*IsComplexDivisor=*/false);
+   E->getComputationLHSType(), IsComplexDivisor);
   if (LHSTy->isAnyComplexType()) {
 ComplexPairTy LHSVal = EmitLoadOfLValue(LHS, Loc);
 if (!PromotionTypeLHS.isNull())
diff --git a/clang/test/CodeGen/cx-complex-range.c b/clang/test/CodeGen/cx-complex-range.c
index 06a349fbc2a47..a724e1ca8cb6d 100644
--- a/clang/test/CodeGen/cx-complex-range.c
+++ b/clang/test/CodeGen/cx-complex-range.c
@@ -721,44 +721,32 @@ _Complex float divf(_Complex float a, _Complex float b) {
 // PRMTD-NEXT:[[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4
 // PRMTD-NEXT:[[B_IMAGP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[B]], i32 0, i32 1
 // PRMTD-NEXT:[[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4
+// PRMTD-NEXT:[[EXT:%.*]] = fpext float [[B_REAL]] to double
+// PRMTD-NEXT:[[EXT1:%.*]] = fpext float [[B_IMAG]] to double
 // PRMTD-NEXT:[[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8
 // PRMTD-NEXT:[[DOTREALP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[TMP0]], i32 0, i32 0
 // PRMTD-NEXT:[[DOTREAL:%.*]] = load float, ptr [[DOTREALP]], align 4
 // PRMTD-NEXT:[[DOTIMAGP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[TMP0]], i32 0, i32 1
 // PRMTD-NEXT:[[DOTIMAG:%.*]] = load float, ptr [[DOTIMAGP]], align 4
-// PRMTD-NEXT:[[TMP1:%.*]] = call float @llvm.fabs.f32(float [[B_REAL]])
-// PRMTD-NEXT:[[TMP2:%.*]] = call float @llvm.fabs.f32(float [[B_IMAG]])
-// PRMTD-NEXT:[[ABS_CMP:%.*]] = fcmp ugt float [[TMP1]], [[TMP2]]
-// PRMTD-NEXT:br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
-// PRMTD:   abs_rhsr_greater_or_equal_abs_rhsi:
-// PRMTD-NEXT:[[TMP3:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]]
-// PRMTD-NEXT:[[TMP4:%.*]] = fmul float [[TMP3]], [[B_IMAG]]
-// PRMTD-NEXT:[[TMP5:%.*]] = fadd float [[B_REAL]], [[TMP4]]
-// PRMTD-NEXT:[[TMP6:%.*]] = fmul float [[DOTIMAG]], [[TMP3]]
-// PRMTD-NEXT:[[TMP7:%.*]] = fadd float [[DOTREAL]], [[TMP6]]
-// PRMTD-NEXT:[[TMP8:%.*]] = fdiv float [[TMP7]], [[TMP5]]
-// PRMTD-NEXT:[[TMP9:%.*]] = fmul float [[DOTREAL]], [[TMP3]]
-// PRMTD-NEXT:[[TMP10:

[llvm-branch-commits] [llvm] llvm-reduce: Fix losing fast math flags in operands-to-args (PR #133421)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/133421

From 02186b904f0aefc91d83431f1de4c08f5c11909f Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 28 Mar 2025 18:00:05 +0700
Subject: [PATCH] llvm-reduce: Fix losing fast math flags in operands-to-args

---
 .../operands-to-args-preserve-fmf.ll  | 20 +++
 .../deltas/ReduceOperandsToArgs.cpp   |  4 
 2 files changed, 24 insertions(+)
 create mode 100644 llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll

diff --git a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
new file mode 100644
index 0..b4b19ca28dbb5
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
@@ -0,0 +1,20 @@
+; RUN: llvm-reduce %s -o %t --abort-on-invalid-reduction --delta-passes=operands-to-args --test FileCheck --test-arg %s --test-arg --check-prefix=INTERESTING --test-arg --input-file
+; RUN: FileCheck %s --input-file %t --check-prefix=REDUCED
+
+; INTERESTING-LABEL: define float @callee(
+; INTERESTING: fadd float
+define float @callee(float %a) {
+  %x = fadd float %a, 1.0
+  ret float %x
+}
+
+; INTERESTING-LABEL: define float @caller(
+; INTERESTING: load float
+
+; REDUCED-LABEL: define float @caller(ptr %ptr, float %val, float %callee.ret1) {
+; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.00e+00)
+define float @caller(ptr %ptr) {
+  %val = load float, ptr %ptr
+  %callee.ret = call nnan nsz float @callee(float %val)
+  ret float %callee.ret
+}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
index b9e07f2c9f63c..e1c1c9c7372f9 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
@@ -14,6 +14,7 @@
 #include "llvm/IR/InstIterator.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/Operator.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Cloning.h"
 
@@ -107,6 +108,9 @@ static void replaceFunctionCalls(Function *OldF, Function *NewF) {
 NewCI->setCallingConv(NewF->getCallingConv());
 NewCI->setAttributes(CI->getAttributes());
 
+if (auto *FPOp = dyn_cast(NewCI))
+  NewCI->setFastMathFlags(CI->getFastMathFlags());
+
 // Do the replacement for this use.
 if (!CI->use_empty())
   CI->replaceAllUsesWith(NewCI);



[llvm-branch-commits] [llvm] [Metadata] Preserve MD_prof when merging instructions when one is missing. (PR #132433)

2025-04-05 Thread Teresa Johnson via llvm-branch-commits

https://github.com/teresajohnson approved this pull request.


https://github.com/llvm/llvm-project/pull/132433


[llvm-branch-commits] [libcxx] libcxx: In gdb test detect execute_mi with feature check instead of version check. (PR #132291)

2025-04-05 Thread Peter Collingbourne via llvm-branch-commits

https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/132291

From 89ce369ab9b49b8c23a87ad0a888002dd85c094c Mon Sep 17 00:00:00 2001
From: Peter Collingbourne 
Date: Thu, 20 Mar 2025 15:12:39 -0700
Subject: [PATCH] Format

Created using spr 1.3.6-beta.1
---
 libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
index 630b90c9d77a6..927f8958f4b43 100644
--- a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
+++ b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
@@ -30,7 +30,8 @@
 # we exit.
 has_run_tests = False
 
-has_execute_mi = 'execute_mi' in gdb.__dict__
+has_execute_mi = "execute_mi" in gdb.__dict__
+
 
 class CheckResult(gdb.Command):
 def __init__(self):



[llvm-branch-commits] [lldb] release/20.x: [lldb] Respect LaunchInfo::SetExecutable in ProcessLauncherPosixFork (#133093) (PR #134079)

2025-04-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/134079


[llvm-branch-commits] [llvm] llvm-reduce: Fix introducing unreachable code in simplify conditionals (PR #133842)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite.

* **#133842** 👈
* **#133841**
* `main`

This stack of pull requests is managed by Graphite.


https://github.com/llvm/llvm-project/pull/133842


[llvm-branch-commits] [llvm] release/20.x: [LoongArch][MC] Add relocation support for fld fst [x]vld [x]vst (PR #133836)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-loongarch

Author: None (llvmbot)


Changes

Backport 725a7b664b92cd2e884806de5a08900b43d43cce 
d055e58334a91dcbaee22eb87bcdae85a1f33cd4

Requested by: @SixWeining

---
Full diff: https://github.com/llvm/llvm-project/pull/133836.diff


6 Files Affected:

- (modified) llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td (+2-2) 
- (modified) llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td (+2-2) 
- (modified) llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td (+2-2) 
- (modified) llvm/test/MC/LoongArch/Relocations/relocations.s (+30) 
- (modified) llvm/test/MC/LoongArch/lasx/invalid-imm.s (+6-6) 
- (modified) llvm/test/MC/LoongArch/lsx/invalid-imm.s (+6-6) 


``diff
diff --git a/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td b/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td
index f66f620ca8b26..ce42236895c76 100644
--- a/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td
+++ b/llvm/lib/Target/LoongArch/LoongArchFloatInstrFormats.td
@@ -206,7 +206,7 @@ class FP_LOAD_3R op, RegisterClass rc = FPR32>
 : FPFmtMEM;
 class FP_LOAD_2RI12 op, RegisterClass rc = FPR32>
-: FPFmt2RI12;
 } // hasSideEffects = 0, mayLoad = 1, mayStore = 0
 
@@ -215,7 +215,7 @@ class FP_STORE_3R op, RegisterClass rc = FPR32>
 : FPFmtMEM;
 class FP_STORE_2RI12 op, RegisterClass rc = FPR32>
-: FPFmt2RI12;
 } // hasSideEffects = 0, mayLoad = 0, mayStore = 1
 
diff --git a/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
index 24b5ed5a9344f..7022fddf34100 100644
--- a/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+++ b/llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
@@ -186,10 +186,10 @@ class LASX2RI10_Load op, Operand ImmOpnd = simm10_lsl2>
 class LASX2RI11_Load op, Operand ImmOpnd = simm11_lsl1>
 : Fmt2RI11_XRI;
-class LASX2RI12_Load op, Operand ImmOpnd = simm12>
+class LASX2RI12_Load op, Operand ImmOpnd = simm12_addlike>
 : Fmt2RI12_XRI;
-class LASX2RI12_Store op, Operand ImmOpnd = simm12>
+class LASX2RI12_Store op, Operand ImmOpnd = simm12_addlike>
 : Fmt2RI12_XRI;
 
diff --git a/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
index d2063a8aaae9b..e37de4f545a2a 100644
--- a/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+++ b/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
@@ -374,10 +374,10 @@ class LSX2RI10_Load op, Operand ImmOpnd = simm10_lsl2>
 class LSX2RI11_Load op, Operand ImmOpnd = simm11_lsl1>
 : Fmt2RI11_VRI;
-class LSX2RI12_Load op, Operand ImmOpnd = simm12>
+class LSX2RI12_Load op, Operand ImmOpnd = simm12_addlike>
 : Fmt2RI12_VRI;
-class LSX2RI12_Store op, Operand ImmOpnd = simm12>
+class LSX2RI12_Store op, Operand ImmOpnd = simm12_addlike>
 : Fmt2RI12_VRI;
 
diff --git a/llvm/test/MC/LoongArch/Relocations/relocations.s b/llvm/test/MC/LoongArch/Relocations/relocations.s
index 091dce200b7de..f91a941295d9e 100644
--- a/llvm/test/MC/LoongArch/Relocations/relocations.s
+++ b/llvm/test/MC/LoongArch/Relocations/relocations.s
@@ -308,3 +308,33 @@ pcaddi $t1, %desc_pcrel_20(foo)
 # RELOC: R_LARCH_TLS_DESC_PCREL20_S2 foo 0x0
 # INSTR: pcaddi $t1, %desc_pcrel_20(foo)
 # FIXUP: fixup A - offset: 0, value: %desc_pcrel_20(foo), kind: FK_NONE
+
+fld.s $ft1, $a0, %pc_lo12(foo)
+# RELOC: R_LARCH_PCALA_LO12 foo 0x0
+# INSTR: fld.s $ft1, $a0, %pc_lo12(foo)
+# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE
+
+fst.d $ft1, $a0, %pc_lo12(foo)
+# RELOC: R_LARCH_PCALA_LO12 foo 0x0
+# INSTR: fst.d $ft1, $a0, %pc_lo12(foo)
+# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE
+
+vld $vr9, $a0, %pc_lo12(foo)
+# RELOC: R_LARCH_PCALA_LO12 foo 0x0
+# INSTR: vld $vr9, $a0, %pc_lo12(foo)
+# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE
+
+vst $vr9, $a0, %pc_lo12(foo)
+# RELOC: R_LARCH_PCALA_LO12 foo 0x0
+# INSTR: vst $vr9, $a0, %pc_lo12(foo)
+# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE
+
+xvld $xr9, $a0, %pc_lo12(foo)
+# RELOC: R_LARCH_PCALA_LO12 foo 0x0
+# INSTR: xvld $xr9, $a0, %pc_lo12(foo)
+# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE
+
+xvst $xr9, $a0, %pc_lo12(foo)
+# RELOC: R_LARCH_PCALA_LO12 foo 0x0
+# INSTR: xvst $xr9, $a0, %pc_lo12(foo)
+# FIXUP: fixup A - offset: 0, value: %pc_lo12(foo), kind: FK_NONE
diff --git a/llvm/test/MC/LoongArch/lasx/invalid-imm.s b/llvm/test/MC/LoongArch/lasx/invalid-imm.s
index 6f64a6f87802b..adfd35367d7ba 100644
--- a/llvm/test/MC/LoongArch/lasx/invalid-imm.s
+++ b/llvm/test/MC/LoongArch/lasx/invalid-imm.s
@@ -1167,22 +1167,22 @@ xvldrepl.h $xr0, $a0, 2048
 
 ## simm12
 xvldrepl.b $xr0, $a0, -2049
-# CHECK: :[[#@LINE-1]]:23: error: immediate must be an integer in the range [-2048, 2047]
+# CHECK: :[[#@LINE-1]]:23: error: operand must be a symbol with modifier (e.g. %pc_lo12) or an integer in the range [-2048, 2047]
 
 xvldrepl.b $xr0, $a0, 2048
-# CHECK: :[[#@LINE-1]]

[llvm-branch-commits] AArch64: Relax x16/x17 constraint on AUT in certain cases. (PR #132857)

2025-04-05 Thread Peter Collingbourne via llvm-branch-commits

https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/132857




[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: reformulate the state for data-flow analysis (PR #131898)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/131898

From da27c6c3ddaf09a97fff98365b457eb1e86828b0 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 17 Mar 2025 22:27:53 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: reformulate the state for
 data-flow analysis

In preparation for implementing support for detection of non-protected
call instructions, refine the definition of state which is computed for
each register by data-flow analysis.

Explicitly marking the registers which are known to be trusted at
function entry is crucial for finding non-protected calls. In addition,
it fixes less-common false negatives for pac-ret, such as `ret x1` in
`f_nonx30_ret_non_auted` test case.
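
To make the reformulated state concrete, here is a minimal sketch of such a
per-register data-flow state. It is an illustration only, not the BOLT
implementation: the type and function names, the register count, and the
two-kind instruction model are all assumptions.

```cpp
#include <bitset>
#include <cassert>
#include <initializer_list>

constexpr unsigned NumRegs = 32; // hypothetical register count

// One bit per register: set when the register is safe to dereference.
struct State {
  std::bitset<NumRegs> SafeToDeref;
};

// Entry state: only registers trusted at function entry start out safe.
State entryState(std::initializer_list<unsigned> TrustedLiveIns) {
  State S;
  for (unsigned R : TrustedLiveIns) {
    assert(R < NumRegs);
    S.SafeToDeref.set(R);
  }
  return S;
}

// Simplified instruction model: authenticate a register or overwrite it.
enum class Kind { Authenticate, Write };
struct Inst {
  Kind K;
  unsigned Reg;
};

// Transfer function: authentication marks a register safe; any other write
// makes it unsafe again.
State transfer(State S, const Inst &I) {
  assert(I.Reg < NumRegs);
  S.SafeToDeref.set(I.Reg, I.K == Kind::Authenticate);
  return S;
}

// At control-flow joins a register is safe only if it is safe on every path.
State merge(const State &A, const State &B) {
  return {A.SafeToDeref & B.SafeToDeref};
}

// A return through a register that is not safe-to-dereference is a gadget.
bool isGadgetReturn(const State &S, unsigned RetReg) {
  assert(RetReg < NumRegs);
  return !S.SafeToDeref.test(RetReg);
}
```

With only register 30 trusted at entry, a return through register 1 is
reported even though register 1 is never written. That is the `ret x1` kind
of false negative the commit message mentions.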
---
 bolt/include/bolt/Core/MCPlusBuilder.h|  10 ++
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |   7 +-
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 129 +++---
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |   4 +
 .../AArch64/gs-pacret-autiasp.s   |  19 ++-
 .../AArch64/gs-pacret-multi-bb.s  |   3 +-
 6 files changed, 104 insertions(+), 68 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index b285138b77fe7..76ea2489e7038 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -551,6 +551,16 @@ class MCPlusBuilder {
 return Analysis->isReturn(Inst);
   }
 
+  /// Returns the registers that are trusted at function entry.
+  ///
+  /// Each register should be treated as if a successfully authenticated
+  /// pointer was written to it before entering the function (i.e. the
+  /// pointer is safe to jump to as well as to be signed).
+  virtual SmallVector getTrustedLiveInRegs() const {
+llvm_unreachable("not implemented");
+return {};
+  }
+
   virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
 return getNoRegister();
diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index f102f1080e2e8..404dde2901767 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -209,13 +209,12 @@ struct Report {
 
 struct GadgetReport : public Report {
   const GadgetKind &Kind;
-  SmallVector AffectedRegisters;
+  SmallVector AffectedRegisters;
   std::vector OverwritingInstrs;
 
   GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-   const BitVector &AffectedRegisters)
-  : Report(Location), Kind(Kind),
-AffectedRegisters(AffectedRegisters.set_bits()) {}
+   MCPhysReg AffectedRegister)
+  : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {}
 
   void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 163e26c68cb9a..93a452b224233 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -126,18 +126,16 @@ class TrackedRegisters {
 
 // The security property that is checked is:
 // When a register is used as the address to jump to in a return instruction,
-// that register must either:
-// (a) never be changed within this function, i.e. have the same value as when
-// the function started, or
+// that register must be safe-to-dereference. It must either
+// (a) be safe-to-dereference at function entry and never be changed within this
+// function, i.e. have the same value as when the function started, or
 // (b) the last write to the register must be by an authentication instruction.
 
 // This property is checked by using dataflow analysis to keep track of which
-// registers have been written (def-ed), since last authenticated. Those are
-// exactly the registers containing values that should not be trusted (as they
-// could have changed since the last time they were authenticated). For pac-ret,
-// any return instruction using such a register is a gadget to be reported. For
-// PAuthABI, probably at least any indirect control flow using such a register
-// should be reported.
+// registers have been written (def-ed), since last authenticated. For pac-ret,
+// any return instruction using a register which is not safe-to-dereference is
+// a gadget to be reported. For PAuthABI, probably at least any indirect control
+// flow using such a register should be reported.
 
 // Furthermore, when producing a diagnostic for a found non-pac-ret protected
 // return, the analysis also lists the last instructions that wrote to the
@@ -156,10 +154,29 @@ class TrackedRegisters {
 //in the gadgets to be reported. This information is used in the second run
 //to also track which instructions last wrote to those registers.
 
+/// A state representing which registers are safe to use by an instruction
+/// at a given program p

[llvm-branch-commits] [libcxxabi] [release/18.x][backport][libc++abi] Use __has_feature check to enable usage of thread_local for exception storage (PR #132241)

2025-04-05 Thread Louis Dionne via llvm-branch-commits

ldionne wrote:

(I'm going to tentatively close this since, as I said, we're no longer
cherry-picking changes back to LLVM 18. Please reopen if you'd like to discuss further.)

https://github.com/llvm/llvm-project/pull/132241


[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-05 Thread Sam Tebbs via llvm-branch-commits


@@ -5026,10 +5026,24 @@ calculateRegisterUsage(VPlan &Plan, ArrayRef VFs,
 // even in the scalar case.
 RegUsage[ClassID] += 1;
   } else {
+// The output from scaled phis and scaled reductions actually have
+// fewer lanes than the VF.
+auto VF = VFs[J];

SamTebbs33 wrote:

Done.

https://github.com/llvm/llvm-project/pull/133090


[llvm-branch-commits] [llvm] llvm-reduce: Reduce global variable code model (PR #133865)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/133865

The current API has no way to unset the code model: the getter returns
an optional, but the setter does not accept one. Alternatively, I could
switch the setter to take an optional as well.
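
The getter/setter asymmetry, and what a dedicated clear operation does, can
be sketched with a small stand-alone class. This is an assumption-laden
illustration rather than the `GlobalVariable` code: the class name, the enum,
the bit layout (`Shift`, `Mask`), and the "stored value 0 means unset"
encoding are all hypothetical.

```cpp
#include <cassert>
#include <optional>

enum class Model : unsigned { Tiny, Small, Kernel, Medium, Large };

// Packs an optional code model into a few bits of a shared data word.
class PackedGlobal {
  static constexpr unsigned Shift = 2;  // hypothetical field position
  static constexpr unsigned Mask = 0x7; // hypothetical 3-bit field
  unsigned Data;

public:
  explicit PackedGlobal(unsigned Initial = 0) : Data(Initial) {}

  // The query returns an optional: a stored 0 means "no code model".
  std::optional<Model> getCodeModel() const {
    unsigned Field = (Data >> Shift) & Mask;
    if (Field == 0)
      return std::nullopt;
    return static_cast<Model>(Field - 1);
  }

  // The setter takes a plain value, so it cannot express "unset".
  void setCodeModel(Model M) {
    unsigned Field = static_cast<unsigned>(M) + 1;
    assert(Field <= Mask && "code model does not fit in the field");
    Data = (Data & ~(Mask << Shift)) | (Field << Shift);
  }

  // Clearing zeroes only the code-model bits and keeps the neighbours.
  void clearCodeModel() { Data &= ~(Mask << Shift); }

  unsigned raw() const { return Data; }
};
```

Making the setter accept `std::optional<Model>` would fold both operations
into one entry point, which is the alternative the description mentions.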

From 0336fe4e9c81d14560478be572b3ab970325552f Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 1 Apr 2025 12:38:18 +0700
Subject: [PATCH] llvm-reduce: Reduce global variable code model

The current API has no way to unset the code model: the getter returns
an optional, but the setter does not accept one. Alternatively, I could
switch the setter to take an optional as well.
---
 llvm/include/llvm/IR/GlobalVariable.h  |  4 
 llvm/lib/IR/Globals.cpp|  9 +
 .../tools/llvm-reduce/reduce-code-model.ll | 18 ++
 .../llvm-reduce/deltas/ReduceGlobalValues.cpp  |  3 ++-
 4 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/tools/llvm-reduce/reduce-code-model.ll

diff --git a/llvm/include/llvm/IR/GlobalVariable.h b/llvm/include/llvm/IR/GlobalVariable.h
index 83e484816d7d4..5ea5d3b11cd9a 100644
--- a/llvm/include/llvm/IR/GlobalVariable.h
+++ b/llvm/include/llvm/IR/GlobalVariable.h
@@ -289,6 +289,10 @@ class GlobalVariable : public GlobalObject, public ilist_node {
   ///
   void setCodeModel(CodeModel::Model CM);
 
+  /// Remove the code model for this global.
+  ///
+  void clearCodeModel();
+
   // Methods for support type inquiry through isa, cast, and dyn_cast:
   static bool classof(const Value *V) {
 return V->getValueID() == Value::GlobalVariableVal;
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 8ca44719a3f94..401f8ac58bce8 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -557,6 +557,15 @@ void GlobalVariable::setCodeModel(CodeModel::Model CM) {
   assert(getCodeModel() == CM && "Code model representation error!");
 }
 
+void GlobalVariable::clearCodeModel() {
+  unsigned CodeModelData = 0;
+  unsigned OldData = getGlobalValueSubClassData();
+  unsigned NewData = (OldData & ~(CodeModelMask << CodeModelShift)) |
+ (CodeModelData << CodeModelShift);
+  setGlobalValueSubClassData(NewData);
+  assert(getCodeModel() == std::nullopt && "Code model representation error!");
+}
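+
 //===----------------------------------------------------------------------===//
 // GlobalAlias Implementation
 //===----------------------------------------------------------------------===//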
diff --git a/llvm/test/tools/llvm-reduce/reduce-code-model.ll b/llvm/test/tools/llvm-reduce/reduce-code-model.ll
new file mode 100644
index 0..898f5995d9826
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/reduce-code-model.ll
@@ -0,0 +1,18 @@
+; RUN: llvm-reduce -abort-on-invalid-reduction --delta-passes=global-values --test FileCheck --test-arg --check-prefix=INTERESTING --test-arg %s --test-arg --input-file %s -o %t.0
+; RUN: FileCheck --implicit-check-not=define --check-prefix=RESULT %s < %t.0
+
+; INTERESTING: @code_model_large_keep = global i32 0, code_model "large", align 4
+; INTERESTING @code_model_large_drop = global i32 0
+
+; RESULT: @code_model_large_keep = global i32 0, code_model "large", align 4{{$}}
+; RESULT: @code_model_large_drop = global i32 0, align 4{{$}}
+@code_model_large_keep = global i32 0, code_model "large", align 4
+@code_model_large_drop = global i32 0, code_model "large", align 4
+
+; INTERESTING: @code_model_tiny_keep = global i32 0, code_model "tiny", align 4
+; INTERESTING @code_model_tiny_drop = global i32 0
+
+; RESULT: @code_model_tiny_keep = global i32 0, code_model "tiny", align 4{{$}}
+; RESULT: @code_model_tiny_drop = global i32 0, align 4{{$}}
+@code_model_tiny_keep = global i32 0, code_model "tiny", align 4
+@code_model_tiny_drop = global i32 0, code_model "tiny", align 4
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp 
b/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp
index e56876c38032e..659bf8dd23eff 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceGlobalValues.cpp
@@ -70,7 +70,8 @@ void llvm::reduceGlobalValuesDeltaPass(Oracle &O, ReducerWorkItem &Program) {
   if (GVar->isExternallyInitialized() && !O.shouldKeep())
 GVar->setExternallyInitialized(false);
 
-  // TODO: Reduce code model
+  if (GVar->getCodeModel() && !O.shouldKeep())
+GVar->clearCodeModel();
 }
   }
 }



[llvm-branch-commits] [libcxx] release/20.x: [libcxx] [test] Fix restoring LLVM_DIR and Clang_DIR (#132838) (PR #133153)

2025-04-05 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/133153

From 44a6f6abbdb6f0eebfaf1ad6f601c29f80782de7 Mon Sep 17 00:00:00 2001
From: Martin Storsjö
Date: Wed, 26 Mar 2025 22:13:28 +0200
Subject: [PATCH] [libcxx] [test] Fix restoring LLVM_DIR and Clang_DIR
 (#132838)

In 664f345cd53d1f624d94f9889a1c9fff803e3391, a fix was introduced,
attempting to restore LLVM_DIR and Clang_DIR after doing
find_package(Clang).

However, 6775285e7695f2d45cf455f5d31b2c9fa9362d3d added a return if the
clangTidy target wasn't found. If this is hit, we don't restore LLVM_DIR
and Clang_DIR, which causes strange effects if CMake is rerun a second
time.

Move the code for restoring LLVM_DIR and Clang_DIR to directly after the
find_package calls, to make sure they are restored, regardless of the
find_package outcome.

(cherry picked from commit 51bceb46f8eeb7c3d060387be315ca41855933c2)
---
 libcxx/test/tools/clang_tidy_checks/CMakeLists.txt | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt b/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt
index 0f8f0e8864d0f..da045fac92ce4 100644
--- a/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt
+++ b/libcxx/test/tools/clang_tidy_checks/CMakeLists.txt
@@ -8,6 +8,10 @@ set(Clang_DIR_SAVE ${Clang_DIR})
 # versions must match. Otherwise there likely will be ODR-violations. This had
 # led to crashes and incorrect output of the clang-tidy based checks.
 find_package(Clang ${CMAKE_CXX_COMPILER_VERSION})
+
+set(LLVM_DIR "${LLVM_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for LLVM." FORCE)
+set(Clang_DIR "${Clang_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for Clang." FORCE)
+
 if(NOT Clang_FOUND)
   message(STATUS "Clang-tidy tests are disabled since the "
  "Clang development package is unavailable.")
@@ -19,9 +23,6 @@ if(NOT TARGET clangTidy)
   return()
 endif()
 
-set(LLVM_DIR "${LLVM_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for LLVM." FORCE)
-set(Clang_DIR "${Clang_DIR_SAVE}" CACHE PATH "The directory containing a CMake configuration file for Clang." FORCE)
-
 message(STATUS "Found system-installed LLVM ${LLVM_PACKAGE_VERSION} with headers in ${LLVM_INCLUDE_DIRS}")
 
 set(CMAKE_CXX_STANDARD 20)



[llvm-branch-commits] [llvm] [CodeGen][StaticDataSplitter]Support constant pool partitioning (PR #129781)

2025-04-05 Thread Mingming Liu via llvm-branch-commits


@@ -0,0 +1,141 @@
+; RUN: llc -mtriple=aarch64 -enable-split-machine-functions \
+; RUN: -partition-static-data-sections=true -function-sections=true \
+; RUN: -unique-section-names=false \
+; RUN: %s -o - 2>&1 | FileCheck %s --dump-input=always
+
+; Repeat the RUN command above for big-endian systems.
+; RUN: llc -mtriple=aarch64_be -enable-split-machine-functions \
+; RUN: -partition-static-data-sections=true -function-sections=true \
+; RUN: -unique-section-names=false \
+; RUN: %s -o - 2>&1 | FileCheck %s --dump-input=always
+
+; Tests that constant pool hotness is aggregated across the module. The
+; static-data-splitter processes data from cold_func first, unprofiled_func
+; secondly, and then hot_func. Specifically, tests that
+; - If a constant is accessed by hot functions, all constant pools for this
+;   constant (e.g., from an unprofiled function, or cold function) should have
+;   `.hot` suffix.
+; - Similarly if a constant is accessed by both cold function and un-profiled
+;   function, constant pools for this constant should not have `.unlikely` suffix.
+
+; CHECK: .section  .rodata.cst8.hot,"aM",@progbits,8
+; CHECK: .LCPI0_0:

mingmingl-llvm wrote:

Yes. Constant pools for the same function are emitted back to back and labels 
are named like `_LCPI_`. Grouped them by function and used `CHECK-NEXT` in 
each group to make the test tighter.
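
The aggregation rule stated in the test comment above can be modelled roughly as follows. This is a sketch only: `Hotness`, `aggregate`, and `sectionSuffix` are hypothetical names for illustration, not the pass's real API, and the model assumes nothing beyond the two rules in that comment.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model; not the actual StaticDataSplitter data structures.
enum class Hotness { Cold = 0, Unprofiled = 1, Hot = 2 };

// A constant is treated as being as hot as its hottest accessor in the module.
inline Hotness aggregate(const std::vector<Hotness> &accessors) {
  assert(!accessors.empty() && "a constant pool entry has at least one user");
  Hotness result = Hotness::Cold;
  for (Hotness h : accessors)
    result = std::max(result, h);
  return result;
}

// Section name suffix implied by the aggregated hotness.
inline std::string sectionSuffix(Hotness h) {
  switch (h) {
  case Hotness::Hot:
    return ".hot";
  case Hotness::Cold:
    return ".unlikely";
  case Hotness::Unprofiled:
    return "";
  }
  return "";
}

// Usage example: cold + unprofiled + hot accessors still yield `.hot`.
inline std::string suffixForMixedAccessors() {
  return sectionSuffix(
      aggregate({Hotness::Cold, Hotness::Unprofiled, Hotness::Hot}));
}
```

Under this model the processing order (cold, unprofiled, hot) does not matter, because only the maximum over all accessors is kept.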

https://github.com/llvm/llvm-project/pull/129781
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds (PR #132353)

2025-04-05 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/132353

From a8155cf5b7847a041be8d4252b20cae01d305404 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Fri, 21 Mar 2025 03:33:02 -0400
Subject: [PATCH] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds

For flat memory instructions where the address is supplied as a base address
register with an immediate offset, the memory aperture test ignores the
immediate offset. Currently, ISel does not respect that, which leads to
miscompilations where valid input programs crash when the address computation
relies on the immediate offset to get the base address in the proper memory
aperture. Global or scratch instructions are not affected.

This patch only selects flat instructions with immediate offsets from address
computations with the inbounds flag: If the address computation does not leave
the bounds of the allocated object, it cannot leave the bounds of the memory
aperture and is therefore safe to handle with an immediate offset.

It also adds the inbounds flag to DAG nodes resulting from transformations:
- Address computations resulting from getObjectPtrOffset. As far as I can tell,
  this function is only used to compute addresses within accessed memory ranges,
  e.g., for loads and stores that are split during legalization.
- Reassociated inbounds adds. If both involved operations are inbounds, then so
  are operations after the transformation.
- Address computations in the SelectionDAG lowering of the memcpy/move/set
  intrinsics. Base and result of the address arithmetic there are accessed, so
  the operation must be inbounds.

It might make sense to separate these changes into their own PR, but I don't
see a way to test them without adding a use of the inbounds SDAG flag.

Affected tests:
- CodeGen/AMDGPU/fold-gep-offset.ll: Offsets are no longer wrongly folded,
  added new positive tests where we still do fold them.
- Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll: Offset folding doesn't
  seem integral to this test, so the test is not changed to make offset folding
  still happen.
- CodeGen/AMDGPU/loop-prefetch-data.ll: loop-reduce prefers to base addresses
  for memory accesses on the potentially OOB addresses used for prefetching;
  that might be a separate issue to look into.
- Added memset tests to CodeGen/AMDGPU/memintrinsic-unroll.ll to make sure that
  offsets in the memset DAG lowering are still folded properly.

A similar patch for GlobalISel will follow.

Fixes SWDEV-516125.
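
A toy model of why the fold is only safe for inbounds address computations, as a sketch under stated assumptions. The single aperture boundary, the `Aperture` enum, and all function names below are hypothetical; they do not correspond to the real AMDGPU address map or to the ISel code in this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical address map with one boundary between two apertures.
constexpr uint64_t kApertureBoundary = 0x1000;

enum class Aperture { Low, High };

constexpr Aperture apertureOf(uint64_t addr) {
  return addr < kApertureBoundary ? Aperture::Low : Aperture::High;
}

// Per the commit message, the aperture test of a flat access looks only at
// the base register and ignores the immediate offset.
constexpr Aperture apertureTestedByFlatAccess(uint64_t base, int64_t /*imm*/) {
  return apertureOf(base);
}

// The aperture the program actually meant: that of the full address.
constexpr Aperture apertureOfFullAddress(uint64_t base, int64_t imm) {
  return apertureOf(base + static_cast<uint64_t>(imm));
}

// Folding the offset into the instruction is harmless only if both agree.
constexpr bool foldIsSafe(uint64_t base, int64_t imm) {
  return apertureTestedByFlatAccess(base, imm) ==
         apertureOfFullAddress(base, imm);
}

// Selection rule described above: fold only inbounds address computations,
// since those cannot leave the object and hence cannot leave the aperture.
constexpr bool mayFoldImmediate(bool isInBounds) { return isInBounds; }

// Usage example: a base just below the boundary plus 16 crosses apertures.
inline void flatOffsetExample() {
  assert(!foldIsSafe(0x0ff8, 16));
  assert(foldIsSafe(0x0100, 16));
}
```

The model has no notion of allocated objects, so `mayFoldImmediate` just forwards the flag; the point is that a non-inbounds add such as `0x0ff8 + 16` is exactly the case where the base and the full address disagree.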
---
 llvm/include/llvm/CodeGen/SelectionDAG.h  |  12 +-
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |   9 +-
 .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp |  12 +-
 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp | 140 ---
 llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll   | 374 +-
 .../test/CodeGen/AMDGPU/loop-prefetch-data.ll |  17 +-
 .../CodeGen/AMDGPU/memintrinsic-unroll.ll | 241 +++
 .../InferAddressSpaces/AMDGPU/flat_atomic.ll  |   6 +-
 8 files changed, 717 insertions(+), 94 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h 
b/llvm/include/llvm/CodeGen/SelectionDAG.h
index 15a2370e5d8b8..aa3668d3e9aae 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAG.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAG.h
@@ -1069,7 +1069,8 @@ class SelectionDAG {
  SDValue EVL);
 
   /// Returns sum of the base pointer and offset.
-  /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap by default.
+  /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap and InBounds 
by
+  /// default.
   SDValue getMemBasePlusOffset(SDValue Base, TypeSize Offset, const SDLoc &DL,
const SDNodeFlags Flags = SDNodeFlags());
   SDValue getMemBasePlusOffset(SDValue Base, SDValue Offset, const SDLoc &DL,
@@ -1077,15 +1078,18 @@ class SelectionDAG {
 
   /// Create an add instruction with appropriate flags when used for
   /// addressing some offset of an object. i.e. if a load is split into 
multiple
-  /// components, create an add nuw from the base pointer to the offset.
+  /// components, create an add nuw inbounds from the base pointer to the
+  /// offset.
   SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, TypeSize Offset) {
-return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap);
+return getMemBasePlusOffset(
+Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds);
   }
 
   SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, SDValue Offset) {
 // The object itself can't wrap around the address space, so it shouldn't 
be
 // possible for the adds of the offsets to the split parts to overflow.
-return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap);
+return getMemBasePlusOffset(
+Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds);
   }
 
   /// Return a new CALLSEQ_START node, that starts new call fram

[llvm-branch-commits] [clang] release/20.x: [clang] Do not infer lifetimebound for functions with void return type (#131997) (PR #133997)

2025-04-05 Thread Aaron Ballman via llvm-branch-commits

https://github.com/AaronBallman approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/133997
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [BPF] Add default cpu change in ReleaseNotes (PR #131691)

2025-04-05 Thread via llvm-branch-commits

https://github.com/yonghong-song updated 
https://github.com/llvm/llvm-project/pull/131691

From 70d891fcda64891e21129d6cc843ffca073fa255 Mon Sep 17 00:00:00 2001
From: Yonghong Song 
Date: Mon, 17 Mar 2025 15:54:25 -0700
Subject: [PATCH] [BPF] Add default cpu change in ReleaseNotes

The pull request [1] changed the bpf default cpu from -mcpu=v1 to
-mcpu=v3 in clang20. Recently in [1], Yuval Deutscher suggested
adding an entry to the clang20 ReleaseNotes so users can easily find
the change in the documentation.

  [1] https://github.com/llvm/llvm-project/pull/107008
---
 clang/docs/ReleaseNotes.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 02292c10e6964..a0e0128bcee2a 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1299,6 +1299,11 @@ AVR Support
 
 - Reject C/C++ compilation for avr1 devices which have no SRAM.
 
+BPF Support
+^^^
+
+- Make ``-mcpu=v3`` as the default.
+
 DWARF Support in Clang
 --
 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/131899

From 56106534f70a4be70f9edea3c6f631e286ac6340 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 18 Mar 2025 21:32:11 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect non-protected indirect
 calls

---
 bolt/include/bolt/Core/MCPlusBuilder.h|  10 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  33 +-
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  42 ++
 .../binary-analysis/AArch64/gs-pauth-calls.s  | 676 ++
 4 files changed, 757 insertions(+), 4 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-calls.s

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h 
b/bolt/include/bolt/Core/MCPlusBuilder.h
index 76ea2489e7038..b3d54ccd5955d 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -577,6 +577,16 @@ class MCPlusBuilder {
 return getNoRegister();
   }
 
+  /// Returns the register used as call destination, or no-register, if not
+  /// an indirect call. Sets IsAuthenticatedInternally if the instruction
+  /// accepts signed pointer as its operand and authenticates it internally.
+  virtual MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+   bool &IsAuthenticatedInternally) const {
+llvm_unreachable("not implemented");
+return getNoRegister();
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index c81a586b02771..b8a0a80215ce2 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -382,11 +382,11 @@ class PacRetAnalysis
 
 public:
   std::vector
-  getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF,
- const ArrayRef UsedDirtyRegs) const {
+  getLastClobberingInsts(const MCInst &Inst, BinaryFunction &BF,
+ const ArrayRef UsedDirtyRegs) {
 if (RegsToTrackInstsFor.empty())
   return {};
-auto MaybeState = getStateAt(Ret);
+auto MaybeState = getStateBefore(Inst);
 if (!MaybeState)
   llvm_unreachable("Expected State to be present");
 const State &S = *MaybeState;
@@ -434,6 +434,29 @@ static std::shared_ptr tryCheckReturn(const 
BinaryContext &BC,
   return std::make_shared(RetKind, Inst, RetReg);
 }
 
+static std::shared_ptr tryCheckCall(const BinaryContext &BC,
+const MCInstReference &Inst,
+const State &S) {
+  static const GadgetKind CallKind("non-protected call found");
+  if (!BC.MIB->isCall(Inst) && !BC.MIB->isBranch(Inst))
+return nullptr;
+
+  bool IsAuthenticated = false;
+  MCPhysReg DestReg = BC.MIB->getRegUsedAsCallDest(Inst, IsAuthenticated);
+  if (IsAuthenticated || DestReg == BC.MIB->getNoRegister())
+return nullptr;
+
+  LLVM_DEBUG({
+traceInst(BC, "Found call inst", Inst);
+traceReg(BC, "Call destination reg", DestReg);
+traceRegMask(BC, "SafeToDerefRegs", S.SafeToDerefRegs);
+  });
+  if (S.SafeToDerefRegs[DestReg])
+return nullptr;
+
+  return std::make_shared(CallKind, Inst, DestReg);
+}
+
 FunctionAnalysisResult
 Analysis::computeDfState(BinaryFunction &BF,
  MCPlusBuilder::AllocatorIdTy AllocatorId) {
@@ -450,10 +473,12 @@ Analysis::computeDfState(BinaryFunction &BF,
   for (BinaryBasicBlock &BB : BF) {
 for (int64_t I = 0, E = BB.size(); I < E; ++I) {
   MCInstReference Inst(&BB, I);
-  const State &S = *PRA.getStateAt(Inst);
+  const State &S = *PRA.getStateBefore(Inst);
 
   if (auto Report = tryCheckReturn(BC, Inst, S))
 Result.Diagnostics.push_back(Report);
+  if (auto Report = tryCheckCall(BC, Inst, S))
+Result.Diagnostics.push_back(Report);
 }
   }
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp 
b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index d238a1df5c7d7..9ce1514639f95 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -277,6 +277,48 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+   bool &IsAuthenticatedInternally) const override {
+assert(isCall(Inst) || isBranch(Inst));
+IsAuthenticatedInternally = false;
+
+switch (Inst.getOpcode()) {
+case AArch64::B:
+case AArch64::BL:
+  assert(Inst.getOperand(0).isExpr());
+  return getNoRegister();
+case AArch64::Bcc:
+case AArch64::CBNZW:
+case AArch64::CBNZX:
+case AArch64::CBZW:
+case AArch64::CBZX:
+  assert(Inst.getOperand(1).isExpr());
+  return getNoRegister();
+case AArch64::TBNZW:
+case AArch64::TBNZX:
+case AArch64::TBZW:
+case AArch64::TBZX:
+  assert(Ins

[llvm-branch-commits] [clang] [Driver][RISCV] Integrate RISCV target in baremetal toolchain object and deprecate RISCVToolchain object.(3/3) (PR #121831)

2025-04-05 Thread Garvit Gupta via llvm-branch-commits

https://github.com/quic-garvgupt edited 
https://github.com/llvm/llvm-project/pull/121831
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [GlobalISel] Combine redundant sext_inreg (PR #131624)

2025-04-05 Thread Pierre van Houtryve via llvm-branch-commits

https://github.com/Pierre-vh updated 
https://github.com/llvm/llvm-project/pull/131624

From 3f3c67934d0c9ea34c11cbd24becc24541baf567 Mon Sep 17 00:00:00 2001
From: pvanhout 
Date: Mon, 17 Mar 2025 13:54:59 +0100
Subject: [PATCH 1/2] [GlobalISel] Combine redundant sext_inreg

---
 .../llvm/CodeGen/GlobalISel/CombinerHelper.h  |   3 +
 .../include/llvm/Target/GlobalISel/Combine.td |   9 +-
 .../GlobalISel/CombinerHelperCasts.cpp|  27 +++
 .../combine-redundant-sext-inreg.mir  | 164 ++
 .../combine-sext-trunc-sextinreg.mir  |  87 ++
 .../CodeGen/AMDGPU/GlobalISel/llvm.abs.ll |   5 -
 6 files changed, 289 insertions(+), 6 deletions(-)
 create mode 100644 
llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir
 create mode 100644 
llvm/test/CodeGen/AMDGPU/GlobalISel/combine-sext-trunc-sextinreg.mir

diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h 
b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
index 9b78342c8fc39..5778377d125a8 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
@@ -994,6 +994,9 @@ class CombinerHelper {
   // overflow sub
   bool matchSuboCarryOut(const MachineInstr &MI, BuildFnTy &MatchInfo) const;
 
+  // (sext_inreg (sext_inreg x, K0), K1)
+  void applyRedundantSextInReg(MachineInstr &Root, MachineInstr &Other) const;
+
 private:
   /// Checks for legality of an indexed variant of \p LdSt.
   bool isIndexedLoadStoreLegal(GLoadStore &LdSt) const;
diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td 
b/llvm/include/llvm/Target/GlobalISel/Combine.td
index 660b03080f92e..6a0ff683a4647 100644
--- a/llvm/include/llvm/Target/GlobalISel/Combine.td
+++ b/llvm/include/llvm/Target/GlobalISel/Combine.td
@@ -1849,6 +1849,12 @@ def anyext_of_anyext : ext_of_ext_opcodes;
 def anyext_of_zext : ext_of_ext_opcodes;
 def anyext_of_sext : ext_of_ext_opcodes;
 
+def sext_inreg_of_sext_inreg : GICombineRule<
+   (defs root:$dst),
+   (match (G_SEXT_INREG $x, $src, $a):$other,
+  (G_SEXT_INREG $dst, $x, $b):$root),
+   (apply [{ Helper.applyRedundantSextInReg(*${root}, *${other}); }])>;
+
 // Push cast through build vector.
 class buildvector_of_opcode : GICombineRule <
   (defs root:$root, build_fn_matchinfo:$matchinfo),
@@ -1896,7 +1902,8 @@ def cast_of_cast_combines: GICombineGroup<[
   sext_of_anyext,
   anyext_of_anyext,
   anyext_of_zext,
-  anyext_of_sext
+  anyext_of_sext,
+  sext_inreg_of_sext_inreg,
 ]>;
 
 def cast_combines: GICombineGroup<[
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp 
b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp
index 576fd5fd81703..883a62c308232 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp
@@ -378,3 +378,30 @@ bool CombinerHelper::matchCastOfInteger(const MachineInstr 
&CastMI,
 return false;
   }
 }
+
+void CombinerHelper::applyRedundantSextInReg(MachineInstr &Root,
+ MachineInstr &Other) const {
+  assert(Root.getOpcode() == TargetOpcode::G_SEXT_INREG &&
+ Other.getOpcode() == TargetOpcode::G_SEXT_INREG);
+
+  unsigned RootWidth = Root.getOperand(2).getImm();
+  unsigned OtherWidth = Other.getOperand(2).getImm();
+
+  Register Dst = Root.getOperand(0).getReg();
+  Register OtherDst = Other.getOperand(0).getReg();
+  Register Src = Other.getOperand(1).getReg();
+
+  if (RootWidth >= OtherWidth) {
+// The root sext_inreg is entirely redundant because the other one
+// is narrower.
+Observer.changingAllUsesOfReg(MRI, Dst);
+MRI.replaceRegWith(Dst, OtherDst);
+Observer.finishedChangingAllUsesOfReg();
+  } else {
+// RootWidth < OtherWidth, rewrite this G_SEXT_INREG with the source of the
+// other G_SEXT_INREG.
+Builder.buildSExtInReg(Dst, Src, RootWidth);
+  }
+
+  Root.eraseFromParent();
+}
diff --git 
a/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir 
b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir
new file mode 100644
index 0..566ee8e6c338d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir
@@ -0,0 +1,164 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 
-run-pass=amdgpu-regbank-combiner -verify-machineinstrs %s -o - | FileCheck %s
+
+---
+name: inreg8_inreg16
+tracksRegLiveness: true
+body: |
+  bb.0:
+liveins: $vgpr0
+; CHECK-LABEL: name: inreg8_inreg16
+; CHECK: liveins: $vgpr0
+; CHECK-NEXT: {{  $}}
+; CHECK-NEXT: %copy:_(s32) = COPY $vgpr0
+; CHECK-NEXT: %inreg:_(s32) = G_SEXT_INREG %copy, 8
+; CHECK-NEXT: $vgpr0 = COPY %inreg(s32)
+%copy:_(s32) = COPY $vgpr0
+%inreg:_(s32) = G_SEXT_INREG %copy, 8
+%inreg1:_(s32) = G_SEXT_INREG %inreg, 16
+$vgpr0 = COPY %inreg1
+...
+
+
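
The identity behind `applyRedundantSextInReg` above can be checked on plain integers. The following is a minimal sketch, assuming 32-bit values; `sextInReg`, `nestedSextInReg`, and `foldedSextInReg` are hypothetical helper names and not part of GlobalISel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Sign-extend the low `width` bits of `x` (1 <= width <= 32), which is what
// G_SEXT_INREG computes on an s32 value.
inline int32_t sextInReg(int32_t x, unsigned width) {
  assert(width >= 1 && width <= 32);
  if (width == 32)
    return x;
  const uint32_t mask = (uint32_t{1} << width) - 1;
  const uint32_t sign = uint32_t{1} << (width - 1);
  const uint32_t low = static_cast<uint32_t>(x) & mask;
  // (low ^ sign) - sign maps the low bits into [-2^(width-1), 2^(width-1) - 1].
  return static_cast<int32_t>(static_cast<int64_t>(low ^ sign) -
                              static_cast<int64_t>(sign));
}

// (sext_inreg (sext_inreg x, otherWidth), rootWidth), evaluated literally.
inline int32_t nestedSextInReg(int32_t x, unsigned otherWidth,
                               unsigned rootWidth) {
  return sextInReg(sextInReg(x, otherWidth), rootWidth);
}

// The combined form: a single sext_inreg with the narrower of the two widths.
inline int32_t foldedSextInReg(int32_t x, unsigned otherWidth,
                               unsigned rootWidth) {
  return sextInReg(x, std::min(otherWidth, rootWidth));
}
```

When the root width is greater than or equal to the inner width the result is the inner `sext_inreg` unchanged, and otherwise it is a `sext_inreg` of the original source with the root width. Both cases reduce to taking the minimum width, which is what `foldedSextInReg` does.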

[llvm-branch-commits] [clang] [Driver] Add option to force undefined symbols during linking in BareMetal toolchain object. (PR #132807)

2025-04-05 Thread Garvit Gupta via llvm-branch-commits


@@ -0,0 +1,15 @@
+// Check the arguments are correctly passed

quic-garvgupt wrote:

baremetal-ld.c is for LTO-related tests, so I am not clobbering it. I have 
renamed the tests to baremetal-undefined-symbols.c

https://github.com/llvm/llvm-project/pull/132807
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxxabi] [release/18.x][backport][libc++abi] Use __has_feature check to enable usage of thread_local for exception storage (PR #132241)

2025-04-05 Thread Louis Dionne via llvm-branch-commits

ldionne wrote:

The LLVM 18 release has been done for a long time, we're working on LLVM 20 now.

https://github.com/llvm/llvm-project/pull/132241
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-04-05 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


```bash
git-clang-format --diff ee0ee253d617aa4cddfe5216f93365645579b54d 
3bdbe711b5f937d564e1883ec94e1c5ecbd87750 --extensions ,cpp,h -- 
clang/test/CodeGen/pfp-attribute-disable.cpp 
clang/test/CodeGen/pfp-load-store.cpp clang/test/CodeGen/pfp-memcpy.cpp 
clang/test/CodeGen/pfp-null-init.cpp clang/test/CodeGen/pfp-struct-gep.cpp 
clang/include/clang/AST/ASTContext.h clang/include/clang/Basic/LangOptions.h 
clang/lib/AST/ASTContext.cpp clang/lib/AST/ExprConstant.cpp 
clang/lib/AST/Type.cpp clang/lib/AST/TypePrinter.cpp 
clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/CGClass.cpp 
clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CGExprAgg.cpp 
clang/lib/CodeGen/CGExprCXX.cpp clang/lib/CodeGen/CGExprConstant.cpp 
clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenFunction.h 
clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenModule.h 
clang/lib/CodeGen/ItaniumCXXABI.cpp clang/lib/CodeGen/MicrosoftCXXABI.cpp 
clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Sema/SemaDeclAttr.cpp 
clang/lib/Sema/SemaExprCXX.cpp clang/test/CodeGenCXX/trivial_abi.cpp 
libcxx/include/__config libcxx/include/__functional/function.h 
libcxx/include/__memory/shared_ptr.h libcxx/include/__memory/unique_ptr.h 
libcxx/include/__tree libcxx/include/__type_traits/is_trivially_relocatable.h 
libcxx/include/__vector/vector.h libcxx/include/typeinfo 
libcxx/test/libcxx/gdb/gdb_pretty_printer_test.sh.cpp 
libcxxabi/include/__cxxabi_config.h libcxxabi/src/private_typeinfo.h 
llvm/include/llvm/Analysis/PtrUseVisitor.h 
llvm/include/llvm/Transforms/Utils/Local.h llvm/lib/Analysis/PtrUseVisitor.cpp 
llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp 
llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp 
llvm/lib/Transforms/Scalar/SROA.cpp llvm/lib/Transforms/Utils/SimplifyCFG.cpp
```





View the diff from clang-format here.


``diff
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 9d824231d0..b4ac45a920 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1299,8 +1299,7 @@ static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value 
*Val,
 /// destination type; in this situation the values of bits which not
 /// present in the src are undefined.
 static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy,
-  llvm::Type *Ty,
-  CodeGenFunction &CGF) {
+  llvm::Type *Ty, CodeGenFunction &CGF) {
   llvm::Type *SrcTy = Src.getElementType();
 
   // If SrcTy and Ty are the same, just do a load.
@@ -1344,7 +1343,8 @@ static llvm::Value *CreateCoercedLoad(Address Src, 
QualType SrcFETy,
   CharUnits Offset = CharUnits::Zero();
   llvm::Value *Val = llvm::UndefValue::get(AT);
   for (unsigned i = 0; i != AT->getNumElements(); ++i, Offset += wordSize)
-Val = CGF.Builder.CreateInsertValue(Val, LoadCoercedField(Offset, ET), 
i);
+Val =
+CGF.Builder.CreateInsertValue(Val, LoadCoercedField(Offset, ET), 
i);
   return Val;
 }
 auto *ST = cast(Ty);
@@ -1426,10 +1426,8 @@ static llvm::Value *CreateCoercedLoad(Address Src, 
QualType SrcFETy,
   return CGF.Builder.CreateLoad(Tmp);
 }
 
-void CodeGenFunction::CreateCoercedStore(llvm::Value *Src,
- QualType SrcFETy,
- Address Dst,
- llvm::TypeSize DstSize,
+void CodeGenFunction::CreateCoercedStore(llvm::Value *Src, QualType SrcFETy,
+ Address Dst, llvm::TypeSize DstSize,
  bool DstIsVolatile) {
   if (!DstSize)
 return;
@@ -4119,8 +4117,7 @@ void CodeGenFunction::EmitFunctionEpilog(const 
CGFunctionInfo &FI,
 
   auto eltAddr = Builder.CreateStructGEP(addr, i);
   llvm::Value *elt = CreateCoercedLoad(
-  eltAddr,
-  RetTy,
+  eltAddr, RetTy,
   unpaddedStruct ? unpaddedStruct->getElementType(unpaddedIndex++)
  : unpaddedCoercionType,
   *this);
@@ -5711,8 +5708,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 if (ABIArgInfo::isPaddingForCoerceAndExpand(eltType)) continue;
 Address eltAddr = Builder.CreateStructGEP(addr, i);
 llvm::Value *elt = CreateCoercedLoad(
-eltAddr,
-I->Ty,
+eltAddr, I->Ty,
 unpaddedStruct ? unpaddedStruct->getElementType(unpaddedIndex++)
: unpaddedCoercionType,
 *this);
diff --git a/clang/lib/CodeGen/CGClass.cpp b/clang/lib/CodeGen/CGClass.cpp
index ae1d78baed..9d3784cf63 100644
--- a/clang/lib/CodeGen/CGClass.cpp
+++ b/clang/lib/CodeGen/CGClass.cpp
@@ -672,7 +67

[llvm-branch-commits] [llvm] release/20.x: [X86][AVX10.2] Include changes for COMX and VGETEXP from rev. 2 (#132824) (PR #132932)

2025-04-05 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/132932
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] [lld][LoongArch] Convert TLS IE to LE in the normal or medium code model (PR #123680)

2025-04-05 Thread Zhaoxin Yang via llvm-branch-commits

https://github.com/ylzsx updated 
https://github.com/llvm/llvm-project/pull/123680

From 0f580567169ffbf1546a5389ab4b9f7d1fc07c71 Mon Sep 17 00:00:00 2001
From: yangzhaoxin 
Date: Thu, 2 Jan 2025 20:58:56 +0800
Subject: [PATCH 1/6] Convert TLS IE to LE in the normal or medium code model.

Original code sequence:
 * pcalau12i $a0, %ie_pc_hi20(sym)
 * ld.d  $a0, $a0, %ie_pc_lo12(sym)

The code sequence converted is as follows:
 * lu12i.w   $a0, %le_hi20(sym)  # le_hi20 != 0, otherwise NOP
 * ori   $a0, $a0, %le_lo12(sym)

FIXME: When relaxation is enabled, redundant NOPs can be removed. This will
be implemented in a future patch.

Note: In the normal or medium code model, original code sequence with
relocations can appear interleaved, because converted code sequence
calculates the absolute offset. However, in extreme code model, to
identify the current code model, the first four instructions with
relocations must appear consecutively.
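
The arithmetic behind the converted `lu12i.w`/`ori` pair can be sketched on plain integers. This is an illustration under assumptions, not lld code: `LePair`, `splitLeOffset`, and `materialize` are hypothetical names, and the offset is assumed to fit in a signed 32-bit value, matching the assert in `tlsIeToLe`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the two rewritten instructions.
struct LePair {
  bool hiIsNop;  // true when the lu12i.w slot becomes a NOP
  uint32_t hi20; // immediate for lu12i.w: bits 31..12 of the offset
  uint32_t lo12; // immediate for ori: bits 11..0 of the offset
};

// Split a thread-pointer-relative offset into the two immediates.
inline LePair splitLeOffset(int64_t offset) {
  assert(offset >= INT32_MIN && offset <= INT32_MAX);
  const uint32_t v = static_cast<uint32_t>(offset);
  LePair p;
  p.hi20 = (v >> 12) & 0xFFFFF;
  p.lo12 = v & 0xFFF;
  // If the offset fits in an unsigned 12-bit immediate, the high part is not
  // needed and ori can take its source from the zero register.
  p.hiIsNop = offset >= 0 && offset < 4096;
  return p;
}

// Value left in the register: lu12i.w writes the sign-extended (hi20 << 12),
// then ori ORs in the zero-extended lo12 (the low 12 bits are still zero).
inline int64_t materialize(const LePair &p) {
  const uint32_t upper = p.hiIsNop ? 0u : (p.hi20 << 12);
  const int64_t extended = (upper & 0x80000000u)
                               ? static_cast<int64_t>(upper) - (int64_t{1} << 32)
                               : static_cast<int64_t>(upper);
  return extended + static_cast<int64_t>(p.lo12);
}
```

For example, an offset of `0x7ff` needs no `lu12i.w`, while `0x12345678` splits into `hi20 = 0x12345` and `lo12 = 0x678`.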
---
 lld/ELF/Arch/LoongArch.cpp | 87 ++
 lld/ELF/Relocations.cpp| 15 ++-
 2 files changed, 101 insertions(+), 1 deletion(-)

diff --git a/lld/ELF/Arch/LoongArch.cpp b/lld/ELF/Arch/LoongArch.cpp
index 4edc625b05cb0..f9a22a7bd5218 100644
--- a/lld/ELF/Arch/LoongArch.cpp
+++ b/lld/ELF/Arch/LoongArch.cpp
@@ -39,7 +39,11 @@ class LoongArch final : public TargetInfo {
   void relocate(uint8_t *loc, const Relocation &rel,
 uint64_t val) const override;
   bool relaxOnce(int pass) const override;
+  void relocateAlloc(InputSectionBase &sec, uint8_t *buf) const override;
   void finalizeRelax(int passes) const override;
+
+private:
+  void tlsIeToLe(uint8_t *loc, const Relocation &rel, uint64_t val) const;
 };
 } // end anonymous namespace
 
@@ -53,6 +57,8 @@ enum Op {
   ADDI_W = 0x0280,
   ADDI_D = 0x02c0,
   ANDI = 0x0340,
+  ORI = 0x0380,
+  LU12I_W = 0x1400,
   PCADDI = 0x1800,
   PCADDU12I = 0x1c00,
   LD_W = 0x2880,
@@ -1002,6 +1008,87 @@ static bool relax(Ctx &ctx, InputSection &sec) {
   return changed;
 }
 
+// Convert TLS IE to LE in the normal or medium code model.
+// Original code sequence:
+//  * pcalau12i $a0, %ie_pc_hi20(sym)
+//  * ld.d  $a0, $a0, %ie_pc_lo12(sym)
+//
+// The code sequence converted is as follows:
+//  * lu12i.w   $a0, %le_hi20(sym)  # le_hi20 != 0, otherwise NOP
+//  * ori $a0   $a0, %le_lo12(sym)
+//
+// When relaxation enables, redundant NOPs can be removed.
+void LoongArch::tlsIeToLe(uint8_t *loc, const Relocation &rel,
+  uint64_t val) const {
+  assert(isInt<32>(val) &&
+ "val exceeds the range of medium code model in tlsIeToLe");
+
+  bool isUInt12 = isUInt<12>(val);
+  const uint32_t currInsn = read32le(loc);
+  switch (rel.type) {
+  case R_LARCH_TLS_IE_PC_HI20:
+if (isUInt12)
+  write32le(loc, insn(ANDI, R_ZERO, R_ZERO, 0)); // nop
+else
+  write32le(loc, insn(LU12I_W, getD5(currInsn), extractBits(val, 31, 12),
+  0)); // lu12i.w $a0, %le_hi20
+break;
+  case R_LARCH_TLS_IE_PC_LO12:
+if (isUInt12)
+  write32le(loc, insn(ORI, getD5(currInsn), R_ZERO,
+  val)); // ori $a0, $r0, %le_lo12
+else
+  write32le(loc, insn(ORI, getD5(currInsn), getJ5(currInsn),
+  lo12(val))); // ori $a0, $a0, %le_lo12
+break;
+  }
+}
+
+void LoongArch::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const {
+  const unsigned bits = ctx.arg.is64 ? 64 : 32;
+  uint64_t secAddr = sec.getOutputSection()->addr;
+  if (auto *s = dyn_cast(&sec))
+secAddr += s->outSecOff;
+  else if (auto *ehIn = dyn_cast(&sec))
+secAddr += ehIn->getParent()->outSecOff;
+  bool isExtreme = false;
+  const MutableArrayRef relocs = sec.relocs();
+  for (size_t i = 0, size = relocs.size(); i != size; ++i) {
+Relocation &rel = relocs[i];
+uint8_t *loc = buf + rel.offset;
+uint64_t val = SignExtend64(
+sec.getRelocTargetVA(ctx, rel, secAddr + rel.offset), bits);
+
+switch (rel.expr) {
+case R_RELAX_HINT:
+  continue;
+case R_RELAX_TLS_IE_TO_LE:
+  if (rel.type == R_LARCH_TLS_IE_PC_HI20) {
+// LoongArch does not support IE to LE optimize in the extreme code
+// model. In this case, the relocs are as follows:
+//
+//  * i   -- R_LARCH_TLS_IE_PC_HI20
+//  * i+1 -- R_LARCH_TLS_IE_PC_LO12
+//  * i+2 -- R_LARCH_TLS_IE64_PC_LO20
+//  * i+3 -- R_LARCH_TLS_IE64_PC_HI12
+isExtreme =
+(i + 2 < size && relocs[i + 2].type == R_LARCH_TLS_IE64_PC_LO20);
+  }
+  if (isExtreme) {
+rel.expr = getRelExpr(rel.type, *rel.sym, loc);
+val = SignExtend64(sec.getRelocTargetVA(ctx, rel, secAddr + 
rel.offset),
+   bits);
+relocateNoSym(loc, rel.type, val);
+  } else
+tlsIeToLe(loc, rel, val);
+  continue;
+default:
+  break;
+}
+relocate(loc, rel, val);
+  }
+}
+
 // Wh

[llvm-branch-commits] [llvm] llvm-reduce: Reduce global variable code model (PR #133865)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/133865?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#133865** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/133865?utm_source=stack-comment-view-in-graphite)
* **#133859** (https://app.graphite.dev/github/pr/llvm/llvm-project/133859?utm_source=stack-comment-icon)
* `main`

This stack of pull requests is managed by Graphite 
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking: 
https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/133865
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 244daaf - Revert "[GlobalOpt] Handle operators separately when removing GV users (#84694)"

2025-04-05 Thread via llvm-branch-commits

Author: Eli Friedman
Date: 2025-03-25T11:17:45-07:00
New Revision: 244daaf51c11c04971f6adb144bbfacf4074b1a8

URL: 
https://github.com/llvm/llvm-project/commit/244daaf51c11c04971f6adb144bbfacf4074b1a8
DIFF: 
https://github.com/llvm/llvm-project/commit/244daaf51c11c04971f6adb144bbfacf4074b1a8.diff

LOG: Revert "[GlobalOpt] Handle operators separately when removing GV users 
(#84694)"

This reverts commit 51dad714e82e3e15c339aade8be605ed09bbabab.

Added: 


Modified: 
llvm/lib/Transforms/IPO/GlobalOpt.cpp
llvm/test/Transforms/GlobalOpt/cleanup-pointer-root-users-gep-constexpr.ll
llvm/test/Transforms/GlobalOpt/dead-store-status.ll
llvm/test/Transforms/GlobalOpt/pr54572.ll

Removed: 




diff  --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp 
b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index 7b7b3802d7a77..2d046f09f1b2b 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -114,6 +114,55 @@ static cl::opt ColdCCRelFreq(
 "entry frequency, for a call site to be considered cold for enabling "
 "coldcc"));
 
+/// Is this global variable possibly used by a leak checker as a root?  If so,
+/// we might not really want to eliminate the stores to it.
+static bool isLeakCheckerRoot(GlobalVariable *GV) {
+  // A global variable is a root if it is a pointer, or could plausibly contain
+  // a pointer.  There are two challenges; one is that we could have a struct
+  // that has an inner member which is a pointer.  We recurse through the type to
+  // detect these (up to a point).  The other is that we may actually be a union
+  // of a pointer and another type, and so our LLVM type is an integer which
+  // gets converted into a pointer, or our type is an [i8 x #] with a pointer
+  // potentially contained here.
+
+  if (GV->hasPrivateLinkage())
+    return false;
+
+  SmallVector<Type *, 4> Types;
+  Types.push_back(GV->getValueType());
+
+  unsigned Limit = 20;
+  do {
+    Type *Ty = Types.pop_back_val();
+    switch (Ty->getTypeID()) {
+      default: break;
+      case Type::PointerTyID:
+        return true;
+      case Type::FixedVectorTyID:
+      case Type::ScalableVectorTyID:
+        if (cast<VectorType>(Ty)->getElementType()->isPointerTy())
+          return true;
+        break;
+      case Type::ArrayTyID:
+        Types.push_back(cast<ArrayType>(Ty)->getElementType());
+        break;
+      case Type::StructTyID: {
+        StructType *STy = cast<StructType>(Ty);
+        if (STy->isOpaque()) return true;
+        for (Type *InnerTy : STy->elements()) {
+          if (isa<PointerType>(InnerTy)) return true;
+          if (isa<StructType>(InnerTy) || isa<ArrayType>(InnerTy) ||
+              isa<VectorType>(InnerTy))
+            Types.push_back(InnerTy);
+        }
+        break;
+      }
+    }
+    if (--Limit == 0) return true;
+  } while (!Types.empty());
+  return false;
+}
+
 /// Given a value that is stored to a global but never read, determine whether
 /// it's safe to remove the store and the chain of computation that feeds the
 /// store.
@@ -122,7 +171,7 @@ static bool IsSafeComputationToRemove(
   do {
 if (isa(V))
   return true;
-if (V->hasNUsesOrMore(1))
+if (!V->hasOneUse())
   return false;
 if (isa(V) || isa(V) || isa(V) ||
 isa(V))
@@ -144,12 +193,90 @@ static bool IsSafeComputationToRemove(
   } while (true);
 }
 
+/// This GV is a pointer root.  Loop over all users of the global and clean up
+/// any that obviously don't assign the global a value that isn't dynamically
+/// allocated.
+static bool
+CleanupPointerRootUsers(GlobalVariable *GV,
+function_ref GetTLI) {
+  // A brief explanation of leak checkers.  The goal is to find bugs where
+  // pointers are forgotten, causing an accumulating growth in memory
+  // usage over time.  The common strategy for leak checkers is to explicitly
+  // allow the memory pointed to by globals at exit.  This is popular because it
+  // also solves another problem where the main thread of a C++ program may shut
+  // down before other threads that are still expecting to use those globals. To
+  // handle that case, we expect the program may create a singleton and never
+  // destroy it.
+
+  bool Changed = false;
+
+  // If Dead[n].first is the only use of a malloc result, we can delete its
+  // chain of computation and the store to the global in Dead[n].second.
+  SmallVector, 32> Dead;
+
+  SmallVector Worklist(GV->users());
+  // Constants can't be pointers to dynamically allocated memory.
+  while (!Worklist.empty()) {
+User *U = Worklist.pop_back_val();
+if (StoreInst *SI = dyn_cast(U)) {
+  Value *V = SI->getValueOperand();
+  if (isa(V)) {
+Changed = true;
+SI->eraseFromParent();
+  } else if (Instruction *I = dyn_cast(V)) {
+if (I->hasOneUse())
+  Dead.push_back(std::make_pair(I, SI));
+  }
+} else if (MemSetInst *MSI = dyn_cast(U)) {
+  if (isa(MSI->getV
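
The bounded worklist walk in `isLeakCheckerRoot` above can be illustrated outside LLVM. The following is only a sketch on a hypothetical miniature type model (`ToyType` and `mayContainPointer` are invented names, not LLVM API); it shows the same idea of answering "might contain a pointer" conservatively once a step budget runs out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature type model; this is NOT LLVM's Type hierarchy.
struct ToyType {
  enum Kind { Int, Pointer, Array, Struct } kind;
  std::vector<const ToyType *> elements; // Array: one entry; Struct: fields
};

// Returns true if the type may hold a pointer. Once the step budget is
// exhausted the function gives up and conservatively returns true.
bool mayContainPointer(const ToyType &root, unsigned limit = 20) {
  std::vector<const ToyType *> worklist{&root};
  do {
    const ToyType *ty = worklist.back();
    worklist.pop_back();
    switch (ty->kind) {
    case ToyType::Pointer:
      return true;
    case ToyType::Array:
    case ToyType::Struct:
      // Queue the nested types instead of recursing.
      for (const ToyType *inner : ty->elements)
        worklist.push_back(inner);
      break;
    case ToyType::Int:
      break;
    }
    if (--limit == 0)
      return true; // too deep to decide; assume it could be a root
  } while (!worklist.empty());
  return false;
}
```

The explicit worklist avoids unbounded recursion on deeply nested aggregates, and the budget trades precision for a predictable cost.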

[llvm-branch-commits] [llvm] llvm-reduce: Defer a shouldKeep call in operand reduction (PR #133387)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/133387

From fa597dd4161693813a3566fd1d4a3c7df1d00746 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 28 Mar 2025 12:58:20 +0700
Subject: [PATCH] llvm-reduce: Defer a shouldKeep call in operand reduction

Ideally shouldKeep is only called in contexts that will successfully
do something.
---
 llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
index b0bca015434fa..8b6446725b7d4 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
@@ -26,8 +26,8 @@ extractOperandsFromModule(Oracle &O, ReducerWorkItem &WorkItem,
     for (auto &I : instructions(&F)) {
       if (PHINode *Phi = dyn_cast<PHINode>(&I)) {
         for (auto &Op : Phi->incoming_values()) {
-          if (!O.shouldKeep()) {
-            if (Value *Reduced = ReduceValue(Op))
+          if (Value *Reduced = ReduceValue(Op)) {
+            if (!O.shouldKeep())
               Phi->setIncomingValueForBlock(Phi->getIncomingBlock(Op), Reduced);
           }
         }
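
The effect of the reordering can be sketched with a toy model. Everything below is hypothetical (`ToyOracle`, `canReduce`, and the two counting helpers are invented for illustration and are not llvm-reduce code); the assumption is simply that each `shouldKeep()` call consumes one query regardless of whether the caller then changes anything.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an oracle: every shouldKeep() call is counted,
// whether or not the caller goes on to modify anything.
struct ToyOracle {
  unsigned queries = 0;
  bool shouldKeep() {
    ++queries;
    return false;
  }
};

// Hypothetical reducibility check: only non-zero operands can be reduced.
bool canReduce(int op) { return op != 0; }

// Old order: ask the oracle first, then find out if a reduction exists.
unsigned queriesEager(std::vector<int> ops) {
  ToyOracle oracle;
  for (int &op : ops)
    if (!oracle.shouldKeep())
      if (canReduce(op))
        op = 0;
  return oracle.queries;
}

// New order: only ask the oracle when a reduction is actually possible.
unsigned queriesDeferred(std::vector<int> ops) {
  ToyOracle oracle;
  for (int &op : ops)
    if (canReduce(op))
      if (!oracle.shouldKeep())
        op = 0;
  return oracle.queries;
}
```

With operands `{0, 5, 0, 7}` the eager order issues four queries while the deferred order issues two, one per operand that can really be reduced.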

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Orlando Cazalet-Hyams (OCHyams)


Changes

Given the same branch condition in `a` and `c` SimplifyCFG converts:

  +> b -+
  | v
  --> a --> c --> e -->
| ^
+> d -+

into:

  +--> bcd ---+
  |   v
  --> a --> c --> e -->

Remap source atoms on instructions duplicated from `c` into `bcd`.
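
As a rough illustration of what remapping atom groups on duplicated instructions means, here is a toy sketch. The names `ToyInst` and `cloneWithFreshAtoms` are hypothetical and this is not the LLVM implementation; it only assumes that group 0 stands for "no atom group" and that every distinct group seen while cloning gets a fresh number.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical instruction model: an opcode string plus an atom group id.
struct ToyInst {
  std::string text;
  unsigned atomGroup; // 0 means "no atom group"
};

// Clone a block. Each distinct non-zero atom group in the source is mapped
// to one fresh group taken from nextGroup, so clones that shared a group in
// the original still share one, but never share with the original.
std::vector<ToyInst> cloneWithFreshAtoms(const std::vector<ToyInst> &block,
                                         unsigned &nextGroup) {
  std::map<unsigned, unsigned> remap;
  std::vector<ToyInst> clone;
  for (const ToyInst &inst : block) {
    ToyInst copy = inst;
    if (inst.atomGroup != 0) {
      auto it = remap.find(inst.atomGroup);
      if (it == remap.end())
        it = remap.emplace(inst.atomGroup, nextGroup++).first;
      copy.atomGroup = it->second;
    }
    clone.push_back(copy);
  }
  return clone;
}
```

This mirrors the expectation in the test below, where the original store keeps `atomGroup: 1` and the duplicated store gets a distinct group.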

---
Full diff: https://github.com/llvm/llvm-project/pull/133484.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+6-6) 
- (added) llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll 
(+62) 


``diff
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 1ba1e4ac81000..c83ff0260e297 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -3589,7 +3589,7 @@ foldCondBranchOnValueKnownInPredecessorImpl(BranchInst *BI, DomTreeUpdater *DTU,
     // instructions into EdgeBB.  We know that there will be no uses of the
     // cloned instructions outside of EdgeBB.
     BasicBlock::iterator InsertPt = EdgeBB->getFirstInsertionPt();
-    DenseMap<Value *, Value *> TranslateMap; // Track translated values.
+    ValueToValueMapTy TranslateMap; // Track translated values.
     TranslateMap[Cond] = CB;
 
     // RemoveDIs: track instructions that we optimise away while folding, so
@@ -3609,11 +3609,11 @@ foldCondBranchOnValueKnownInPredecessorImpl(BranchInst *BI, DomTreeUpdater *DTU,
         N->setName(BBI->getName() + ".c");
 
       // Update operands due to translation.
-      for (Use &Op : N->operands()) {
-        DenseMap<Value *, Value *>::iterator PI = TranslateMap.find(Op);
-        if (PI != TranslateMap.end())
-          Op = PI->second;
-      }
+      // Key Instructions: Remap all the atom groups.
+      if (const DebugLoc &DL = BBI->getDebugLoc())
+        mapAtomInstance(DL, TranslateMap);
+      RemapInstruction(N, TranslateMap,
+                       RF_IgnoreMissingLocals | RF_NoModuleLevelChanges);
 
       // Check for trivial simplification.
       if (Value *V = simplifyInstruction(N, {DL, nullptr, nullptr, AC})) {
diff --git a/llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll b/llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll
new file mode 100644
index 0..f8477600c6418
--- /dev/null
+++ b/llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll
@@ -0,0 +1,62 @@
+; RUN: opt %s -passes=simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S \
+; RUN: | FileCheck %s
+
+;; Generated using:
+;;   opt -passes=debugify --debugify-atoms --debugify-level=locations \
+;;  llvm/test/Transforms/SimplifyCFG/debug-info-thread-phi.ll
+;; With unused/untested metadata nodes removed.
+
+;; Check the duplicated store gets distinct atom info in each branch.
+
+; CHECK-LABEL: @bar(
+; CHECK: if.then:
+; CHECK:   store i32 1{{.*}}, !dbg [[DBG1:!.*]]
+; CHECK: if.end.1.critedge:
+; CHECK:   store i32 1{{.*}}, !dbg [[DBG2:!.*]]
+; CHECK: [[DBG1]] = !DILocation(line: 1{{.*}}, atomGroup: 1
+; CHECK: [[DBG2]] = !DILocation(line: 1{{.*}}, atomGroup: 2
+
+define void @bar(i32 %aa) !dbg !5 {
+entry:
+  %aa.addr = alloca i32, align 4
+  %bb = alloca i32, align 4
+  store i32 %aa, ptr %aa.addr, align 4
+  store i32 0, ptr %bb, align 4
+  %tobool = icmp ne i32 %aa, 0
+  br i1 %tobool, label %if.then, label %if.end
+
+if.then:  ; preds = %entry
+  call void @foo()
+  br label %if.end
+
+if.end:   ; preds = %if.then, %entry
+  store i32 1, ptr %bb, align 4, !dbg !8
+  br i1 %tobool, label %if.then.1, label %if.end.1
+
+if.then.1:; preds = %if.end
+  call void @foo()
+  br label %if.end.1
+
+if.end.1: ; preds = %if.then.1, %if.end
+  store i32 2, ptr %bb, align 4
+  br label %for.end
+
+for.end:  ; preds = %if.end.1
+  ret void
+}
+
+declare void @foo()
+
+!llvm.dbg.cu = !{!0}
+!llvm.debugify = !{!2, !3}
+!llvm.module.flags = !{!4}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
+!1 = !DIFile(filename: "llvm/test/Transforms/SimplifyCFG/debug-info-thread-phi.ll", directory: "/")
+!2 = !{i32 15}
+!3 = !{i32 0}
+!4 = !{i32 2, !"Debug Info Version", i32 3}
+!5 = distinct !DISubprogram(name: "bar", linkageName: "bar", scope: null, file: !1, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!6 = !DISubroutineType(types: !7)
+!7 = !{}
+!8 = !DILocation(line: 1, column: 1, scope: !5, atomGroup: 1, atomRank: 1)

``




https://github.com/llvm/llvm-project/pull/133484

[llvm-branch-commits] [lldb] release/20.x: [lldb] Respect LaunchInfo::SetExecutable in ProcessLauncherPosixFork (#133093) (PR #134079)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-lldb

Author: None (llvmbot)


Changes

Backport 39e7efe1e4304544289d8d1b45f4d04d11b4a791

Requested by: @DavidSpickett

---
Full diff: https://github.com/llvm/llvm-project/pull/134079.diff


2 Files Affected:

- (modified) lldb/source/Host/posix/ProcessLauncherPosixFork.cpp (+6-2) 
- (modified) lldb/unittests/Host/HostTest.cpp (+42-1) 


``diff
diff --git a/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp b/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp
index 7d856954684c4..903b18b10976c 100644
--- a/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp
+++ b/lldb/source/Host/posix/ProcessLauncherPosixFork.cpp
@@ -94,6 +94,7 @@ struct ForkLaunchInfo {
   bool debug;
   bool disable_aslr;
   std::string wd;
+  std::string executable;
   const char **argv;
   Environment::Envp envp;
   std::vector actions;
@@ -194,7 +195,8 @@ struct ForkLaunchInfo {
   }
 
   // Execute.  We should never return...
-  execve(info.argv[0], const_cast(info.argv), info.envp);
+  execve(info.executable.c_str(), const_cast(info.argv),
+ info.envp);
 
 #if defined(__linux__)
   if (errno == ETXTBSY) {
@@ -207,7 +209,8 @@ struct ForkLaunchInfo {
 // Since this state should clear up quickly, wait a while and then give it
 // one more go.
 usleep(5);
-execve(info.argv[0], const_cast(info.argv), info.envp);
+execve(info.executable.c_str(), const_cast(info.argv),
+   info.envp);
   }
 #endif
 
@@ -246,6 +249,7 @@ ForkLaunchInfo::ForkLaunchInfo(const ProcessLaunchInfo &info)
   debug(info.GetFlags().Test(eLaunchFlagDebug)),
   disable_aslr(info.GetFlags().Test(eLaunchFlagDisableASLR)),
   wd(info.GetWorkingDirectory().GetPath()),
+  executable(info.GetExecutableFile().GetPath()),
   argv(info.GetArguments().GetConstArgumentVector()),
   envp(FixupEnvironment(info.GetEnvironment())),
   actions(MakeForkActions(info)) {}
diff --git a/lldb/unittests/Host/HostTest.cpp b/lldb/unittests/Host/HostTest.cpp
index a1d8a3b7f485a..ed1df6de001ea 100644
--- a/lldb/unittests/Host/HostTest.cpp
+++ b/lldb/unittests/Host/HostTest.cpp
@@ -7,12 +7,24 @@
 
//===--===//
 
 #include "lldb/Host/Host.h"
+#include "TestingSupport/SubsystemRAII.h"
+#include "lldb/Host/FileSystem.h"
+#include "lldb/Host/ProcessLaunchInfo.h"
 #include "lldb/Utility/ProcessInfo.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Testing/Support/Error.h"
 #include "gtest/gtest.h"
+#include <future>
 
 using namespace lldb_private;
 using namespace llvm;
 
+// From TestMain.cpp.
+extern const char *TestMainArgv0;
+
+static cl::opt test_arg("test-arg");
+
 TEST(Host, WaitStatusFormat) {
   EXPECT_EQ("W01", formatv("{0:g}", WaitStatus{WaitStatus::Exit, 1}).str());
   EXPECT_EQ("X02", formatv("{0:g}", WaitStatus{WaitStatus::Signal, 2}).str());
@@ -45,4 +57,33 @@ TEST(Host, ProcessInstanceInfoCumulativeSystemTimeIsValid) {
   EXPECT_TRUE(info.CumulativeSystemTimeIsValid());
   info.SetCumulativeSystemTime(ProcessInstanceInfo::timespec{1, 0});
   EXPECT_TRUE(info.CumulativeSystemTimeIsValid());
-}
\ No newline at end of file
+}
+
+TEST(Host, LaunchProcessSetsArgv0) {
+  SubsystemRAII subsystems;
+
+  static constexpr StringLiteral TestArgv0 = "HelloArgv0";
+  if (test_arg != 0) {
+// In subprocess
+if (TestMainArgv0 != TestArgv0) {
+  errs() << formatv("Got '{0}' for argv[0]\n", TestMainArgv0);
+  exit(1);
+}
+exit(0);
+  }
+
+  ProcessLaunchInfo info;
+  info.SetExecutableFile(
+  FileSpec(llvm::sys::fs::getMainExecutable(TestMainArgv0, &test_arg)),
+  /*add_exe_file_as_first_arg=*/false);
+  info.GetArguments().AppendArgument("HelloArgv0");
+  info.GetArguments().AppendArgument(
+  "--gtest_filter=Host.LaunchProcessSetsArgv0");
+  info.GetArguments().AppendArgument("--test-arg=47");
+  std::promise<int> exit_status;
+  info.SetMonitorProcessCallback([&](lldb::pid_t pid, int signal, int status) {
+exit_status.set_value(status);
+  });
+  ASSERT_THAT_ERROR(Host::LaunchProcess(info).takeError(), Succeeded());
+  ASSERT_THAT(exit_status.get_future().get(), 0);
+}

``
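
The underlying point of the patch is that the path handed to `execve` and the `argv[0]` seen by the child are independent. A minimal POSIX sketch of that, assuming `/bin/sh` exists on the host (`runWithArgv` is a hypothetical helper written for this illustration, not lldb code):

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Launches the binary at `path` with the given argument vector; argv[0] does
// not have to name the binary. Returns the child's exit code, or -1 if the
// child could not be created or did not exit normally.
int runWithArgv(const char *path, const char *const argv[]) {
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    // Child: the executable is chosen by `path`, not by argv[0].
    execve(path, const_cast<char *const *>(argv), environ);
    _exit(127); // only reached if execve failed
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status);
}
```

Calling it with `path = "/bin/sh"` and `argv = {"HelloArgv0", "-c", "exit 7", nullptr}` launches the shell even though `argv[0]` is not a valid path. Had `argv[0]` been passed to `execve` instead, as the code did before this patch, the launch would fail and the child would exit with 127.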




https://github.com/llvm/llvm-project/pull/134079


[llvm-branch-commits] [libcxxabi] [release/18.x][backport][libc++abi] Use __has_feature check to enable usage of thread_local for exception storage (PR #132241)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-libcxxabi

Author: Bushev Dmitry (dybv-sc)


Changes

This is a backport of original commit #97591 to 18.x release.

---
Full diff: https://github.com/llvm/llvm-project/pull/132241.diff


1 Files Affected:

- (modified) libcxxabi/src/cxa_exception_storage.cpp (+1-1) 


``diff
diff --git a/libcxxabi/src/cxa_exception_storage.cpp b/libcxxabi/src/cxa_exception_storage.cpp
index 3a3233a1b9272..83408c904e1f7 100644
--- a/libcxxabi/src/cxa_exception_storage.cpp
+++ b/libcxxabi/src/cxa_exception_storage.cpp
@@ -24,7 +24,7 @@ extern "C" {
 } // extern "C"
 } // namespace __cxxabiv1
 
-#elif defined(HAS_THREAD_LOCAL)
+#elif __has_feature(cxx_thread_local)
 
 namespace __cxxabiv1 {
 namespace {

``
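
For readers unfamiliar with the pattern, a feature-test guard of this kind can be sketched as follows. This is an illustration only, not the libc++abi source: the `counter` variable and `bump` function are hypothetical, and the fallback branch simply uses a plain static, which is not thread-safe.

```cpp
#include <cassert>

// __has_feature is a Clang extension. Defining a fallback lets compilers
// that lack it evaluate the check as 0 instead of failing to preprocess.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(cxx_thread_local)
// The compiler reports thread_local support: one counter per thread.
static thread_local int counter = 0;
#else
// Illustration-only fallback: a single shared counter (not thread-safe).
static int counter = 0;
#endif

// Increments the counter selected above and returns the new value.
int bump() { return ++counter; }
```

Asking the compiler directly avoids depending on a build-system macro such as `HAS_THREAD_LOCAL` being defined consistently.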




https://github.com/llvm/llvm-project/pull/132241


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

atrosinenko wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/131899).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#131899** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/131899)
* **#131898**
* **#131897**
* **#131896**
* **#131895**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev).
Learn more about stacking: https://stacking.dev/


https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [ctxprof][nfc] Make `computeImportForFunction` a member of `ModuleImportsManager` (PR #134011)

2025-04-05 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin ready_for_review 
https://github.com/llvm/llvm-project/pull/134011


[llvm-branch-commits] [llvm] llvm-reduce: Do not reduce alloca array sizes to 0 (PR #132864)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/132864

From 8b7fcfc65d1615368805f5c3c5a459cc7e8c026a Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 25 Mar 2025 09:39:18 +0700
Subject: [PATCH] llvm-reduce: Do not reduce alloca array sizes to 0

Fixes #64340
---
 .../llvm-reduce/reduce-operands-alloca.ll | 69 +++
 .../llvm-reduce/deltas/ReduceOperands.cpp |  5 ++
 2 files changed, 74 insertions(+)
 create mode 100644 llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll

diff --git a/llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll b/llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll
new file mode 100644
index 0..61c46185b3378
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/reduce-operands-alloca.ll
@@ -0,0 +1,69 @@
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=operands-zero --test FileCheck --test-arg --check-prefix=CHECK --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck %s --check-prefixes=CHECK,ZERO < %t
+
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=operands-one --test FileCheck --test-arg --check-prefix=CHECK --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck %s --check-prefixes=CHECK,ONE < %t
+
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=operands-poison --test FileCheck --test-arg --check-prefix=CHECK --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck %s --check-prefixes=CHECK,POISON < %t
+
+
+; CHECK-LABEL: @dyn_alloca(
+; ZERO: %alloca = alloca i32, i32 %size, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 %size, align 4
+define void @dyn_alloca(i32 %size) {
+ %alloca = alloca i32, i32 %size
+ store i32 0, ptr %alloca
+ ret void
+}
+
+; CHECK-LABEL: @alloca_0_elt(
+; ZERO: %alloca = alloca i32, i32 0, align 4
+; ONE: %alloca = alloca i32, i32 0, align 4
+; POISON:  %alloca = alloca i32, i32 0, align 4
+define void @alloca_0_elt() {
+ %alloca = alloca i32, i32 0
+ store i32 0, ptr %alloca
+ ret void
+}
+
+; CHECK-LABEL: @alloca_1_elt(
+; ZERO: %alloca = alloca i32, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, align 4
+define void @alloca_1_elt() {
+ %alloca = alloca i32, i32 1
+ store i32 0, ptr %alloca
+ ret void
+}
+
+; CHECK-LABEL: @alloca_1024_elt(
+; ZERO: %alloca = alloca i32, i32 1024, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 1024, align 4
+define void @alloca_1024_elt() {
+ %alloca = alloca i32, i32 1024
+ store i32 0, ptr %alloca
+ ret void
+}
+
+; CHECK-LABEL: @alloca_poison_elt(
+; ZERO: %alloca = alloca i32, i32 poison, align 4
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 poison, align 4
+define void @alloca_poison_elt() {
+ %alloca = alloca i32, i32 poison
+ store i32 0, ptr %alloca
+ ret void
+}
+
+; CHECK-LABEL: @alloca_constexpr_elt(
+; ZERO: %alloca = alloca i32, i32 ptrtoint (ptr @alloca_constexpr_elt to i32)
+; ONE: %alloca = alloca i32, align 4
+; POISON: %alloca = alloca i32, i32 ptrtoint (ptr @alloca_constexpr_elt to i32)
+define void @alloca_constexpr_elt() {
+ %alloca = alloca i32, i32 ptrtoint (ptr @alloca_constexpr_elt to i32)
+ store i32 0, ptr %alloca
+ ret void
+}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
index a4fdd9ce8033b..b0bca015434fa 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
@@ -125,6 +125,11 @@ void llvm::reduceOperandsZeroDeltaPass(Oracle &O, ReducerWorkItem &WorkItem) {
   auto ReduceValue = [](Use &Op) -> Value * {
     if (!shouldReduceOperand(Op))
       return nullptr;
+
+    // Avoid introducing 0-sized allocations.
+    if (isa<AllocaInst>(Op.getUser()))
+      return nullptr;
+
     // Don't duplicate an existing switch case.
     if (auto *IntTy = dyn_cast<IntegerType>(Op->getType()))
       if (switchCaseExists(Op, ConstantInt::get(IntTy, 0)))



[llvm-branch-commits] [clang] release/20.x: [clang][docs] Move -Wnon-trivial-memcall to added flags. (PR #132367)

2025-04-05 Thread via llvm-branch-commits

https://github.com/R-Goc edited https://github.com/llvm/llvm-project/pull/132367


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-05 Thread Orlando Cazalet-Hyams via llvm-branch-commits

https://github.com/OCHyams ready_for_review 
https://github.com/llvm/llvm-project/pull/133479


[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Add support to lower "FREEZE a half(f16)" instruction on Hexagon and fix the isel-buildvector-v2f16.ll assertion (#130977) (PR #132138)

2025-04-05 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/132138

Backport 9c65e6ac115a

Requested by: @androm3da

From e09e2480046740628cc28055010aef1ab05e5aff Mon Sep 17 00:00:00 2001
From: Abinaya Saravanan 
Date: Thu, 13 Mar 2025 03:28:26 +0530
Subject: [PATCH] [HEXAGON] Add support to lower "FREEZE a half(f16)"
 instruction on Hexagon and fix the isel-buildvector-v2f16.ll assertion
 (#130977)

(cherry picked from commit 9c65e6ac115a7d8566c874537791125c3ace7c1a)
---
 llvm/lib/Target/Hexagon/HexagonISelLowering.h |  1 +
 .../Target/Hexagon/HexagonISelLoweringHVX.cpp | 22 +-
 llvm/test/CodeGen/Hexagon/fp16-promote.ll | 44 +++
 3 files changed, 56 insertions(+), 11 deletions(-)
 create mode 100644 llvm/test/CodeGen/Hexagon/fp16-promote.ll

diff --git a/llvm/lib/Target/Hexagon/HexagonISelLowering.h b/llvm/lib/Target/Hexagon/HexagonISelLowering.h
index aaa9c65c1e07e..4df88b3a8abd7 100644
--- a/llvm/lib/Target/Hexagon/HexagonISelLowering.h
+++ b/llvm/lib/Target/Hexagon/HexagonISelLowering.h
@@ -362,6 +362,7 @@ class HexagonTargetLowering : public TargetLowering {
   shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const override {
 return AtomicExpansionKind::LLSC;
   }
+  bool softPromoteHalfType() const override { return true; }
 
 private:
   void initializeHVXLowering();
diff --git a/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp b/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
index 1a19e81a68f08..a7eb20a3e5ff9 100644
--- a/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
+++ b/llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
@@ -1618,17 +1618,6 @@ HexagonTargetLowering::LowerHvxBuildVector(SDValue Op, SelectionDAG &DAG)
   for (unsigned i = 0; i != Size; ++i)
 Ops.push_back(Op.getOperand(i));
 
-  // First, split the BUILD_VECTOR for vector pairs. We could generate
-  // some pairs directly (via splat), but splats should be generated
-  // by the combiner prior to getting here.
-  if (VecTy.getSizeInBits() == 16*Subtarget.getVectorLength()) {
-ArrayRef A(Ops);
-MVT SingleTy = typeSplit(VecTy).first;
-SDValue V0 = buildHvxVectorReg(A.take_front(Size/2), dl, SingleTy, DAG);
-SDValue V1 = buildHvxVectorReg(A.drop_front(Size/2), dl, SingleTy, DAG);
-return DAG.getNode(ISD::CONCAT_VECTORS, dl, VecTy, V0, V1);
-  }
-
   if (VecTy.getVectorElementType() == MVT::i1)
 return buildHvxVectorPred(Ops, dl, VecTy, DAG);
 
@@ -1645,6 +1634,17 @@ HexagonTargetLowering::LowerHvxBuildVector(SDValue Op, SelectionDAG &DAG)
 return DAG.getBitcast(tyVector(VecTy, MVT::f16), T0);
   }
 
+  // First, split the BUILD_VECTOR for vector pairs. We could generate
+  // some pairs directly (via splat), but splats should be generated
+  // by the combiner prior to getting here.
+  if (VecTy.getSizeInBits() == 16 * Subtarget.getVectorLength()) {
+ArrayRef A(Ops);
+MVT SingleTy = typeSplit(VecTy).first;
+SDValue V0 = buildHvxVectorReg(A.take_front(Size / 2), dl, SingleTy, DAG);
+SDValue V1 = buildHvxVectorReg(A.drop_front(Size / 2), dl, SingleTy, DAG);
+return DAG.getNode(ISD::CONCAT_VECTORS, dl, VecTy, V0, V1);
+  }
+
   return buildHvxVectorReg(Ops, dl, VecTy, DAG);
 }
 
diff --git a/llvm/test/CodeGen/Hexagon/fp16-promote.ll b/llvm/test/CodeGen/Hexagon/fp16-promote.ll
new file mode 100644
index 0..1ef0a133ce30a
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/fp16-promote.ll
@@ -0,0 +1,44 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -march=hexagon  < %s | FileCheck %s
+
+define half @freeze_half_undef() nounwind {
+; CHECK-LABEL: freeze_half_undef:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:{
+; CHECK-NEXT: call __truncsfhf2
+; CHECK-NEXT: r0 = #0
+; CHECK-NEXT: allocframe(#0)
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: call __extendhfsf2
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: call __truncsfhf2
+; CHECK-NEXT: r0 = sfadd(r0,r0)
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: r31:30 = dealloc_return(r30):raw
+; CHECK-NEXT:}
+  %y1 = freeze half undef
+  %t1 = fadd half %y1, %y1
+  ret half %t1
+}
+
+define half @freeze_half_poison(half %maybe.poison) {
+; CHECK-LABEL: freeze_half_poison:
+; CHECK:  // %bb.0:
+; CHECK:{
+; CHECK-NEXT: call __extendhfsf2
+; CHECK-NEXT: allocframe(r29,#0):raw
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: call __truncsfhf2
+; CHECK-NEXT: r0 = sfadd(r0,r0)
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: r31:30 = dealloc_return(r30):raw
+; CHECK-NEXT:}
+  %y1 = freeze half %maybe.poison
+  %t1 = fadd half %y1, %y1
+  ret half %t1
+}



[llvm-branch-commits] [llvm] [AMDGPU] Prevent SI_CS_CHAIN instruction from giving registers classes in generic instructions (PR #131329)

2025-04-05 Thread Diana Picus via llvm-branch-commits

rovka wrote:

Reopening this (not sure if I can change the target branch)

https://github.com/llvm/llvm-project/pull/131329


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/131899

From efd98b412431b0c597d3d7dcee0dd4255b8e2418 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 18 Mar 2025 21:32:11 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect non-protected indirect
 calls

---
 bolt/include/bolt/Core/MCPlusBuilder.h|  10 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  33 +-
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  42 ++
 .../binary-analysis/AArch64/gs-pauth-calls.s  | 676 ++
 4 files changed, 757 insertions(+), 4 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-calls.s

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 76ea2489e7038..b3d54ccd5955d 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -577,6 +577,16 @@ class MCPlusBuilder {
 return getNoRegister();
   }
 
+  /// Returns the register used as call destination, or no-register, if not
+  /// an indirect call. Sets IsAuthenticatedInternally if the instruction
+  /// accepts signed pointer as its operand and authenticates it internally.
+  virtual MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+   bool &IsAuthenticatedInternally) const {
+llvm_unreachable("not implemented");
+return getNoRegister();
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index c81a586b02771..b8a0a80215ce2 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -382,11 +382,11 @@ class PacRetAnalysis
 
 public:
   std::vector
-  getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF,
- const ArrayRef UsedDirtyRegs) const {
+  getLastClobberingInsts(const MCInst &Inst, BinaryFunction &BF,
+ const ArrayRef UsedDirtyRegs) {
 if (RegsToTrackInstsFor.empty())
   return {};
-auto MaybeState = getStateAt(Ret);
+auto MaybeState = getStateBefore(Inst);
 if (!MaybeState)
   llvm_unreachable("Expected State to be present");
 const State &S = *MaybeState;
@@ -434,6 +434,29 @@ static std::shared_ptr tryCheckReturn(const 
BinaryContext &BC,
   return std::make_shared(RetKind, Inst, RetReg);
 }
 
+static std::shared_ptr tryCheckCall(const BinaryContext &BC,
+const MCInstReference &Inst,
+const State &S) {
+  static const GadgetKind CallKind("non-protected call found");
+  if (!BC.MIB->isCall(Inst) && !BC.MIB->isBranch(Inst))
+return nullptr;
+
+  bool IsAuthenticated = false;
+  MCPhysReg DestReg = BC.MIB->getRegUsedAsCallDest(Inst, IsAuthenticated);
+  if (IsAuthenticated || DestReg == BC.MIB->getNoRegister())
+return nullptr;
+
+  LLVM_DEBUG({
+traceInst(BC, "Found call inst", Inst);
+traceReg(BC, "Call destination reg", DestReg);
+traceRegMask(BC, "SafeToDerefRegs", S.SafeToDerefRegs);
+  });
+  if (S.SafeToDerefRegs[DestReg])
+return nullptr;
+
+  return std::make_shared(CallKind, Inst, DestReg);
+}
+
 FunctionAnalysisResult
 Analysis::computeDfState(BinaryFunction &BF,
  MCPlusBuilder::AllocatorIdTy AllocatorId) {
@@ -450,10 +473,12 @@ Analysis::computeDfState(BinaryFunction &BF,
   for (BinaryBasicBlock &BB : BF) {
 for (int64_t I = 0, E = BB.size(); I < E; ++I) {
   MCInstReference Inst(&BB, I);
-  const State &S = *PRA.getStateAt(Inst);
+  const State &S = *PRA.getStateBefore(Inst);
 
   if (auto Report = tryCheckReturn(BC, Inst, S))
 Result.Diagnostics.push_back(Report);
+  if (auto Report = tryCheckCall(BC, Inst, S))
+Result.Diagnostics.push_back(Report);
 }
   }
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index d238a1df5c7d7..9ce1514639f95 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -277,6 +277,48 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+   bool &IsAuthenticatedInternally) const override {
+assert(isCall(Inst) || isBranch(Inst));
+IsAuthenticatedInternally = false;
+
+switch (Inst.getOpcode()) {
+case AArch64::B:
+case AArch64::BL:
+  assert(Inst.getOperand(0).isExpr());
+  return getNoRegister();
+case AArch64::Bcc:
+case AArch64::CBNZW:
+case AArch64::CBNZX:
+case AArch64::CBZW:
+case AArch64::CBZX:
+  assert(Inst.getOperand(1).isExpr());
+  return getNoRegister();
+case AArch64::TBNZW:
+case AArch64::TBNZX:
+case AArch64::TBZW:
+case AArch64::TBZX:
+  assert(Ins

[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Move fix-tle-le-sym-type test to test/MC. NFC (#133839) (PR #134014)

2025-04-05 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/134014

Backport 46968310cb837e4b32859edef2107080b828b117

Requested by: @zhaoqi5

From f198a45b6947a65774a03069d73e3f236ef94ae0 Mon Sep 17 00:00:00 2001
From: ZhaoQi 
Date: Wed, 2 Apr 2025 09:11:20 +0800
Subject: [PATCH] [LoongArch] Move fix-tle-le-sym-type test to test/MC. NFC
 (#133839)

(cherry picked from commit 46968310cb837e4b32859edef2107080b828b117)
---
 .../CodeGen/LoongArch/fix-tle-le-sym-type.ll  | 24 -
 .../Relocations/relocation-specifier.s| 26 +++
 2 files changed, 26 insertions(+), 24 deletions(-)
 delete mode 100644 llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll
 create mode 100644 llvm/test/MC/LoongArch/Relocations/relocation-specifier.s

diff --git a/llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll b/llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll
deleted file mode 100644
index d39454a51a445..0
--- a/llvm/test/CodeGen/LoongArch/fix-tle-le-sym-type.ll
+++ /dev/null
@@ -1,24 +0,0 @@
-; RUN: llc --mtriple=loongarch32 --filetype=obj %s -o %t-la32
-; RUN: llvm-readelf -s %t-la32 | FileCheck %s --check-prefix=LA32
-
-; RUN: llc --mtriple=loongarch64 --filetype=obj %s -o %t-la64
-; RUN: llvm-readelf -s %t-la64 | FileCheck %s --check-prefix=LA64
-
-; LA32:  Symbol table '.symtab' contains [[#]] entries:
-; LA32-NEXT:Num:Value  Size Type  Bind   Vis  Ndx Name
-; LA32:   0 TLS   GLOBAL DEFAULT  UND tls_sym
-
-; LA64:  Symbol table '.symtab' contains [[#]] entries:
-; LA64-NEXT:Num:Value  Size Type  Bind   Vis  Ndx Name
-; LA64:   0 TLS   GLOBAL DEFAULT  UND tls_sym
-
-@tls_sym = external thread_local(localexec) global i32
-
-define dso_local signext i32 @test_tlsle() nounwind {
-entry:
-  %0 = call ptr @llvm.threadlocal.address.p0(ptr @tls_sym)
-  %1 = load i32, ptr %0
-  ret i32 %1
-}
-
-declare nonnull ptr @llvm.threadlocal.address.p0(ptr nonnull)
diff --git a/llvm/test/MC/LoongArch/Relocations/relocation-specifier.s b/llvm/test/MC/LoongArch/Relocations/relocation-specifier.s
new file mode 100644
index 0..d0898aaab92fe
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/relocation-specifier.s
@@ -0,0 +1,26 @@
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 %s -o %t-la32
+# RUN: llvm-readelf -rs %t-la32 | FileCheck %s --check-prefixes=CHECK,RELOC32
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 %s -o %t-la64
+# RUN: llvm-readelf -rs %t-la64 | FileCheck %s --check-prefixes=CHECK,RELOC64
+
+## This test is similar to test/MC/CSKY/relocation-specifier.s.
+
+# RELOC32: '.rela.data'
+# RELOC32: R_LARCH_32  .data + 0
+
+# RELOC64: '.rela.data'
+# RELOC64: R_LARCH_32  .data + 0
+
+# CHECK: TLS GLOBAL DEFAULT UND gd
+# CHECK: TLS GLOBAL DEFAULT UND ld
+# CHECK: TLS GLOBAL DEFAULT UND ie
+# CHECK: TLS GLOBAL DEFAULT UND le
+
+pcalau12i $t1, %gd_pc_hi20(gd)
+pcalau12i $t1, %ld_pc_hi20(ld)
+pcalau12i $t1, %ie_pc_hi20(ie)
+lu12i.w $t1, %le_hi20_r(le)
+
+.data
+local:
+.long local

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: reformulate the state for data-flow analysis (PR #131898)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/131898

From 1b82382369a66bee4345489ce8fc70abf04215a7 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 17 Mar 2025 22:27:53 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: reformulate the state for
 data-flow analysis

In preparation for implementing support for detection of non-protected
call instructions, refine the definition of state which is computed for
each register by data-flow analysis.

Explicitly marking the registers which are known to be trusted at
function entry is crucial for finding non-protected calls. In addition,
it fixes less-common false negatives for pac-ret, such as `ret x1` in
`f_nonx30_ret_non_auted` test case.
---
 bolt/include/bolt/Core/MCPlusBuilder.h|  10 ++
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |   7 +-
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 129 +++---
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |   4 +
 .../AArch64/gs-pacret-autiasp.s   |  19 ++-
 .../AArch64/gs-pacret-multi-bb.s  |   3 +-
 6 files changed, 104 insertions(+), 68 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index b285138b77fe7..76ea2489e7038 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -551,6 +551,16 @@ class MCPlusBuilder {
 return Analysis->isReturn(Inst);
   }
 
+  /// Returns the registers that are trusted at function entry.
+  ///
+  /// Each register should be treated as if a successfully authenticated
+  /// pointer was written to it before entering the function (i.e. the
+  /// pointer is safe to jump to as well as to be signed).
+  virtual SmallVector getTrustedLiveInRegs() const {
+llvm_unreachable("not implemented");
+return {};
+  }
+
   virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
 return getNoRegister();
diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index f102f1080e2e8..404dde2901767 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -209,13 +209,12 @@ struct Report {
 
 struct GadgetReport : public Report {
   const GadgetKind &Kind;
-  SmallVector AffectedRegisters;
+  SmallVector AffectedRegisters;
   std::vector OverwritingInstrs;
 
   GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-   const BitVector &AffectedRegisters)
-  : Report(Location), Kind(Kind),
-AffectedRegisters(AffectedRegisters.set_bits()) {}
+   MCPhysReg AffectedRegister)
+  : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {}
 
   void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 4f7be17327b49..c81a586b02771 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -126,18 +126,16 @@ class TrackedRegisters {
 
 // The security property that is checked is:
 // When a register is used as the address to jump to in a return instruction,
-// that register must either:
-// (a) never be changed within this function, i.e. have the same value as when
-// the function started, or
+// that register must be safe-to-dereference. It must either
+// (a) be safe-to-dereference at function entry and never be changed within this
+// function, i.e. have the same value as when the function started, or
 // (b) the last write to the register must be by an authentication instruction.
 
 // This property is checked by using dataflow analysis to keep track of which
-// registers have been written (def-ed), since last authenticated. Those are
-// exactly the registers containing values that should not be trusted (as they
-// could have changed since the last time they were authenticated). For pac-ret,
-// any return instruction using such a register is a gadget to be reported. For
-// PAuthABI, probably at least any indirect control flow using such a register
-// should be reported.
+// registers have been written (def-ed), since last authenticated. For pac-ret,
+// any return instruction using a register which is not safe-to-dereference is
+// a gadget to be reported. For PAuthABI, probably at least any indirect control
+// flow using such a register should be reported.
 
 // Furthermore, when producing a diagnostic for a found non-pac-ret protected
 // return, the analysis also lists the last instructions that wrote to the
@@ -156,10 +154,29 @@ class TrackedRegisters {
 //in the gadgets to be reported. This information is used in the second run
 //to also track which instructions last wrote to those registers.
 
+/// A state representing which registers are safe to use by an instruction
+/// at a given program p

[llvm-branch-commits] [clang] [HLSL][NFC] Use method builder to create default resource constructor (PR #131384)

2025-04-05 Thread Helena Kotas via llvm-branch-commits

https://github.com/hekota edited 
https://github.com/llvm/llvm-project/pull/131384


[llvm-branch-commits] [clang] [clang] resugar decltype of DeclRefExpr (PR #132447)

2025-04-05 Thread Matheus Izvekov via llvm-branch-commits

https://github.com/mizvekov updated 
https://github.com/llvm/llvm-project/pull/132447

From 0a363317b9ac02a6a6e2f70e805223bdb135fed3 Mon Sep 17 00:00:00 2001
From: Matheus Izvekov 
Date: Fri, 14 Mar 2025 19:41:38 -0300
Subject: [PATCH] [clang] resugar decltype of DeclRefExpr

This keeps around the resugared DeclType for DeclRefExpr,
which is otherwise partially lost as the expression type
removes top level references.

This helps 'decltype' resugaring work without any loss
of information.
---
 clang/include/clang/AST/Expr.h| 32 +++--
 clang/include/clang/AST/Stmt.h|  2 +
 clang/include/clang/Sema/Sema.h   | 20 +++---
 clang/lib/AST/ASTImporter.cpp |  3 +-
 clang/lib/AST/Expr.cpp| 82 +++
 clang/lib/CodeGen/CGExpr.cpp  |  4 +-
 clang/lib/Sema/SemaChecking.cpp   |  3 +-
 clang/lib/Sema/SemaDeclCXX.cpp| 19 +++---
 clang/lib/Sema/SemaExpr.cpp   | 81 +++---
 clang/lib/Sema/SemaOpenMP.cpp | 11 ++-
 clang/lib/Sema/SemaOverload.cpp   | 25 ---
 clang/lib/Sema/SemaSYCL.cpp   |  2 +-
 clang/lib/Sema/SemaTemplate.cpp   | 13 
 clang/lib/Sema/SemaType.cpp   |  9 ++-
 clang/lib/Sema/TreeTransform.h|  5 +-
 clang/lib/Serialization/ASTReaderStmt.cpp |  8 ++-
 clang/lib/Serialization/ASTWriterStmt.cpp |  6 +-
 clang/test/Sema/Resugar/resugar-expr.cpp  |  6 +-
 18 files changed, 201 insertions(+), 130 deletions(-)

diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index 2ba787ac6df55..e92f6696027f9 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -1266,7 +1266,7 @@ class DeclRefExpr final
 : public Expr,
   private llvm::TrailingObjects {
+TemplateArgumentLoc, QualType> {
   friend class ASTStmtReader;
   friend class ASTStmtWriter;
   friend TrailingObjects;
@@ -1292,17 +1292,27 @@ class DeclRefExpr final
 return hasTemplateKWAndArgsInfo();
   }
 
+  size_t numTrailingObjects(OverloadToken) const {
+return getNumTemplateArgs();
+  }
+
+  size_t numTrailingObjects(OverloadToken) const {
+return HasResugaredDeclType();
+  }
+
   /// Test whether there is a distinct FoundDecl attached to the end of
   /// this DRE.
   bool hasFoundDecl() const { return DeclRefExprBits.HasFoundDecl; }
 
+  static bool needsDeclTypeStorage(ValueDecl *VD, QualType DeclType);
+
   DeclRefExpr(const ASTContext &Ctx, NestedNameSpecifierLoc QualifierLoc,
   SourceLocation TemplateKWLoc, ValueDecl *D,
   bool RefersToEnclosingVariableOrCapture,
   const DeclarationNameInfo &NameInfo, NamedDecl *FoundD,
   const TemplateArgumentListInfo *TemplateArgs,
   const TemplateArgumentList *ConvertedArgs, QualType T,
-  ExprValueKind VK, NonOdrUseReason NOUR);
+  ExprValueKind VK, QualType DeclType, NonOdrUseReason NOUR);
 
   /// Construct an empty declaration reference expression.
   explicit DeclRefExpr(EmptyShell Empty) : Expr(DeclRefExprClass, Empty) {}
@@ -1318,7 +1328,8 @@ class DeclRefExpr final
   Create(const ASTContext &Context, NestedNameSpecifierLoc QualifierLoc,
  SourceLocation TemplateKWLoc, ValueDecl *D,
  bool RefersToEnclosingVariableOrCapture, SourceLocation NameLoc,
- QualType T, ExprValueKind VK, NamedDecl *FoundD = nullptr,
+ QualType T, ExprValueKind VK, QualType DeclType = QualType(),
+ NamedDecl *FoundD = nullptr,
  const TemplateArgumentListInfo *TemplateArgs = nullptr,
  const TemplateArgumentList *ConvertedArgs = nullptr,
  NonOdrUseReason NOUR = NOUR_None);
@@ -1328,7 +1339,7 @@ class DeclRefExpr final
  SourceLocation TemplateKWLoc, ValueDecl *D,
  bool RefersToEnclosingVariableOrCapture,
  const DeclarationNameInfo &NameInfo, QualType T, ExprValueKind VK,
- NamedDecl *FoundD = nullptr,
+ QualType DeclType = QualType(), NamedDecl *FoundD = nullptr,
  const TemplateArgumentListInfo *TemplateArgs = nullptr,
  const TemplateArgumentList *ConvertedArgs = nullptr,
  NonOdrUseReason NOUR = NOUR_None);
@@ -1337,11 +1348,22 @@ class DeclRefExpr final
   static DeclRefExpr *CreateEmpty(const ASTContext &Context, bool HasQualifier,
   bool HasFoundDecl,
   bool HasTemplateKWAndArgsInfo,
-  unsigned NumTemplateArgs);
+  unsigned NumTemplateArgs,
+  bool HasResugaredDeclType);
 
   ValueDecl *getDecl() { return D; }
   const ValueDecl *getDecl() const { return D; }
   void setDecl(ValueDecl *NewD);
+  void recomputeDependency();
+
+  bool HasResugaredDeclType() const {
+return DeclRefExprBits.HasResugaredDeclType;
+  }
+  QualType getDeclTyp

[llvm-branch-commits] [clang] [Driver] Change linker job in Baremetal toolchain object accomodate GCCInstallation.(2/3) (PR #121830)

2025-04-05 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:



You can test this locally with the following command:


```bash
git-clang-format --diff e07a4cd4e0ff77f74b66695923bc998904c14746 \
  f4af05b47bddc3a88309341d5ff79cc9178f78ec --extensions cpp,c,h -- \
  clang/lib/Driver/ToolChains/BareMetal.cpp \
  clang/lib/Driver/ToolChains/BareMetal.h \
  clang/test/Driver/aarch64-toolchain-extra.c \
  clang/test/Driver/aarch64-toolchain.c clang/test/Driver/arm-toolchain-extra.c \
  clang/test/Driver/arm-toolchain.c clang/test/Driver/sanitizer-ld.c
```





View the diff from clang-format here.


```diff
diff --git a/clang/lib/Driver/ToolChains/BareMetal.h b/clang/lib/Driver/ToolChains/BareMetal.h
index b4e556df11..87f173342d 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.h
+++ b/clang/lib/Driver/ToolChains/BareMetal.h
@@ -36,7 +36,7 @@ protected:
   Tool *buildStaticLibTool() const override;
 
 public:
-  bool hasValidGCCInstallation() const {return GCCInstallation.isValid(); }
+  bool hasValidGCCInstallation() const { return GCCInstallation.isValid(); }
   bool isBareMetal() const override { return true; }
   bool isCrossCompiling() const override { return true; }
   bool HasNativeLLVMSupport() const override { return true; }

```




https://github.com/llvm/llvm-project/pull/121830


[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and trunc (PR #132383)

2025-04-05 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic created 
https://github.com/llvm/llvm-project/pull/132383

Uniform S1:
Truncs to uniform S1 and AnyExts from S1 are left as is, since they are meant
to be combined away. Uniform S1 ZExt and SExt are lowered using select.
Divergent S1:
Trunc of VGPR to VCC is lowered as a compare.
Extends of VCC are lowered using select.

For the remaining types:
S32 to S64 ZExt and SExt are lowered using merge values; AnyExt and Trunc
are again left as is to be combined away.
Notably, uniform S16 SExt and ZExt are not lowered to S32 but left as is
for instruction selection to deal with, because there are patterns
that check for the S16 type.
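
As a rough illustration of the "merge values" lowering for S32 to S64 extends,
the sketch below models it on plain integers rather than on MIR. The names
`Halves`, `zext32To64`, `sext32To64`, and `mergeValues` are hypothetical and do
not come from the patch; the only thing taken from the description and the diff
is the idea that the high half is either a zero constant (ZExt) or the
replicated sign bit of the source (SExt), and that the two halves are then
merged into one 64-bit value.

```cpp
#include <cstdint>

// Hypothetical stand-in for a pair of 32-bit registers (low and high half).
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// ZExt: the high half is a constant zero.
Halves zext32To64(uint32_t Src) { return {Src, 0u}; }

// SExt: the high half replicates the sign bit of the source, which is what an
// arithmetic shift right by 31 produces. Written as a select here to avoid
// relying on the signed-shift behavior of older C++ standards.
Halves sext32To64(uint32_t Src) {
  uint32_t Hi = (Src & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return {Src, Hi};
}

// Models merging two 32-bit halves into one 64-bit value.
uint64_t mergeValues(Halves H) {
  return (static_cast<uint64_t>(H.Hi) << 32) | H.Lo;
}
```

For example, merging the sign-extended halves of `0x80000000` yields
`0xFFFFFFFF80000000`, while the zero-extended version yields
`0x0000000080000000`.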

From 2ac46e4545ecbc07d16a827a326c092a70ddc50d Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Fri, 21 Mar 2025 12:41:39 +0100
Subject: [PATCH] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and
 trunc

Uniform S1:
Truncs to uniform S1 and AnyExts from S1 are left as is as they are meant
to be combined away. Uniform S1 ZExt and SExt are lowered using select.
Divergent S1:
Trunc of VGPR to VCC is lowered as compare.
Extends of VCC are lowered using select.

For remaining types:
S32 to S64 ZExt and SExt are lowered using merge values, AnyExt and Trunc
are again left as is to be combined away.
Notably uniform S16 for SExt and Zext is not lowered to S32 and left as is
for instruction select to deal with them. This is because there are patterns
that check for S16 type.
---
 .../Target/AMDGPU/AMDGPURegBankLegalize.cpp   |   3 +-
 .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp|  51 +++--
 .../AMDGPU/AMDGPURegBankLegalizeRules.cpp |  47 +++-
 .../AMDGPU/AMDGPURegBankLegalizeRules.h   |   3 +
 .../GlobalISel/regbankselect-anyext.mir   |  61 ++-
 .../AMDGPU/GlobalISel/regbankselect-sext.mir  | 100 --
 .../AMDGPU/GlobalISel/regbankselect-trunc.mir |  22 ++--
 .../AMDGPU/GlobalISel/regbankselect-zext.mir  |  89 ++--
 8 files changed, 262 insertions(+), 114 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
index d5a83903e2b13..44f1b5419abb9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
@@ -216,7 +216,8 @@ class AMDGPURegBankLegalizeCombiner {
   return;
 }
 
-if (DstTy == S32 && TruncSrcTy == S16) {
+if ((DstTy == S64 && TruncSrcTy == S32) ||
+(DstTy == S32 && TruncSrcTy == S16)) {
   B.buildAnyExt(Dst, TruncSrc);
   cleanUpAfterCombine(MI, Trunc);
   return;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
index 5dbaa9488d668..7301cba9e8ed3 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
@@ -173,13 +173,23 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI,
   case Ext32To64: {
 const RegisterBank *RB = MRI.getRegBank(MI.getOperand(0).getReg());
 MachineInstrBuilder Hi;
-
-if (MI.getOpcode() == AMDGPU::G_ZEXT) {
+switch (MI.getOpcode()) {
+case AMDGPU::G_ZEXT: {
   Hi = B.buildConstant({RB, S32}, 0);
-} else {
+  break;
+}
+case AMDGPU::G_SEXT: {
   // Replicate sign bit from 32-bit extended part.
   auto ShiftAmt = B.buildConstant({RB, S32}, 31);
   Hi = B.buildAShr({RB, S32}, MI.getOperand(1).getReg(), ShiftAmt);
+  break;
+}
+case AMDGPU::G_ANYEXT: {
+  Hi = B.buildUndef({RB, S32});
+  break;
+}
+default:
+  llvm_unreachable("Unsupported Opcode in Ext32To64");
 }
 
 B.buildMergeLikeInstr(MI.getOperand(0).getReg(),
@@ -202,7 +212,7 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI,
 // compares all bits in register.
 Register BoolSrc = MRI.createVirtualRegister({VgprRB, Ty});
 if (Ty == S64) {
-  auto Src64 = B.buildUnmerge({VgprRB, Ty}, Src);
+  auto Src64 = B.buildUnmerge(VgprRB_S32, Src);
   auto One = B.buildConstant(VgprRB_S32, 1);
   auto AndLo = B.buildAnd(VgprRB_S32, Src64.getReg(0), One);
   auto Zero = B.buildConstant(VgprRB_S32, 0);
@@ -396,8 +406,11 @@ LLT RegBankLegalizeHelper::getTyFromID(RegBankLLTMappingApplyID ID) {
   case Sgpr32AExt:
   case Sgpr32AExtBoolInReg:
   case Sgpr32SExt:
+  case Sgpr32ZExt:
   case UniInVgprS32:
   case Vgpr32:
+  case Vgpr32SExt:
+  case Vgpr32ZExt:
 return LLT::scalar(32);
   case Sgpr64:
   case Vgpr64:
@@ -508,6 +521,7 @@ RegBankLegalizeHelper::getRegBankFromID(RegBankLLTMappingApplyID ID) {
   case Sgpr32AExt:
   case Sgpr32AExtBoolInReg:
   case Sgpr32SExt:
+  case Sgpr32ZExt:
 return SgprRB;
   case Vgpr16:
   case Vgpr32:
@@ -524,6 +538,8 @@ RegBankLegalizeHelper::getRegBankFromID(RegBankLLTMappingApplyID ID) {
   case VgprB128:
   case VgprB256:
   case VgprB512:
+  case Vgpr32SExt:
+  case Vgpr32ZExt:
 return VgprRB;
   default:
 return nullptr;
@@ -72

[llvm-branch-commits] [clang-tools-extra] [clang-doc][NFC] Remove unnecessary directory cleanup (PR #132101)

2025-04-05 Thread Paul Kirth via llvm-branch-commits

ilovepi wrote:

### Merge activity

* **Mar 20, 5:02 PM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/132101).


https://github.com/llvm/llvm-project/pull/132101


[llvm-branch-commits] [llvm] llvm-reduce: Change function return types if function is not called (PR #134035)

2025-04-05 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:



You can test this locally with the following command:


```bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- \
  llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
```





View the diff from clang-format here.


```diff
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp b/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
index b4df3e6dd..72cfa8305 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
@@ -96,7 +96,6 @@ static void rewriteFuncWithReturnType(Function &OldF, Value *NewRetValue) {
   // result of our pruning here.
   EliminateUnreachableBlocks(OldF);
 
-
   // Drop the incompatible attributes before we copy over to the new function.
   if (OldRetTy != NewRetTy) {
 AttributeList AL = OldF.getAttributes();

```




https://github.com/llvm/llvm-project/pull/134035


[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)

2025-04-05 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj created 
https://github.com/llvm/llvm-project/pull/133667

In the profitability check for vectorization, the dependency matrix was not 
handled correctly. This can lead to a wrong decision: the check may say "this 
loop can be vectorized" when in fact it cannot. The root cause is that the 
check returns early when it finds '=' or 'I' in the dependency matrix. To be 
sure that the loop can actually be vectorized, all rows of the matrix must be 
checked. This patch fixes the check so that it no longer makes a wrong decision 
for a loop that cannot be vectorized.
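
The difference between returning early and checking every row can be sketched
outside of LLVM as follows. This is a simplified model, not the pass itself:
`CharMatrix` is assumed here to be a plain vector of char vectors, and the two
functions mirror the shape of `canVectorize` and `isProfitableForVectorization`
from the patch below, with `std::nullopt` standing for "no decision".

```cpp
#include <optional>
#include <vector>

// Assumption for this sketch: one row per dependence, one column per loop.
using CharMatrix = std::vector<std::vector<char>>;

// A loop is treated as vectorizable only if every row has 'I' or '=' in the
// column of that loop, so all rows are inspected before answering true.
bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
  for (const auto &Row : DepMatrix) {
    char Dir = Row[LoopId];
    if (Dir != 'I' && Dir != '=')
      return false;
  }
  return true;
}

std::optional<bool> isProfitableForVectorization(unsigned InnerLoopId,
                                                 unsigned OuterLoopId,
                                                 const CharMatrix &DepMatrix) {
  // Moving a non-vectorizable outer loop inward would not help.
  if (!canVectorize(DepMatrix, OuterLoopId))
    return false;
  // The outer loop is vectorizable and the inner one is not: interchange.
  if (!canVectorize(DepMatrix, InnerLoopId))
    return true;
  // Both loops are vectorizable: no decision from this heuristic.
  return std::nullopt;
}
```

With the rows `{'=', '='}` and `{'=', '<'}` (outer loop in column 0, inner loop
in column 1), a check that stops at the first row would see '=' for the inner
loop and give up, whereas the version above notices the '<' in the second row
and reports the interchange as profitable.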

Related: #131130

From 2db59e8629d3640ec070eb906ac55a5e970176d1 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 09:52:16 +
Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop

In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can lead to a wrong decision: the check may
say "this loop can be vectorized" when in fact it cannot. The root cause
is that the check returns early when it finds '=' or 'I' in the
dependency matrix. To be sure that the loop can actually be vectorized,
all rows of the matrix must be checked. This patch fixes the check so
that it no longer makes a wrong decision for a loop that cannot be
vectorized.

Related: #131130
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 41 +++
 .../profitability-vectorization-heuristic.ll  |  9 ++--
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index e777f950a7c5a..b6b0b7d7a947a 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -1197,25 +1197,32 @@ LoopInterchangeProfitability::isProfitablePerInstrOrderCost() {
   return std::nullopt;
 }
 
+/// Return true if we can vectorize the loop specified by \p LoopId.
+static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
+  for (unsigned I = 0; I != DepMatrix.size(); I++) {
+char Dir = DepMatrix[I][LoopId];
+if (Dir != 'I' && Dir != '=')
+  return false;
+  }
+  return true;
+}
+
 std::optional LoopInterchangeProfitability::isProfitableForVectorization(
 unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) {
-  for (auto &Row : DepMatrix) {
-// If the inner loop is loop independent or doesn't carry any dependency
-// it is not profitable to move this to outer position, since we are
-// likely able to do inner loop vectorization already.
-if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=')
-  return std::optional(false);
-
-// If the outer loop is not loop independent it is not profitable to move
-// this to inner position, since doing so would not enable inner loop
-// parallelism.
-if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=')
-  return std::optional(false);
-  }
-  // If inner loop has dependence and outer loop is loop independent then it
-  // is/ profitable to interchange to enable inner loop parallelism.
-  // If there are no dependences, interchanging will not improve anything.
-  return std::optional(!DepMatrix.empty());
+  // If the outer loop is not loop independent it is not profitable to move
+  // this to inner position, since doing so would not enable inner loop
+  // parallelism.
+  if (!canVectorize(DepMatrix, OuterLoopId))
+return false;
+
+  // If inner loop has dependence and outer loop is loop independent then it is
+  // profitable to interchange to enable inner loop parallelism.
+  if (!canVectorize(DepMatrix, InnerLoopId))
+return true;
+
+  // TODO: Estimate the cost of vectorized loop body when both the outer and the
+  // inner loop can be vectorized.
+  return std::nullopt;
 }
 
 bool LoopInterchangeProfitability::isProfitable(
diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
index 606117e70db86..b82dd5141a6b2 100644
--- a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
+++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
@@ -15,16 +15,13 @@
 ;   }
 ; }
 ;
-; FIXME: These loops are not exchanged at this time due to the problem of
-; profitablity heuristic for vectorization.
 
-; CHECK:  --- !Missed
+; CHECK:  --- !Passed
 ; CHECK-NEXT: Pass:loop-interchange
-; CHECK-NEXT: Name:InterchangeNotProfitable
+; CHECK-NEXT: Name:Interchanged
 ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization
 ; CHECK-NEXT: Args:
-; CHECK-NEXT:   - String:  Interchanging loops is not considered to 
improve cache locality nor vectori

[llvm-branch-commits] [llvm] llvm-reduce: Fix losing call metadata in operands-to-args (PR #133422)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/133422

From 1c18bf5fe4ccec532eaef3677e40e976dd2d460c Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 28 Mar 2025 18:01:39 +0700
Subject: [PATCH] llvm-reduce: Fix using call metadata in operands-to-args

---
 .../tools/llvm-reduce/operands-to-args-preserve-fmf.ll | 7 +--
 llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp | 2 ++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
index b4b19ca28dbb5..fc31a08353b8f 100644
--- a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
+++ b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
@@ -12,9 +12,12 @@ define float @callee(float %a) {
 ; INTERESTING: load float
 
 ; REDUCED-LABEL: define float @caller(ptr %ptr, float %val, float %callee.ret1) {
-; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.000000e+00)
+; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.000000e+00), !fpmath !0
 define float @caller(ptr %ptr) {
   %val = load float, ptr %ptr
-  %callee.ret = call nnan nsz float @callee(float %val)
+  %callee.ret = call nnan nsz float @callee(float %val), !fpmath !0
   ret float %callee.ret
 }
+
+; REDUCED: !0 = !{float 2.000000e+00}
+!0 = !{float 2.0}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
index e7ad52eb65a5d..33f6463be6581 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
@@ -111,6 +111,8 @@ static void replaceFunctionCalls(Function *OldF, Function *NewF) {
 if (auto *FPOp = dyn_cast<FPMathOperator>(NewCI))
   NewCI->setFastMathFlags(CI->getFastMathFlags());
 
+NewCI->copyMetadata(*CI);
+
 // Do the replacement for this use.
 if (!CI->use_empty())
   CI->replaceAllUsesWith(NewCI);



[llvm-branch-commits] [flang] [flang][OpenMP] Map simple `do concurrent` loops to OpenMP host constructs (PR #127633)

2025-04-05 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/127633

>From f6a61dc9d383f19fa1cf38173829f2a732a4d544 Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Tue, 18 Feb 2025 02:50:46 -0600
Subject: [PATCH 1/3] [flang][OpenMP] Map simple `do concurrent` loops to
 OpenMP host constructs

Upstreams one more part of the ROCm `do concurrent` to OpenMP mapping
pass. This PR adds support for converting simple loops to the equivalent
OpenMP constructs on the host: `omp parallel do`. Towards that end, we
have to collect more information about loop nests, for which we add new
utils in the `looputils` namespace.
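
For readers more at home in C++ than Fortran, the host mapping described above can be pictured roughly as follows. This is only an analogy under stated assumptions: `squares` is a made-up name, and the pragma stands in for the `omp parallel` plus worksharing-loop structure the pass generates; it is not what the pass emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C++ analogue of
//   do concurrent(i=1:n)
//     a(i) = i * i
//   end do
// mapped to the host as `omp parallel do`.
std::vector<int> squares(int N) {
  std::vector<int> A(static_cast<std::size_t>(N));
  // The loop variable is declared in the `for` statement, so every thread
  // gets its own copy; `A` is shared between threads. Iterations write
  // disjoint elements, so no synchronization is needed.
#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (int I = 1; I <= N; ++I)
    A[static_cast<std::size_t>(I - 1)] = I * I;
  return A;
}
```

Without OpenMP enabled the pragma is skipped and the loop runs serially with the same result, which is the property that makes such a mapping safe for independent iterations.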
---
 flang/docs/DoConcurrentConversionToOpenMP.md  |  47 
 .../OpenMP/DoConcurrentConversion.cpp | 211 +-
 .../Transforms/DoConcurrent/basic_host.f90|  14 +-
 .../Transforms/DoConcurrent/basic_host.mlir   |  62 +
 .../DoConcurrent/non_const_bounds.f90 |  45 
 .../DoConcurrent/not_perfectly_nested.f90 |  45 
 6 files changed, 405 insertions(+), 19 deletions(-)
 create mode 100644 flang/test/Transforms/DoConcurrent/basic_host.mlir
 create mode 100644 flang/test/Transforms/DoConcurrent/non_const_bounds.f90
 create mode 100644 flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90

diff --git a/flang/docs/DoConcurrentConversionToOpenMP.md 
b/flang/docs/DoConcurrentConversionToOpenMP.md
index 7b49af742f242..19611615ee9d6 100644
--- a/flang/docs/DoConcurrentConversionToOpenMP.md
+++ b/flang/docs/DoConcurrentConversionToOpenMP.md
@@ -126,6 +126,53 @@ see the "Data environment" section below.
 See `flang/test/Transforms/DoConcurrent/loop_nest_test.f90` for more examples
 of what is and is not detected as a perfect loop nest.
 
+### Single-range loops
+
+Given the following loop:
+```fortran
+  do concurrent(i=1:n)
+a(i) = i * i
+  end do
+```
+
+#### Mapping to `host`
+
+Mapping this loop to the `host` generates MLIR operations of the following
+structure:
+
+```
+%4 = fir.address_of(@_QFEa) ...
+%6:2 = hlfir.declare %4 ...
+
+omp.parallel {
+  // Allocate private copy for `i`.
+  // TODO Use delayed privatization.
+  %19 = fir.alloca i32 {bindc_name = "i"}
+  %20:2 = hlfir.declare %19 {uniq_name = "_QFEi"} ...
+
+  omp.wsloop {
+omp.loop_nest (%arg0) : index = (%21) to (%22) inclusive step (%c1_2) {
+  %23 = fir.convert %arg0 : (index) -> i32
+  // Use the privatized version of `i`.
+  fir.store %23 to %20#1 : !fir.ref<i32>
+  ...
+
+  // Use "shared" SSA value of `a`.
+  %42 = hlfir.designate %6#0
+  hlfir.assign %35 to %42
+  ...
+  omp.yield
+}
+omp.terminator
+  }
+  omp.terminator
+}
+```
+
+#### Mapping to `device`
+
+
+
 

[llvm-branch-commits] [llvm] [SDAG] Introduce inbounds flag for pointer arithmetic (PR #131862)

2025-04-05 Thread Fabian Ritter via llvm-branch-commits

ritter-x2a wrote:

> Maybe we could consider adding "ISD::PTRADD"? Lowers to ISD::ADD by default, 
> but targets that want to do weird things with pointer arithmetic could do 
> them.

 That would be helpful. We'd still need an inbounds flag for ISD::PTRADD, but 
it would certainly be easier to make use of. I'll look into that.

> One other concern, which applies to basically any formulation of this: Since 
> SelectionDAG doesn't have a distinct pointer type, you can't tell whether the 
> pointer operand was produced by an inttoptr. So in some cases, you have an 
> operation marked "inbounds", but it's ambiguous which object it's actually 
> inbounds to. This isn't really a problem at the moment because we do IR-level 
> transforms that remove inttoptr anyway, but if we ever do resolve the 
> IR-level issues, we should have some idea for how we propagate the fix to 
> SelectionDAG.

I can see that's a problem if you'd want to infer that an operation is 
inbounds, or if you'd want to prove the absence (or presence) of poison/UB. But 
how is that a problem for generating code? If there is an inbounds flag on a 
(hypothetical) ISD::PTRADD, we can assume that the operation is inbounds with 
respect to whatever the address operand is pointing to, no matter if it's the 
result of integer operations, right?



https://github.com/llvm/llvm-project/pull/131862


[llvm-branch-commits] [llvm] 7f18a2f - Revert "[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VP…"

2025-04-05 Thread via llvm-branch-commits

Author: Simon Pilgrim
Date: 2025-04-03T16:00:07+01:00
New Revision: 7f18a2fa9567050a245f3992963752a74cdff884

URL: 
https://github.com/llvm/llvm-project/commit/7f18a2fa9567050a245f3992963752a74cdff884
DIFF: 
https://github.com/llvm/llvm-project/commit/7f18a2fa9567050a245f3992963752a74cdff884.diff

LOG: Revert "[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of 
VP…"

This reverts commit bf516098fb7c7d428cae03296b92766467f76c9e.

Added: 


Modified: 
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll
llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast_from_memory.ll
llvm/test/CodeGen/X86/shuffle-vs-trunc-128.ll
llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-5.ll
llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-5.ll
llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-6.ll
llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll
llvm/test/CodeGen/X86/zero_extend_vector_inreg_of_broadcast.ll
llvm/test/CodeGen/X86/zero_extend_vector_inreg_of_broadcast_from_memory.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index d1be19539b642..546a2d22fa58e 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -43827,69 +43827,6 @@ bool 
X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode(
   }
   break;
 }
-case X86ISD::VPERMV: {
-  SmallVector Mask;
-  SmallVector Ops;
-  if ((VT.is256BitVector() || Subtarget.hasVLX()) &&
-  getTargetShuffleMask(Op, /*AllowSentinelZero=*/false, Ops, Mask)) {
-// For lane-crossing shuffles, only split in half in case we're still
-// referencing higher elements.
-unsigned HalfElts = NumElts / 2;
-unsigned HalfSize = SizeInBits / 2;
-Mask.resize(HalfElts);
-if (all_of(Mask,
-   [&](int M) { return isUndefOrInRange(M, 0, HalfElts); })) {
-  MVT HalfVT = VT.getSimpleVT().getHalfNumVectorElementsVT();
-  SDLoc DL(Op);
-  SDValue Ext;
-  SDValue M =
-  extractSubVector(Op.getOperand(0), 0, TLO.DAG, DL, HalfSize);
-  SDValue V =
-  extractSubVector(Op.getOperand(1), 0, TLO.DAG, DL, HalfSize);
-  // For 128-bit v2X64/v4X32 instructions, use VPERMILPD/VPERMILPS.
-  if (VT.is512BitVector() || VT.getScalarSizeInBits() <= 16)
-Ext = TLO.DAG.getNode(Opc, DL, HalfVT, M, V);
-  else
-Ext = TLO.DAG.getNode(X86ISD::VPERMILPV, DL, HalfVT, V, M);
-  SDValue Insert = widenSubVector(Ext, /*ZeroNewElements=*/false,
-  Subtarget, TLO.DAG, DL, SizeInBits);
-  return TLO.CombineTo(Op, Insert);
-}
-  }
-  break;
-}
-case X86ISD::VPERMV3: {
-  SmallVector Mask;
-  SmallVector Ops;
-  if (Subtarget.hasVLX() &&
-  getTargetShuffleMask(Op, /*AllowSentinelZero=*/false, Ops, Mask)) {
-// For lane-crossing shuffles, only split in half in case we're still
-// referencing higher elements.
-unsigned HalfElts = NumElts / 2;
-unsigned HalfSize = SizeInBits / 2;
-Mask.resize(HalfElts);
-if (all_of(Mask, [&](int M) {
-  return isUndefOrInRange(M, 0, HalfElts) ||
- isUndefOrInRange(M, NumElts, NumElts + HalfElts);
-})) {
-  // Adjust mask elements for 2nd operand to point to half width.
-  for (int &M : Mask)
-M = M <= NumElts ? M : (M - HalfElts);
-  MVT HalfVT = VT.getSimpleVT().getHalfNumVectorElementsVT();
-  MVT HalfIntVT = HalfVT.changeVectorElementTypeToInteger();
-  SDLoc DL(Op);
-  SDValue Ext = TLO.DAG.getNode(
-  Opc, DL, HalfVT,
-  extractSubVector(Op.getOperand(0), 0, TLO.DAG, DL, HalfSize),
-  getConstVector(Mask, HalfIntVT, TLO.DAG, DL, /*IsMask=*/true),
-  extractSubVector(Op.getOperand(2), 0, TLO.DAG, DL, HalfSize));
-  SDValue Insert = widenSubVector(Ext, /*ZeroNewElements=*/false,
-  Subtarget, TLO.DAG, DL, SizeInBits);
-  return TLO.CombineTo(Op, Insert);
-}
-  }
-  break;
-}
 case X86ISD::VPERM2X128: {
   // Simplify VPERM2F128/VPERM2I128 to extract_subvector.
   SDLoc DL(Op);

diff  --git a/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll 
b/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll
index b075d48627b18..6f4e7abda8b00 100644
--- a/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll
+++ b/llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll
@@ -749,10 +749,10 @@ define void 
@vec128_i16_widen_to_i32_factor2_broadcast_to_v4i

[llvm-branch-commits] [llvm] [CodeGen][StaticDataSplitter]Support constant pool partitioning (PR #129781)

2025-04-05 Thread Mingming Liu via llvm-branch-commits


@@ -2769,6 +2769,23 @@ namespace {
 
 } // end anonymous namespace
 
+StringRef AsmPrinter::getConstantSectionSuffix(const Constant *C) const {
+  SmallString<8> SectionNameSuffix;
+  if (TM.Options.EnableStaticDataPartitioning) {
+if (C && SDPI && PSI) {
+  auto Count = SDPI->getConstantProfileCount(C);
+  if (Count) {
+if (PSI->isHotCount(*Count)) {
+  SectionNameSuffix.append("hot");
+} else if (PSI->isColdCount(*Count) && !SDPI->hasUnknownCount(C)) {
+  SectionNameSuffix.append("unlikely");
+}
+  }
+}
+  }
+  return SectionNameSuffix.str();

mingmingl-llvm wrote:

thanks for the catch! done.
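
For context, the decision made in the quoted hunk can be sketched outside of LLVM as below. Everything here is an assumption for illustration: `ProfileThresholds` and `constantSectionSuffix` are invented names, and the plain threshold comparisons stand in for the `isHotCount` / `isColdCount` queries, whose real definitions are not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the profile summary thresholds.
struct ProfileThresholds {
  uint64_t HotMin;  // counts >= HotMin are treated as hot
  uint64_t ColdMax; // counts <= ColdMax are treated as cold
};

// Returns "hot", "unlikely", or "" for a constant with the given profile
// count. The returned views refer to string literals, so they stay valid
// after the function returns.
std::string_view constantSectionSuffix(std::optional<uint64_t> Count,
                                       bool HasUnknownCount,
                                       const ProfileThresholds &T) {
  if (!Count)
    return ""; // no profile information at all
  if (*Count >= T.HotMin)
    return "hot";
  // Only call a constant cold when none of its uses has an unknown count.
  if (*Count <= T.ColdMax && !HasUnknownCount)
    return "unlikely";
  return "";
}
```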

https://github.com/llvm/llvm-project/pull/129781


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: analyze functions without CFG information (PR #133461)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/133461

>From 43c034a7ccd057eb4e1c29daaa5f3ff882ae685a Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Wed, 19 Mar 2025 18:58:32 +0300
Subject: [PATCH 1/4] [BOLT] Gadget scanner: analyze functions without CFG
 information

Support simple analysis of the functions for which BOLT is unable to
reconstruct the CFG. This patch is inspired by the approach implemented
by Kristof Beyls in the original prototype of gadget scanner, but a
CFG-unaware counterpart of the data-flow analysis is implemented
instead of separate version of gadget detector, as multiple gadget kinds
are detected now.
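
As a rough illustration of the CFG-unaware idea (not the BOLT implementation), a linear scan can track which registers are trusted and drop that knowledge at every unconditional branch, since without a CFG nothing is known about how control reaches the next instruction. The `Instr` model, the `Kind` enum, and the choice to reset to "nothing trusted" are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy instruction model; none of this is the MC layer.
enum class Kind { Auth, Clobber, Return, Jump, Other };

struct Instr {
  Kind K;
  int Reg;        // register authenticated, clobbered, or returned through
  bool IsBranch;  // stand-in for a "branch" property
  bool IsBarrier; // stand-in for "control never falls through"
};

// Same shape as the check in the patch: a branch that is also a barrier.
bool isStateTrackingBoundary(const Instr &I) {
  return I.IsBranch && I.IsBarrier;
}

// Returns the indices of returns that go through an untrusted register.
std::vector<std::size_t> scanWithoutCFG(const std::vector<Instr> &Code) {
  std::set<int> Trusted;
  std::vector<std::size_t> Reports;
  for (std::size_t Idx = 0; Idx < Code.size(); ++Idx) {
    const Instr &I = Code[Idx];
    switch (I.K) {
    case Kind::Auth:
      Trusted.insert(I.Reg);
      break;
    case Kind::Clobber:
      Trusted.erase(I.Reg);
      break;
    case Kind::Return:
      if (!Trusted.count(I.Reg))
        Reports.push_back(Idx);
      break;
    case Kind::Jump:
    case Kind::Other:
      break;
    }
    // Assumption of this sketch: be conservative and forget everything
    // after an unconditional branch.
    if (isStateTrackingBoundary(I))
      Trusted.clear();
  }
  return Reports;
}
```

The conservative reset trades false positives for simplicity: code after an unconditional branch is scanned as if no register had been authenticated.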
---
 bolt/include/bolt/Core/BinaryFunction.h   |  13 +
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |  24 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 266 +---
 .../AArch64/gs-pacret-autiasp.s   |  15 +
 .../binary-analysis/AArch64/gs-pauth-calls.s  | 594 ++
 5 files changed, 835 insertions(+), 77 deletions(-)

diff --git a/bolt/include/bolt/Core/BinaryFunction.h 
b/bolt/include/bolt/Core/BinaryFunction.h
index d3d11f8c5fb73..5cb2cc95af695 100644
--- a/bolt/include/bolt/Core/BinaryFunction.h
+++ b/bolt/include/bolt/Core/BinaryFunction.h
@@ -799,6 +799,19 @@ class BinaryFunction {
 return iterator_range(cie_begin(), cie_end());
   }
 
+  /// Iterate over instructions (only if CFG is unavailable or not built yet).
+  iterator_range instrs() {
+assert(!hasCFG() && "Iterate over basic blocks instead");
+return make_range(Instructions.begin(), Instructions.end());
+  }
+  iterator_range instrs() const {
+assert(!hasCFG() && "Iterate over basic blocks instead");
+return make_range(Instructions.begin(), Instructions.end());
+  }
+
+  /// Returns whether there are any labels at Offset.
+  bool hasLabelAt(unsigned Offset) const { return Labels.count(Offset) != 0; }
+
   /// Iterate over all jump tables associated with this function.
   iterator_range::const_iterator>
   jumpTables() const {
diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h 
b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 622e6721dea55..aa44f8c565639 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -67,6 +67,14 @@ struct MCInstInBFReference {
   uint64_t Offset;
   MCInstInBFReference(BinaryFunction *BF, uint64_t Offset)
   : BF(BF), Offset(Offset) {}
+
+  static MCInstInBFReference get(const MCInst *Inst, BinaryFunction &BF) {
+for (auto &I : BF.instrs())
+  if (Inst == &I.second)
+return MCInstInBFReference(&BF, I.first);
+return {};
+  }
+
   MCInstInBFReference() : BF(nullptr), Offset(0) {}
   bool operator==(const MCInstInBFReference &RHS) const {
 return BF == RHS.BF && Offset == RHS.Offset;
@@ -106,6 +114,12 @@ struct MCInstReference {
   MCInstReference(BinaryFunction *BF, uint32_t Offset)
   : MCInstReference(MCInstInBFReference(BF, Offset)) {}
 
+  static MCInstReference get(const MCInst *Inst, BinaryFunction &BF) {
+if (BF.hasCFG())
+  return MCInstInBBReference::get(Inst, BF);
+return MCInstInBFReference::get(Inst, BF);
+  }
+
   bool operator<(const MCInstReference &RHS) const {
 if (ParentKind != RHS.ParentKind)
   return ParentKind < RHS.ParentKind;
@@ -140,6 +154,16 @@ struct MCInstReference {
 llvm_unreachable("");
   }
 
+  operator bool() const {
+switch (ParentKind) {
+case BasicBlockParent:
+  return U.BBRef.BB != nullptr;
+case FunctionParent:
+  return U.BFRef.BF != nullptr;
+}
+llvm_unreachable("");
+  }
+
   uint64_t getAddress() const {
 switch (ParentKind) {
 case BasicBlockParent:
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index df9e87bd4e999..f5d224675d749 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -124,6 +124,27 @@ class TrackedRegisters {
   }
 };
 
+// Without CFG, we reset gadget scanning state when encountering an
+// unconditional branch. Note that BC.MIB->isUnconditionalBranch neither
+// considers indirect branches nor annotated tail calls as unconditional.
+static bool isStateTrackingBoundary(const BinaryContext &BC,
+const MCInst &Inst) {
+  const MCInstrDesc &Desc = BC.MII->get(Inst.getOpcode());
+  // Adapted from llvm::MCInstrDesc::isUnconditionalBranch().
+  return Desc.isBranch() && Desc.isBarrier();
+}
+
+template <typename T> static void iterateOverInstrs(BinaryFunction &BF, T Fn) {
+  if (BF.hasCFG()) {
+for (BinaryBasicBlock &BB : BF)
+  for (int64_t I = 0, E = BB.size(); I < E; ++I)
+Fn(MCInstInBBReference(&BB, I));
+  } else {
+for (auto I : BF.instrs())
+  Fn(MCInstInBFReference(&BF, I.first));
+  }
+}
+
 // The security property that is checked is:
 // When a register is used as the address to jump to in a return instruction,
 // that register mus

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts and sext-inreg (PR #132385)

2025-04-05 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic created 
https://github.com/llvm/llvm-project/pull/132385

Uniform S16 shifts have to be extended to S32 using the appropriate Extend
before lowering to an S32 instruction.
Uniform packed V2S16 shifts are lowered to SGPR S32 instructions;
the other option is to use VALU packed V2S16 and ReadAnyLane.
For uniform S32 and S64 and divergent S16, S32, S64 and V2S16, there are
instructions available.
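
To make the "appropriate Extend" point concrete, here is a small standalone sketch in plain C++ (not GlobalISel code). The function names are made up for illustration, and masking the shift amount to 0..15 is an assumption added only to keep the C++ well defined.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Split a packed 2 x 16-bit value into zero-extended halves.
std::pair<uint32_t, uint32_t> unpackZExt(uint32_t Packed) {
  return {Packed & 0xFFFFu, Packed >> 16};
}

// Split a packed 2 x 16-bit value into sign-extended halves.
std::pair<int32_t, int32_t> unpackSExt(uint32_t Packed) {
  int32_t Lo = static_cast<int16_t>(Packed & 0xFFFFu);
  int32_t Hi = static_cast<int16_t>(Packed >> 16);
  return {Lo, Hi};
}

// Left shift: the high bits of the widened input do not matter, because
// only the low 16 bits of the result are kept (any-extend is enough).
uint16_t shl16(uint16_t X, unsigned Amt) {
  return static_cast<uint16_t>(static_cast<uint32_t>(X) << (Amt & 15));
}

// Logical right shift: the widened input must be zero-extended, otherwise
// garbage would be shifted into the low 16 bits.
uint16_t lshr16(uint16_t X, unsigned Amt) {
  return static_cast<uint16_t>(static_cast<uint32_t>(X) >> (Amt & 15));
}

// Arithmetic right shift: the widened input must be sign-extended so that
// copies of the sign bit are shifted in. (Right-shifting a negative value
// is arithmetic on mainstream compilers and guaranteed since C++20.)
int16_t ashr16(int16_t X, unsigned Amt) {
  int32_t Wide = X;
  return static_cast<int16_t>(Wide >> (Amt & 15));
}
```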

>From 9a0eaa14fddc00648a09f2880cc16207dfa4e1de Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Fri, 21 Mar 2025 13:12:11 +0100
Subject: [PATCH] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts
 and sext-inreg

Uniform S16 shifts have to be extended to S32 using the appropriate Extend
before lowering to an S32 instruction.
Uniform packed V2S16 shifts are lowered to SGPR S32 instructions;
the other option is to use VALU packed V2S16 and ReadAnyLane.
For uniform S32 and S64 and divergent S16, S32, S64 and V2S16, there are
instructions available.
---
 .../Target/AMDGPU/AMDGPURegBankLegalize.cpp   |   3 +-
 .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp| 104 ++
 .../AMDGPU/AMDGPURegBankLegalizeHelper.h  |   5 +
 .../AMDGPU/AMDGPURegBankLegalizeRules.cpp |  45 -
 .../AMDGPU/AMDGPURegBankLegalizeRules.h   |  11 ++
 llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll   |  10 +-
 llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll   | 187 +-
 .../AMDGPU/GlobalISel/regbankselect-ashr.mir  |   6 +-
 .../AMDGPU/GlobalISel/regbankselect-lshr.mir  |  17 +-
 .../GlobalISel/regbankselect-sext-inreg.mir   |  24 +--
 .../AMDGPU/GlobalISel/regbankselect-shl.mir   |   6 +-
 .../CodeGen/AMDGPU/GlobalISel/sext_inreg.ll   |  34 ++--
 llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll|  10 +-
 13 files changed, 311 insertions(+), 151 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
index 44f1b5419abb9..4fd776bec9492 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
@@ -23,6 +23,7 @@
 #include "GCNSubtarget.h"
 #include "llvm/CodeGen/GlobalISel/CSEInfo.h"
 #include "llvm/CodeGen/GlobalISel/CSEMIRBuilder.h"
+#include "llvm/CodeGen/GlobalISel/Utils.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
 #include "llvm/CodeGen/MachineUniformityAnalysis.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
@@ -306,7 +307,7 @@ bool 
AMDGPURegBankLegalize::runOnMachineFunction(MachineFunction &MF) {
 // Opcodes that support pretty much all combinations of reg banks and LLTs
 // (except S1). There is no point in writing rules for them.
 if (Opc == AMDGPU::G_BUILD_VECTOR || Opc == AMDGPU::G_UNMERGE_VALUES ||
-Opc == AMDGPU::G_MERGE_VALUES) {
+Opc == AMDGPU::G_MERGE_VALUES || Opc == G_BITCAST) {
   RBLHelper.applyMappingTrivial(*MI);
   continue;
 }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
index 0f5f3545ac8eb..59f16315bbd72 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
@@ -14,13 +14,16 @@
 #include "AMDGPURegBankLegalizeHelper.h"
 #include "AMDGPUGlobalISelUtils.h"
 #include "AMDGPUInstrInfo.h"
+#include "AMDGPURegBankLegalizeRules.h"
 #include "AMDGPURegisterBankInfo.h"
 #include "GCNSubtarget.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
 #include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h"
+#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h"
 #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
 #include "llvm/CodeGen/MachineUniformityAnalysis.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
+#include "llvm/Support/ErrorHandling.h"
 
 #define DEBUG_TYPE "amdgpu-regbanklegalize"
 
@@ -130,6 +133,28 @@ void RegBankLegalizeHelper::widenLoad(MachineInstr &MI, 
LLT WideTy,
   MI.eraseFromParent();
 }
 
+std::pair<Register, Register> RegBankLegalizeHelper::unpackZExt(Register Reg) {
+  auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg);
+  auto Mask = B.buildConstant(SgprRB_S32, 0x0000ffff);
+  auto Lo = B.buildAnd(SgprRB_S32, PackedS32, Mask);
+  auto Hi = B.buildLShr(SgprRB_S32, PackedS32, B.buildConstant(SgprRB_S32, 16));
+  return {Lo.getReg(0), Hi.getReg(0)};
+}
+
+std::pair<Register, Register> RegBankLegalizeHelper::unpackSExt(Register Reg) {
+  auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg);
+  auto Lo = B.buildSExtInReg(SgprRB_S32, PackedS32, 16);
+  auto Hi = B.buildAShr(SgprRB_S32, PackedS32, B.buildConstant(SgprRB_S32, 16));
+  return {Lo.getReg(0), Hi.getReg(0)};
+}
+
+std::pair<Register, Register> RegBankLegalizeHelper::unpackAExt(Register Reg) {
+  auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg);
+  auto Lo = PackedS32;
+  auto Hi = B.buildLShr(SgprRB_S32, PackedS32, B.buildConstant(SgprRB_S32, 16));
+  return {Lo.getReg(0), Hi.getReg(0)};
+}
+
 void RegBankLegalizeHelper::lower(MachineInstr &MI,
   const RegBankLLTMapping &Mapping,
   SmallSet &WaterfallSgp

[llvm-branch-commits] [llvm] InlineFunction: Split inlining into predicate and apply functions (PR #134213)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Matt Arsenault (arsenm)


Changes

This is to support a new inline function reduction in llvm-reduce,
which should pre-filter callsites that are not eligible for inlining.

This code was mostly structured as a match and apply, with a few
exceptions. The ugliest piece is for propagating and verifying compatible
getGC and personalities. Also collection of EHPad and the convergence token
to use are now cached in InlineFunctionInfo.

I was initially confused by the split between the checks performed here
and isInlineViable, so this better documents how the system is supposed to work.
It turns out this split does make sense, in that isInlineViable checks
if inlining is possible based on the callee content, while the ultimate inline
depends on the callsite context. I think more renames of these functions
would help, and isInlineViable should probably move out of InlineCost to be
with these transforms.
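
The predicate/apply split described above can be sketched with toy types as follows. None of these are the LLVM classes: `ToyFunction`, `ToyCallSite`, and `ToyInlineInfo` are invented for illustration, and the GC compatibility rule is a simplified assumption about what such a check might look like.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins; these are not the LLVM types.
struct ToyFunction {
  std::string GC;     // empty means "no GC strategy"
  int NumInlined = 0; // how many calls were inlined into this function
};

struct ToyCallSite {
  ToyFunction *Caller;
  const ToyFunction *Callee;
};

// State computed by the predicate and consumed by the apply step.
struct ToyInlineInfo {
  std::string GCToSet; // GC the caller should adopt, if any
  void reset() { GCToSet.clear(); }
};

struct ToyResult {
  bool Ok;
  const char *Reason;
};

// Predicate: decides legality for this call site and records what the
// apply step will need, without modifying the caller.
ToyResult canInlineCallSite(const ToyCallSite &CS, ToyInlineInfo &Info) {
  Info.reset();
  if (!CS.Callee->GC.empty()) {
    if (CS.Caller->GC.empty())
      Info.GCToSet = CS.Callee->GC;
    else if (CS.Caller->GC != CS.Callee->GC)
      return {false, "incompatible GC"};
  }
  return {true, ""};
}

// Apply: performs the mutation, assuming the predicate already succeeded.
void inlineImpl(const ToyCallSite &CS, const ToyInlineInfo &Info) {
  if (!Info.GCToSet.empty())
    CS.Caller->GC = Info.GCToSet;
  ++CS.Caller->NumInlined;
}

// Convenience wrapper that keeps the original all-in-one behavior.
ToyResult inlineFunction(const ToyCallSite &CS, ToyInlineInfo &Info) {
  ToyResult R = canInlineCallSite(CS, Info);
  if (R.Ok)
    inlineImpl(CS, Info);
  return R;
}
```

A reduction tool can call only the predicate to count eligible call sites up front, then apply the transformation to the chosen subset later.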

---
Full diff: https://github.com/llvm/llvm-project/pull/134213.diff


3 Files Affected:

- (modified) llvm/include/llvm/Analysis/InlineCost.h (+5-1) 
- (modified) llvm/include/llvm/Transforms/Utils/Cloning.h (+28) 
- (modified) llvm/lib/Transforms/Utils/InlineFunction.cpp (+81-42) 


``diff
diff --git a/llvm/include/llvm/Analysis/InlineCost.h 
b/llvm/include/llvm/Analysis/InlineCost.h
index 90ee75773957a..ec59d54954e16 100644
--- a/llvm/include/llvm/Analysis/InlineCost.h
+++ b/llvm/include/llvm/Analysis/InlineCost.h
@@ -334,7 +334,11 @@ std::optional getInliningCostFeatures(
 ProfileSummaryInfo *PSI = nullptr,
 OptimizationRemarkEmitter *ORE = nullptr);
 
-/// Minimal filter to detect invalid constructs for inlining.
+/// Check if it is mechanically possible to inline the function \p Callee, based
+/// on the contents of the function.
+///
+/// See also \p CanInlineCallSite as an additional precondition necessary to
+/// perform a valid inline in a particular use context.
 InlineResult isInlineViable(Function &Callee);
 
 // This pass is used to annotate instructions during the inline process for
diff --git a/llvm/include/llvm/Transforms/Utils/Cloning.h 
b/llvm/include/llvm/Transforms/Utils/Cloning.h
index ec1a1d5faa7e9..201e6ba2b491f 100644
--- a/llvm/include/llvm/Transforms/Utils/Cloning.h
+++ b/llvm/include/llvm/Transforms/Utils/Cloning.h
@@ -263,6 +263,9 @@ class InlineFunctionInfo {
   /// `InlinedCalls` above is used.
   SmallVector InlinedCallSites;
 
+  Value *ConvergenceControlToken = nullptr;
+  Instruction *CallSiteEHPad = nullptr;
+
   /// Update profile for callee as well as cloned version. We need to do this
   /// for regular inlining, but not for inlining from sample profile loader.
   bool UpdateProfile;
@@ -271,9 +274,34 @@ class InlineFunctionInfo {
 StaticAllocas.clear();
 InlinedCalls.clear();
 InlinedCallSites.clear();
+ConvergenceControlToken = nullptr;
+CallSiteEHPad = nullptr;
   }
 };
 
+/// Check if it is legal to perform inlining of the function called by \p CB
+/// into the caller at this particular use, and sets fields in \p IFI.
+///
+/// This does not consider whether it is possible for the function callee itself
+/// to be inlined; for that see isInlineViable.
+InlineResult CanInlineCallSite(const CallBase &CB, InlineFunctionInfo &IFI);
+
+/// This should generally not be used, use InlineFunction instead.
+///
+/// Perform mechanical inlining of \p CB into the caller.
+///
+/// This does not perform any legality or profitability checks for the
+/// inlining. This assumes that CanInlineCallSite was already called, populated
+/// \p IFI, and returned InlineResult::success.
+///
+/// Also assumes that isInlineViable returned InlineResult::success for the
+/// called function.
+void InlineFunctionImpl(CallBase &CB, InlineFunctionInfo &IFI,
+bool MergeAttributes = false,
+AAResults *CalleeAAR = nullptr,
+bool InsertLifetime = true,
+Function *ForwardVarArgsTo = nullptr);
+
 /// This function inlines the called function into the basic
 /// block of the caller.  This returns false if it is not possible to inline
 /// this call.  The program is still in a well defined state if this occurs
diff --git a/llvm/lib/Transforms/Utils/InlineFunction.cpp 
b/llvm/lib/Transforms/Utils/InlineFunction.cpp
index 131fbe654c11c..7236cc0131eb9 100644
--- a/llvm/lib/Transforms/Utils/InlineFunction.cpp
+++ b/llvm/lib/Transforms/Utils/InlineFunction.cpp
@@ -2446,19 +2446,8 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, 
InlineFunctionInfo &IFI,
   return Ret;
 }
 
-/// This function inlines the called function into the basic block of the
-/// caller. This returns false if it is not possible to inline this call.
-/// The program is still in a well defined state if this occurs though.
-///
-/// Note that this only does one level of inlining.  For example, if the
-/// instruction 'call B' is inlined, and 'B' calls 'C', 

[llvm-branch-commits] [llvm] llvm-reduce: Change function return types if function is not called (PR #134035)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-ir

Author: Matt Arsenault (arsenm)


Changes

Extend the early return on value reduction to mutate the function return
type if the function has no call uses. This could be generalized to rewrite
cases where all callsites are used, but it turns out that complicates the
visitation order given we try to compute all opportunities up front.

This is enough to clean up the common case where we end up with one
function with a return of an uninteresting constant.

---
Full diff: https://github.com/llvm/llvm-project/pull/134035.diff


2 Files Affected:

- (added) 
llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll (+95) 
- (modified) llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp (+4-3) 


``diff
diff --git 
a/llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll 
b/llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll
new file mode 100644
index 0000000000000..9ddbbe3def44f
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/reduce-values-to-return-new-return-type.ll
@@ -0,0 +1,95 @@
+; Test that llvm-reduce can move intermediate values by inserting
+; early returns when the function already has a different return type
+;
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=instructions-to-return --test FileCheck --test-arg --check-prefix=INTERESTING --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck --check-prefix=RESULT %s < %t
+
+
+@gv = global i32 0, align 4
+@ptr_array = global [2 x ptr] [ptr 
@inst_to_return_has_different_type_but_no_func_call_use,
+   ptr @multiple_callsites_wrong_return_type]
+
+; Should rewrite this return from i64 to i32 since the function has no
+; uses.
+; INTERESTING-LABEL: @inst_to_return_has_different_type_but_no_func_call_use(
+; RESULT-LABEL: define i32 
@inst_to_return_has_different_type_but_no_func_call_use(ptr %arg) {
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define i64 @inst_to_return_has_different_type_but_no_func_call_use(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret i64 0
+}
+
+; INTERESTING-LABEL: @callsite_different_type_unused_0(
+; RESULT-LABEL: define i64 
@inst_to_return_has_different_type_but_call_result_unused(
+; RESULT-NEXT: %load = load i32, ptr %arg
+; RESULT-NEXT: store i32 %load, ptr @gv
+; RESULT-NEXT: ret i64 0
+define void @callsite_different_type_unused_0(ptr %arg) {
+  %unused0 = call i64 
@inst_to_return_has_different_type_but_call_result_unused(ptr %arg)
+  %unused1 = call i64 
@inst_to_return_has_different_type_but_call_result_unused(ptr null)
+  ret void
+}
+
+; TODO: Could rewrite this return from i64 to i32 since the callsite is unused.
+; INTERESTING-LABEL: @inst_to_return_has_different_type_but_call_result_unused(
+; RESULT-LABEL: define i64 
@inst_to_return_has_different_type_but_call_result_unused(
+; RESULT: ret i64 0
+define i64 @inst_to_return_has_different_type_but_call_result_unused(ptr %arg) 
{
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret i64 0
+}
+
+; INTERESTING-LABEL: @multiple_callsites_wrong_return_type(
+; RESULT-LABEL: define i64 @multiple_callsites_wrong_return_type(
+; RESULT: ret i64 0
+define i64 @multiple_callsites_wrong_return_type(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret i64 0
+}
+
+; INTERESTING-LABEL: @unused_with_wrong_return_types(
+; RESULT-LABEL: define i64 @unused_with_wrong_return_types(
+; RESULT-NEXT: %unused0 = call i64 @multiple_callsites_wrong_return_type(ptr 
%arg)
+; RESULT-NEXT: ret i64 %unused0
+define void @unused_with_wrong_return_types(ptr %arg) {
+  %unused0 = call i64 @multiple_callsites_wrong_return_type(ptr %arg)
+  %unused1 = call i32 @multiple_callsites_wrong_return_type(ptr %arg)
+  %unused2 = call ptr @multiple_callsites_wrong_return_type(ptr %arg)
+  ret void
+}
+
+; INTERESTING-LABEL: @multiple_returns_wrong_return_type(
+; INTERESTING: %load0 = load i32,
+
+; RESULT-LABEL: define i32 @multiple_returns_wrong_return_type(
+; RESULT: ret i32
+; RESULT: ret i32
+; RESULT: ret i32
+define i32 @multiple_returns_wrong_return_type(ptr %arg, i1 %cond, i32 %arg2) {
+entry:
+  br i1 %cond, label %bb0, label %bb1
+
+bb0:
+  %load0 = load i32, ptr %arg
+  store i32 %load0, ptr @gv
+  ret i32 234
+
+bb1:
+  ret i32 %arg2
+
+bb2:
+  ret i32 34
+}
+
+; INTERESTING-LABEL: @call_multiple_returns_wrong_return_type(
+; RESULT-LABEL: define <2 x i32> @call_multiple_returns_wrong_return_type(
+; RESULT-NEXT: %unused = call <2 x i32> @multiple_returns_wrong_return_type(
+; RESULT-NEXT: ret <2 x i32> %unused
+define void @call_multiple_returns_wrong_return_type(ptr %arg, i1 %cond, i32 
%arg2) {
+  %unused = call <2 x i32> @multiple_returns_wrong_return_type(ptr %arg, i1 
%cond, i32 %arg2)
+  ret void
+}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp 
b/llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp
ind

[llvm-branch-commits] [flang] [flang][OpenMP] Handle "loop-local values" in `do concurrent` nests (PR #127635)

2025-04-05 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/127635

>From 6321731e6e1cf412ed002571b9140d56ac5b76c6 Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Tue, 18 Feb 2025 06:40:19 -0600
Subject: [PATCH] [flang][OpenMP] Handle "loop-local values" in `do concurrent`
 nests

Extends `do concurrent` mapping to handle "loop-local values". A loop-local
value is one that is used exclusively inside the loop but allocated outside
of it. This usually corresponds to temporary values that are used inside the
loop body for initializing other variables, for example. After collecting these
values, the pass localizes them to the loop nest by moving their allocations.
---
 flang/docs/DoConcurrentConversionToOpenMP.md  | 51 ++
 .../OpenMP/DoConcurrentConversion.cpp | 68 ++-
 .../DoConcurrent/locally_destroyed_temp.f90   | 62 +
 3 files changed, 180 insertions(+), 1 deletion(-)
 create mode 100644 
flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90

diff --git a/flang/docs/DoConcurrentConversionToOpenMP.md 
b/flang/docs/DoConcurrentConversionToOpenMP.md
index ecb4428d7d3ba..76c54f5bbf587 100644
--- a/flang/docs/DoConcurrentConversionToOpenMP.md
+++ b/flang/docs/DoConcurrentConversionToOpenMP.md
@@ -202,6 +202,57 @@ variables: `i` and `j`. These are locally allocated inside 
the parallel/target
 OpenMP region similar to what the single-range example in previous section
 shows.
 
+### Data environment
+
+By default, variables that are used inside a `do concurrent` loop nest are
+either treated as `shared` in case of mapping to `host`, or mapped into the
+`target` region using a `map` clause in case of mapping to `device`. The only
+exceptions to this are:
+  1. the loop's iteration variable(s) (IV) of **perfect** loop nests. In that
+ case, for each IV, we allocate a local copy as shown by the mapping
+ examples above.
+  1. any values that are from allocations outside the loop nest and used
+ exclusively inside of it. In such cases, a local privatized
+ copy is created in the OpenMP region to prevent multiple teams of threads
+ from accessing and destroying the same memory block, which causes runtime
+ issues. For an example of such cases, see
+ `flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90`.
+
+Implicit mapping detection (for mapping to the target device) is still quite
+limited and work to make it smarter is underway for both OpenMP in general 
+and `do concurrent` mapping.
+
+#### Non-perfectly-nested loops' IVs
+
+For non-perfectly-nested loops, the IVs are still treated as `shared` or
+`map` entries as pointed out above. This **might not** be consistent with what
+the Fortran specification tells us. In particular, taking the following
+snippets from the spec (version 2023) into account:
+
+> § 3.35
+> --
+> construct entity
+> entity whose identifier has the scope of a construct
+
+> § 19.4
+> --
+>  A variable that appears as an index-name in a FORALL or DO CONCURRENT
+>  construct [...] is a construct entity. A variable that has LOCAL or
+>  LOCAL_INIT locality in a DO CONCURRENT construct is a construct entity.
+> [...]
+> The name of a variable that appears as an index-name in a DO CONCURRENT
+> construct, FORALL statement, or FORALL construct has a scope of the statement
+> or construct. A variable that has LOCAL or LOCAL_INIT locality in a DO
+> CONCURRENT construct has the scope of that construct.
+
+From the above quotes, it seems there is an equivalence between the IV of a `do
+concurrent` loop and a variable with a `LOCAL` locality specifier (equivalent
+to OpenMP's `private` clause). Which means that we should probably
+localize/privatize a `do concurrent` loop's IV even if it is not perfectly
+nested in the nest we are parallelizing. For now, however, we **do not** do
+that as pointed out previously. In the near future, we propose a middle-ground
+solution (see the Next steps section for more details).
+
 

[llvm-branch-commits] [compiler-rt] [compiler-rt][Darwin][x86] Fix instrprof-darwin-exports test (#131425) (PR #132500)

2025-04-05 Thread John Hui via llvm-branch-commits

https://github.com/j-hui created 
https://github.com/llvm/llvm-project/pull/132500

ld64 issues a warning about section alignment which was counted as an 
unexpected exported symbol and the test failed.

Fixed by disabling all linker warnings using -Wl,-w.

cherry-picked from commit 94426df66a8d7c2321f9e197e5ef9636b0d5ce70

>From b03be06b732890f7e9fb445d9d71aec33408ea90 Mon Sep 17 00:00:00 2001
From: David Tellenbach 
Date: Mon, 17 Mar 2025 17:23:58 -0700
Subject: [PATCH] [compiler-rt][Darwin][x86] Fix instrprof-darwin-exports test
 (#131425)

ld64 issues a warning about section alignment which was counted as an
unexpected exported symbol and the test failed.

Fixed by disabling all linker warnings using -Wl,-w.
---
 compiler-rt/test/profile/instrprof-darwin-exports.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/compiler-rt/test/profile/instrprof-darwin-exports.c 
b/compiler-rt/test/profile/instrprof-darwin-exports.c
index 079d5d28ed24d..1a2ac8c813272 100644
--- a/compiler-rt/test/profile/instrprof-darwin-exports.c
+++ b/compiler-rt/test/profile/instrprof-darwin-exports.c
@@ -7,13 +7,13 @@
 // just "_main" produces no warnings or errors.
 //
 // RUN: echo "_main" > %t.exports
-// RUN: %clang_pgogen -Werror -Wl,-exported_symbols_list,%t.exports -o %t %s 2>&1 | tee %t.log
-// RUN: %clang_profgen -Werror -fcoverage-mapping -Wl,-exported_symbols_list,%t.exports -o %t %s 2>&1 | tee -a %t.log
+// RUN: %clang_pgogen -Werror -Wl,-exported_symbols_list,%t.exports -Wl,-w -o %t %s 2>&1 | tee %t.log
+// RUN: %clang_profgen -Werror -fcoverage-mapping -Wl,-exported_symbols_list,%t.exports -Wl,-w -o %t %s 2>&1 | tee -a %t.log
 // RUN: cat %t.log | count 0
 
 // 2) Ditto (1), but for GCOV.
 //
-// RUN: %clang -Werror -Wl,-exported_symbols_list,%t.exports --coverage -o %t.gcov %s | tee -a %t.gcov.log
+// RUN: %clang -Werror -Wl,-exported_symbols_list,%t.exports -Wl,-w --coverage -o %t.gcov %s | tee -a %t.gcov.log
 // RUN: cat %t.gcov.log | count 0
 
 // 3) The default set of weak external symbols should match the set of symbols



[llvm-branch-commits] [llvm] llvm-reduce: Fix losing call metadata in operands-to-args (PR #133422)

2025-04-05 Thread Shilei Tian via llvm-branch-commits

https://github.com/shiltian approved this pull request.


https://github.com/llvm/llvm-project/pull/133422


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/131899

>From 317a2d79f2b810be89f11fcf7afaa6f92c245e61 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 18 Mar 2025 21:32:11 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect non-protected indirect
 calls

---
 bolt/include/bolt/Core/MCPlusBuilder.h|  10 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  33 +-
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  42 ++
 .../binary-analysis/AArch64/gs-pauth-calls.s  | 676 ++
 4 files changed, 757 insertions(+), 4 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-calls.s

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h 
b/bolt/include/bolt/Core/MCPlusBuilder.h
index 76ea2489e7038..b3d54ccd5955d 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -577,6 +577,16 @@ class MCPlusBuilder {
 return getNoRegister();
   }
 
+  /// Returns the register used as call destination, or no-register, if not
+  /// an indirect call. Sets IsAuthenticatedInternally if the instruction
+  /// accepts signed pointer as its operand and authenticates it internally.
+  virtual MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+   bool &IsAuthenticatedInternally) const {
+llvm_unreachable("not implemented");
+return getNoRegister();
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 93a452b224233..5b3bfb487d33b 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -382,11 +382,11 @@ class PacRetAnalysis
 
 public:
   std::vector
-  getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF,
- const ArrayRef UsedDirtyRegs) const {
+  getLastClobberingInsts(const MCInst &Inst, BinaryFunction &BF,
+ const ArrayRef UsedDirtyRegs) {
 if (RegsToTrackInstsFor.empty())
   return {};
-auto MaybeState = getStateAt(Ret);
+auto MaybeState = getStateBefore(Inst);
 if (!MaybeState)
   llvm_unreachable("Expected State to be present");
 const State &S = *MaybeState;
@@ -434,6 +434,29 @@ static std::shared_ptr tryCheckReturn(const 
BinaryContext &BC,
   return std::make_shared(RetKind, Inst, RetReg);
 }
 
+static std::shared_ptr tryCheckCall(const BinaryContext &BC,
+const MCInstReference &Inst,
+const State &S) {
+  static const GadgetKind CallKind("non-protected call found");
+  if (!BC.MIB->isCall(Inst) && !BC.MIB->isBranch(Inst))
+return nullptr;
+
+  bool IsAuthenticated = false;
+  MCPhysReg DestReg = BC.MIB->getRegUsedAsCallDest(Inst, IsAuthenticated);
+  if (IsAuthenticated || DestReg == BC.MIB->getNoRegister())
+return nullptr;
+
+  LLVM_DEBUG({
+traceInst(BC, "Found call inst", Inst);
+traceReg(BC, "Call destination reg", DestReg);
+traceRegMask(BC, "SafeToDerefRegs", S.SafeToDerefRegs);
+  });
+  if (S.SafeToDerefRegs[DestReg])
+return nullptr;
+
+  return std::make_shared(CallKind, Inst, DestReg);
+}
+
 FunctionAnalysisResult
 Analysis::computeDfState(BinaryFunction &BF,
  MCPlusBuilder::AllocatorIdTy AllocatorId) {
@@ -450,10 +473,12 @@ Analysis::computeDfState(BinaryFunction &BF,
   for (BinaryBasicBlock &BB : BF) {
 for (int64_t I = 0, E = BB.size(); I < E; ++I) {
   MCInstReference Inst(&BB, I);
-  const State &S = *PRA.getStateAt(Inst);
+  const State &S = *PRA.getStateBefore(Inst);
 
   if (auto Report = tryCheckReturn(BC, Inst, S))
 Result.Diagnostics.push_back(Report);
+  if (auto Report = tryCheckCall(BC, Inst, S))
+Result.Diagnostics.push_back(Report);
 }
   }
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp 
b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index d238a1df5c7d7..9ce1514639f95 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -277,6 +277,48 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+   bool &IsAuthenticatedInternally) const override {
+assert(isCall(Inst) || isBranch(Inst));
+IsAuthenticatedInternally = false;
+
+switch (Inst.getOpcode()) {
+case AArch64::B:
+case AArch64::BL:
+  assert(Inst.getOperand(0).isExpr());
+  return getNoRegister();
+case AArch64::Bcc:
+case AArch64::CBNZW:
+case AArch64::CBNZX:
+case AArch64::CBZW:
+case AArch64::CBZX:
+  assert(Inst.getOperand(1).isExpr());
+  return getNoRegister();
+case AArch64::TBNZW:
+case AArch64::TBNZX:
+case AArch64::TBZW:
+case AArch64::TBZX:
+  assert(Ins

[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Add infastructure to parse parameters (PR #133800)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Finn Plummer (inbelic)


Changes

- defines `ParamType` as a way to represent a reference to some
parameter in a root signature

- defines `ParseParam` and `ParseParams` as infrastructure to define
how the parameters of a given struct should be parsed in an orderless
manner

- implements parsing of two param types: `UInt32` and `Register` to
demonstrate the parsing implementation and allow for unit testing

Part two of implementing: https://github.com/llvm/llvm-project/issues/126569
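
The orderless parsing described in the list above can be sketched outside of Clang. This is only an illustration, assuming parameters arrive as already-lexed (name, value) pairs; `ParseStatus`, `parseParamsSketch` and the string keys are hypothetical and do not appear in the patch, which works on `TokenKind` and `ParamType` instead.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names are from the patch.
enum class ParseStatus { Ok, UnknownParam, RepeatedParam, MissingParam };

// Accept (name, value) pairs in any order. Each name may appear at most once,
// and every name listed in Mandatory must appear.
inline ParseStatus parseParamsSketch(
    const std::vector<std::pair<std::string, uint32_t>> &Tokens,
    const std::map<std::string, uint32_t *> &Params,
    const std::set<std::string> &Mandatory) {
  std::set<std::string> Seen;
  for (const auto &Token : Tokens) {
    auto It = Params.find(Token.first);
    if (It == Params.end())
      return ParseStatus::UnknownParam;
    // insert() reports whether the name was new; a second sighting is an error.
    if (!Seen.insert(Token.first).second)
      return ParseStatus::RepeatedParam;
    *It->second = Token.second; // update the referenced parameter
  }
  for (const std::string &Name : Mandatory)
    if (Seen.count(Name) == 0)
      return ParseStatus::MissingParam;
  return ParseStatus::Ok;
}
```

The caller only declares the name-to-destination map and the mandatory set. The two error paths loosely mirror what the new `err_hlsl_rootsig_repeat_param` and `err_hlsl_rootsig_missing_param` diagnostics report.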

---
Full diff: https://github.com/llvm/llvm-project/pull/133800.diff


5 Files Affected:

- (modified) clang/include/clang/Basic/DiagnosticParseKinds.td (+4-1) 
- (modified) clang/include/clang/Parse/ParseHLSLRootSignature.h (+40) 
- (modified) clang/lib/Parse/ParseHLSLRootSignature.cpp (+151-14) 
- (modified) clang/unittests/Parse/ParseHLSLRootSignatureTest.cpp (+142-4) 
- (modified) llvm/include/llvm/Frontend/HLSL/HLSLRootSignature.h (+15) 


``diff
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td 
b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 2582e1e5ef0f6..ab12159ba5ae1 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1830,8 +1830,11 @@ def err_hlsl_virtual_function
 def err_hlsl_virtual_inheritance
 : Error<"virtual inheritance is unsupported in HLSL">;
 
-// HLSL Root Siganture diagnostic messages
+// HLSL Root Signature Parser Diagnostics
 def err_hlsl_unexpected_end_of_params
 : Error<"expected %0 to denote end of parameters, or, another valid parameter of %1">;
+def err_hlsl_rootsig_repeat_param : Error<"specified the same parameter '%0' multiple times">;
+def err_hlsl_rootsig_missing_param : Error<"did not specify mandatory parameter '%0'">;
+def err_hlsl_number_literal_overflow : Error<"integer literal is too large to be represented as a 32-bit %select{signed |}0 integer type">;
 
 } // end of Parser diagnostics
diff --git a/clang/include/clang/Parse/ParseHLSLRootSignature.h 
b/clang/include/clang/Parse/ParseHLSLRootSignature.h
index 43b41315b88b5..02e99e83875db 100644
--- a/clang/include/clang/Parse/ParseHLSLRootSignature.h
+++ b/clang/include/clang/Parse/ParseHLSLRootSignature.h
@@ -69,6 +69,46 @@ class RootSignatureParser {
   bool parseDescriptorTable();
   bool parseDescriptorTableClause();
 
+  /// Each unique ParamType will have a custom parse method defined that we can
+  /// use to invoke the parameters.
+  ///
+  /// This function will switch on the ParamType using std::visit and dispatch
+  /// onto the corresponding parse method
+  bool parseParam(llvm::hlsl::rootsig::ParamType Ref);
+
+  /// Parameter arguments (eg. `bReg`, `space`, ...) can be specified in any
+  /// order, exactly once, and only a subset are mandatory. This function acts
+  /// as the infastructure to do so in a declarative way.
+  ///
+  /// For the example:
+  ///  SmallDenseMap Params = {
+  ///TokenKind::bReg, &Clause.Register,
+  ///TokenKind::kw_space, &Clause.Space
+  ///  };
+  ///  SmallDenseSet Mandatory = {
+  ///TokenKind::kw_numDescriptors
+  ///  };
+  ///
+  /// We can read it is as:
+  ///
+  /// when 'b0' is encountered, invoke the parse method for the type
+  ///   of &Clause.Register (Register *) and update the parameter
+  /// when 'space' is encountered, invoke a parse method for the type
+  ///   of &Clause.Space (uint32_t *) and update the parameter
+  ///
+  /// and 'bReg' must be specified
+  bool parseParams(
+  llvm::SmallDenseMap &Params,
+  llvm::SmallDenseSet &Mandatory);
+
+  /// Parameter parse methods corresponding to a ParamType
+  bool parseUIntParam(uint32_t *X);
+  bool parseRegister(llvm::hlsl::rootsig::Register *Reg);
+
+  /// Use NumericLiteralParser to convert CurToken.NumSpelling into a unsigned
+  /// 32-bit integer
+  bool handleUIntLiteral(uint32_t *X);
+
   /// Invoke the Lexer to consume a token and update CurToken with the result
   void consumeNextToken() { CurToken = Lexer.ConsumeToken(); }
 
diff --git a/clang/lib/Parse/ParseHLSLRootSignature.cpp 
b/clang/lib/Parse/ParseHLSLRootSignature.cpp
index 33caca5fa1c82..62d29baea49d3 100644
--- a/clang/lib/Parse/ParseHLSLRootSignature.cpp
+++ b/clang/lib/Parse/ParseHLSLRootSignature.cpp
@@ -8,6 +8,8 @@
 
 #include "clang/Parse/ParseHLSLRootSignature.h"
 
+#include "clang/Lex/LiteralSupport.h"
+
 #include "llvm/Support/raw_ostream.h"
 
 using namespace llvm::hlsl::rootsig;
@@ -39,12 +41,11 @@ bool RootSignatureParser::parse() {
   break;
   }
 
-  if (!tryConsumeExpectedToken(TokenKind::end_of_stream)) {
-getDiags().Report(CurToken.TokLoc, diag::err_hlsl_unexpected_end_of_params)
-<< /*expected=*/TokenKind::end_of_stream
-<< /*param of=*/TokenKind::kw_RootSignature;
+  if (consumeExpectedToken(TokenKind::end_of_stream,
+   diag::err_hlsl_unexpected_end_of_params,
+   /*param of=

[llvm-branch-commits] [llvm] [DAG][AArch64] Handle truncated buildvectors to allow and(subvector(anyext)) fold. (PR #133915)

2025-04-05 Thread David Green via llvm-branch-commits

https://github.com/davemgreen updated 
https://github.com/llvm/llvm-project/pull/133915

>From 35f44f31a41e485c7098a66bff99c4dfc424bb8d Mon Sep 17 00:00:00 2001
From: David Green 
Date: Tue, 1 Apr 2025 15:15:08 +0100
Subject: [PATCH 1/2] [DAG][AArch64] Handle truncated buildvectors to allow
 and(subvector(anyext)) fold.

This fold was not handling the extended BUILDVECTORs that we see when i8/i16
are not legal types. Using isConstOrConstSplat(N1, false, true) allows it to
match truncated constants. The other changes are to make sure that truncated
values in N1C are treated correctly, the fold we are mostly interested in is
```
  if (N0.getOpcode() == ISD::EXTRACT_SUBVECTOR && N0.hasOneUse() && N1C &&
  ISD::isExtOpcode(N0.getOperand(0).getOpcode())) {
```
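
As a rough illustration of why a truncated constant needs care, here is a standalone sketch using plain integers instead of `APInt`. It assumes, as the description suggests, that the matched constant can be stored wider than the vector element type; the helper names and the example widths are hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zero bits of Value when viewed as a Width-bit integer
// (1 <= Width <= 64). Bits at position Width and above are ignored.
constexpr unsigned countLeadingZerosAtWidth(uint64_t Value, unsigned Width) {
  unsigned Count = 0;
  for (unsigned Bit = Width; Bit-- > 0;) {
    if ((Value >> Bit) & 1u)
      break;
    ++Count;
  }
  return Count;
}

// Plain-integer analogue of the check
//   countLeadingZeros(c) >= BitWidth - SrcBitWidth
// where CountWidth is the width at which the constant is inspected.
constexpr bool maskClearsExtendedBits(uint64_t Mask, unsigned CountWidth,
                                      unsigned BitWidth, unsigned SrcBitWidth) {
  return countLeadingZerosAtWidth(Mask, CountWidth) >= BitWidth - SrcBitWidth;
}

// A 16-bit element extended from 8 bits, with the mask 0x0FFF held in a
// 32-bit constant (hypothetical numbers).
static_assert(countLeadingZerosAtWidth(0x0FFF, 16) == 4, "inspected at 16 bits");
static_assert(countLeadingZerosAtWidth(0x0FFF, 32) == 20, "inspected at 32 bits");
// Inspected at the element width, the mask keeps extended bits: no fold.
static_assert(!maskClearsExtendedBits(0x0FFF, 16, 16, 8), "must not fold");
// Inspected at the stored width, the extra zeros would wrongly pass the check.
static_assert(maskClearsExtendedBits(0x0FFF, 32, 16, 8), "misleading result");
```

Counting at the element width is what truncating the value to `VT.getScalarSizeInBits()` achieves in the patch, under the assumption stated above.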
---
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |  6 +-
 .../aarch64-neon-vector-insert-uaddlv.ll  | 12 +--
 llvm/test/CodeGen/AArch64/bitcast-extend.ll   |  4 +-
 llvm/test/CodeGen/AArch64/ctlz.ll |  3 +-
 llvm/test/CodeGen/AArch64/ctpop.ll|  3 +-
 llvm/test/CodeGen/AArch64/itofp.ll| 90 +++
 .../AArch64/vec3-loads-ext-trunc-stores.ll| 23 ++---
 llvm/test/CodeGen/AArch64/vector-fcvt.ll  | 36 +++-
 8 files changed, 63 insertions(+), 114 deletions(-)

diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp 
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index dc5c5f38e3bd8..4bb52e9075297 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -7166,7 +7166,8 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
 
   // if (and x, c) is known to be zero, return 0
   unsigned BitWidth = VT.getScalarSizeInBits();
-  ConstantSDNode *N1C = isConstOrConstSplat(N1);
+  ConstantSDNode *N1C =
+  isConstOrConstSplat(N1, /*AllowUndef*/ false, /*AllowTrunc*/ true);
   if (N1C && DAG.MaskedValueIsZero(SDValue(N, 0), APInt::getAllOnes(BitWidth)))
 return DAG.getConstant(0, DL, VT);
 
@@ -7205,7 +7206,8 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
   return DAG.getNode(ISD::ZERO_EXTEND, DL, VT, N0Op0);
 
 // fold (and (any_ext V), c) -> (zero_ext (and (trunc V), c)) if profitable.
-if (N1C->getAPIntValue().countLeadingZeros() >= (BitWidth - SrcBitWidth) &&
+APInt N1APInt = N1C->getAPIntValue().trunc(VT.getScalarSizeInBits());
+if (N1APInt.countLeadingZeros() >= (BitWidth - SrcBitWidth) &&
 TLI.isTruncateFree(VT, SrcVT) && TLI.isZExtFree(SrcVT, VT) &&
 TLI.isTypeDesirableForOp(ISD::AND, SrcVT) &&
 TLI.isNarrowingProfitable(N, VT, SrcVT))
diff --git a/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll 
b/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll
index 412f39f8adc1b..f37767291ca14 100644
--- a/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll
+++ b/llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll
@@ -282,8 +282,7 @@ define void @insert_vec_v16i8_uaddlv_from_v8i8(ptr %0) {
 ; CHECK-NEXT:uaddlv.8b h1, v0
 ; CHECK-NEXT:stp q0, q0, [x0, #32]
 ; CHECK-NEXT:mov.b v2[0], v1[0]
-; CHECK-NEXT:zip1.8b v2, v2, v2
-; CHECK-NEXT:bic.4h v2, #255, lsl #8
+; CHECK-NEXT:ushll.8h v2, v2, #0
 ; CHECK-NEXT:ushll.4s v2, v2, #0
 ; CHECK-NEXT:ucvtf.4s v2, v2
 ; CHECK-NEXT:stp q2, q0, [x0]
@@ -305,8 +304,7 @@ define void @insert_vec_v8i8_uaddlv_from_v8i8(ptr %0) {
 ; CHECK-NEXT:stp xzr, xzr, [x0, #16]
 ; CHECK-NEXT:uaddlv.8b h1, v0
 ; CHECK-NEXT:mov.b v0[0], v1[0]
-; CHECK-NEXT:zip1.8b v0, v0, v0
-; CHECK-NEXT:bic.4h v0, #255, lsl #8
+; CHECK-NEXT:ushll.8h v0, v0, #0
 ; CHECK-NEXT:ushll.4s v0, v0, #0
 ; CHECK-NEXT:ucvtf.4s v0, v0
 ; CHECK-NEXT:str q0, [x0]
@@ -436,8 +434,7 @@ define void @insert_vec_v8i8_uaddlv_from_v4i32(ptr %0) {
 ; CHECK-NEXT:stp xzr, xzr, [x0, #16]
 ; CHECK-NEXT:uaddlv.4s d0, v0
 ; CHECK-NEXT:mov.b v1[0], v0[0]
-; CHECK-NEXT:zip1.8b v1, v1, v1
-; CHECK-NEXT:bic.4h v1, #255, lsl #8
+; CHECK-NEXT:ushll.8h v1, v1, #0
 ; CHECK-NEXT:ushll.4s v1, v1, #0
 ; CHECK-NEXT:ucvtf.4s v1, v1
 ; CHECK-NEXT:str q1, [x0]
@@ -461,8 +458,7 @@ define void @insert_vec_v16i8_uaddlv_from_v4i32(ptr %0) {
 ; CHECK-NEXT:uaddlv.4s d0, v0
 ; CHECK-NEXT:stp q2, q2, [x0, #32]
 ; CHECK-NEXT:mov.b v1[0], v0[0]
-; CHECK-NEXT:zip1.8b v1, v1, v1
-; CHECK-NEXT:bic.4h v1, #255, lsl #8
+; CHECK-NEXT:ushll.8h v1, v1, #0
 ; CHECK-NEXT:ushll.4s v1, v1, #0
 ; CHECK-NEXT:ucvtf.4s v1, v1
 ; CHECK-NEXT:stp q1, q2, [x0]
diff --git a/llvm/test/CodeGen/AArch64/bitcast-extend.ll 
b/llvm/test/CodeGen/AArch64/bitcast-extend.ll
index 5dc335900a798..08a7493d0ba7f 100644
--- a/llvm/test/CodeGen/AArch64/bitcast-extend.ll
+++ b/llvm/test/CodeGen/AArch64/bitcast-extend.ll
@@ -6,8 +6,8 @@ define <4 x i16> @z_i32_v4i16(i32 %x) {
 ; CHECK-SD-LABEL: z_i32_v4i16:
 ; CHECK-SD:   // %bb.0:
 ; CHECK-SD-NEXT:fmov s0, w0
-; CHECK-SD-NEXT:zip1 v0.8b, v0

[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)

2025-04-05 Thread Kai Nacke via llvm-branch-commits


@@ -0,0 +1,73 @@
+; RUN: llc <%s --mtriple s390x-ibm-zos --filetype=obj -o - | \
+; RUN:   od -Ax -tx1 -v | FileCheck --ignore-case %s
+; REQUIRES: systemz-registered-target

redstar wrote:

Removed.

https://github.com/llvm/llvm-project/pull/133799


[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpDirectiveSpecification in standalone directives (PR #131163)

2025-04-05 Thread Mats Petersson via llvm-branch-commits

https://github.com/Leporacanthicus approved this pull request.

LGTM, thanks for the work!

https://github.com/llvm/llvm-project/pull/131163


[llvm-branch-commits] [clang] [clang-tools-extra] [clang] support pack expansions for trailing requires clauses (PR #133190)

2025-04-05 Thread Matheus Izvekov via llvm-branch-commits

https://github.com/mizvekov updated 
https://github.com/llvm/llvm-project/pull/133190

>From 65a4c47a81e9e294f5d3c8f1afbe1f9036ac8e4b Mon Sep 17 00:00:00 2001
From: Matheus Izvekov 
Date: Wed, 26 Mar 2025 18:38:34 -0300
Subject: [PATCH] [clang] support pack expansions for trailing requires clauses

This fixes a crash when evaluating constraints from trailing
requires clauses, when these are part of a generic lambda which
is expanded.
---
 .../refactor/tweaks/ExtractVariable.cpp   |  6 +--
 clang/docs/ReleaseNotes.rst   |  2 +
 clang/include/clang/AST/ASTNodeTraverser.h|  4 +-
 clang/include/clang/AST/Decl.h| 35 +++--
 clang/include/clang/AST/DeclCXX.h | 20 
 clang/include/clang/AST/ExprCXX.h |  2 +-
 clang/include/clang/AST/RecursiveASTVisitor.h |  9 ++--
 clang/include/clang/Sema/Sema.h   | 14 ++---
 clang/lib/AST/ASTContext.cpp  |  7 ++-
 clang/lib/AST/ASTImporter.cpp |  5 +-
 clang/lib/AST/Decl.cpp| 16 +++---
 clang/lib/AST/DeclCXX.cpp | 33 +++-
 clang/lib/AST/DeclPrinter.cpp | 10 ++--
 clang/lib/AST/DeclTemplate.cpp|  4 +-
 clang/lib/AST/ExprCXX.cpp |  2 +-
 clang/lib/AST/ItaniumMangle.cpp   |  4 +-
 clang/lib/ASTMatchers/ASTMatchFinder.cpp  |  3 +-
 clang/lib/Index/IndexDecl.cpp |  4 +-
 clang/lib/Sema/SemaConcept.cpp|  6 +--
 clang/lib/Sema/SemaDecl.cpp   | 22 
 clang/lib/Sema/SemaDeclCXX.cpp|  4 +-
 clang/lib/Sema/SemaFunctionEffects.cpp|  2 +-
 clang/lib/Sema/SemaLambda.cpp | 18 ---
 clang/lib/Sema/SemaOverload.cpp   | 12 +++--
 clang/lib/Sema/SemaTemplateDeductionGuide.cpp | 51 ---
 .../lib/Sema/SemaTemplateInstantiateDecl.cpp  |  4 +-
 clang/lib/Sema/TreeTransform.h|  7 ++-
 clang/lib/Serialization/ASTReaderDecl.cpp |  3 +-
 clang/lib/Serialization/ASTWriterDecl.cpp |  5 +-
 .../SemaCXX/fold_lambda_with_variadics.cpp|  9 
 clang/tools/libclang/CIndex.cpp   |  2 +-
 31 files changed, 191 insertions(+), 134 deletions(-)

diff --git a/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp 
b/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
index d84e501b87ce7..90dac3b76c648 100644
--- a/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
+++ b/clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
@@ -100,9 +100,9 @@ computeReferencedDecls(const clang::Expr *Expr) {
 TraverseLambdaCapture(LExpr, &Capture, Initializer);
   }
 
-  if (clang::Expr *const RequiresClause =
-  LExpr->getTrailingRequiresClause()) {
-TraverseStmt(RequiresClause);
+  if (const clang::Expr *RequiresClause =
+  LExpr->getTrailingRequiresClause().ConstraintExpr) {
+TraverseStmt(const_cast(RequiresClause));
   }
 
   for (auto *const TemplateParam : LExpr->getExplicitTemplateParameters())
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index c4e82678949ff..f1066139c8514 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -373,6 +373,8 @@ Bug Fixes to C++ Support
 - Improved fix for an issue with pack expansions of type constraints, where 
this
   now also works if the constraint has non-type or template template 
parameters.
   (#GH131798)
+- Fix crash when evaluating trailing requires clause of generic lambdas which 
are part of
+  a pack expansion.
 - Fixes matching of nested template template parameters. (#GH130362)
 - Correctly diagnoses template template paramters which have a pack parameter
   not in the last position.
diff --git a/clang/include/clang/AST/ASTNodeTraverser.h 
b/clang/include/clang/AST/ASTNodeTraverser.h
index f086d8134a64b..7bb435146f752 100644
--- a/clang/include/clang/AST/ASTNodeTraverser.h
+++ b/clang/include/clang/AST/ASTNodeTraverser.h
@@ -538,8 +538,8 @@ class ASTNodeTraverser
   for (const auto *Parameter : D->parameters())
 Visit(Parameter);
 
-if (const Expr *TRC = D->getTrailingRequiresClause())
-  Visit(TRC);
+if (const AssociatedConstraint &TRC = D->getTrailingRequiresClause())
+  Visit(TRC.ConstraintExpr);
 
 if (Traversal == TK_IgnoreUnlessSpelledInSource && D->isDefaulted())
   return;
diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 9e7e93d98c9d1..adf3634d205bc 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -81,13 +81,17 @@ enum class ImplicitParamKind;
 // Holds a constraint expression along with a pack expansion index, if
 // expanded.
 struct AssociatedConstraint {
-  const Expr *ConstraintExpr;
-  int ArgumentPackSubstitutionIndex;
+  const Expr *ConstraintExpr = nullptr;
+  int ArgumentPackSubstitutionIndex = -1;
+
+  constex

[llvm-branch-commits] [clang] [clang] Template Specialization Resugaring - Template Type Alias (PR #132442)

2025-04-05 Thread Matheus Izvekov via llvm-branch-commits

https://github.com/mizvekov updated 
https://github.com/llvm/llvm-project/pull/132442

>From 9d5d42820a4998e0e3eb74f7301aa34dca55b890 Mon Sep 17 00:00:00 2001
From: Matheus Izvekov 
Date: Mon, 30 May 2022 01:46:31 +0200
Subject: [PATCH] [clang] Template Specialization Resugaring - Template Type
 Alias

This implements an additional user of the resugaring transform:
the pattern of template type aliases.

For more details and discussion see:
https://discourse.llvm.org/t/rfc-improving-diagnostics-with-template-specialization-resugaring/64294

Differential Revision: https://reviews.llvm.org/D137199
---
 clang/include/clang/Sema/Sema.h   |  3 +-
 clang/lib/Sema/SemaCXXScopeSpec.cpp   |  3 +-
 clang/lib/Sema/SemaCoroutine.cpp  |  4 +-
 clang/lib/Sema/SemaDeclCXX.cpp|  6 ++-
 clang/lib/Sema/SemaTemplate.cpp   | 43 +++
 .../lib/Sema/SemaTemplateInstantiateDecl.cpp  |  3 +-
 clang/lib/Sema/TreeTransform.h|  3 +-
 clang/test/AST/ast-dump-template-decls.cpp|  4 +-
 clang/test/Sema/Resugar/resugar-types.cpp |  6 +--
 9 files changed, 44 insertions(+), 31 deletions(-)

diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 945ff5e2c2ca6..42a7bf75c3bfc 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -11509,7 +11509,8 @@ class Sema final : public SemaBase {
 
   void NoteAllFoundTemplates(TemplateName Name);
 
-  QualType CheckTemplateIdType(TemplateName Template,
+  QualType CheckTemplateIdType(const NestedNameSpecifier *NNS,
+   TemplateName Template,
SourceLocation TemplateLoc,
TemplateArgumentListInfo &TemplateArgs);
 
diff --git a/clang/lib/Sema/SemaCXXScopeSpec.cpp 
b/clang/lib/Sema/SemaCXXScopeSpec.cpp
index 1085639dcb355..1c7dff35bb8af 100644
--- a/clang/lib/Sema/SemaCXXScopeSpec.cpp
+++ b/clang/lib/Sema/SemaCXXScopeSpec.cpp
@@ -907,7 +907,8 @@ bool Sema::ActOnCXXNestedNameSpecifier(Scope *S,
 
   // We were able to resolve the template name to an actual template.
   // Build an appropriate nested-name-specifier.
-  QualType T = CheckTemplateIdType(Template, TemplateNameLoc, TemplateArgs);
+  QualType T = CheckTemplateIdType(SS.getScopeRep(), Template, TemplateNameLoc,
+   TemplateArgs);
   if (T.isNull())
 return true;
 
diff --git a/clang/lib/Sema/SemaCoroutine.cpp b/clang/lib/Sema/SemaCoroutine.cpp
index 75364a3b2c8b5..8dffbca7463dd 100644
--- a/clang/lib/Sema/SemaCoroutine.cpp
+++ b/clang/lib/Sema/SemaCoroutine.cpp
@@ -90,7 +90,7 @@ static QualType lookupPromiseType(Sema &S, const FunctionDecl 
*FD,
 
   // Build the template-id.
   QualType CoroTrait =
-  S.CheckTemplateIdType(TemplateName(CoroTraits), KwLoc, Args);
+  S.CheckTemplateIdType(nullptr, TemplateName(CoroTraits), KwLoc, Args);
   if (CoroTrait.isNull())
 return QualType();
   if (S.RequireCompleteType(KwLoc, CoroTrait,
@@ -169,7 +169,7 @@ static QualType lookupCoroutineHandleType(Sema &S, QualType 
PromiseType,
 
   // Build the template-id.
   QualType CoroHandleType =
-  S.CheckTemplateIdType(TemplateName(CoroHandle), Loc, Args);
+  S.CheckTemplateIdType(nullptr, TemplateName(CoroHandle), Loc, Args);
   if (CoroHandleType.isNull())
 return QualType();
   if (S.RequireCompleteType(Loc, CoroHandleType,
diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index 928bf47285490..8a9ad3271ec26 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -1140,7 +1140,8 @@ static bool lookupStdTypeTraitMember(Sema &S, 
LookupResult &TraitMemberLookup,
   }
 
   // Build the template-id.
-  QualType TraitTy = S.CheckTemplateIdType(TemplateName(TraitTD), Loc, Args);
+  QualType TraitTy =
+  S.CheckTemplateIdType(nullptr, TemplateName(TraitTD), Loc, Args);
   if (TraitTy.isNull())
 return true;
   if (!S.isCompleteType(Loc, TraitTy)) {
@@ -12163,7 +12164,8 @@ QualType Sema::BuildStdInitializerList(QualType 
Element, SourceLocation Loc) {

Context.getTrivialTypeSourceInfo(Element,
 Loc)));
 
-  QualType T = CheckTemplateIdType(TemplateName(StdInitializerList), Loc, 
Args);
+  QualType T =
+  CheckTemplateIdType(nullptr, TemplateName(StdInitializerList), Loc, 
Args);
   if (T.isNull())
 return QualType();
 
diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index 5652b4548895a..673551bd97f3e 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -3827,7 +3827,8 @@ void Sema::NoteAllFoundTemplates(TemplateName Name) {
   }
 }
 
-static QualType builtinCommonTypeImpl(Sema &S, TemplateName BaseTemplate,
+static QualType builtinCommonTypeImpl(Sema &S, const NestedNameSpecifier *NNS,
+ 

[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-04-05 Thread Ryotaro Kasuga via llvm-branch-commits


@@ -80,6 +80,21 @@ enum class RuleTy {
   ForVectorization,
 };
 
+/// Store the information about if corresponding direction vector was negated

kasuga-fj wrote:

> But I now guess that the complication here is the unique entries in the 
> dependency matrix, is that right?

Yes. (But holding two boolean values is a bit redundant. What is actually 
needed are three states. If both of them are false, it is an illegal state.)

> I am wondering if it isn't easier to keep all the entries and don't make them 
> unique?

I think it would be simpler. Also, there is no need to stop making entries 
unique altogether. If duplicate direction vectors are allowed, I think the 
simplest implementation would be to keep pairs of a direction vector and a 
boolean value indicating whether the corresponding vector is negated. However, 
I'm not sure how effective it is to make direction vectors unique. In the worst 
case, holding pairs of a vector and a boolean value instead of a single vector 
doubles the number of entries. Is this allowed?
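
To make the two options concrete, here is a rough sketch, not code from the patch. Every name in it (`DirectionVector`, `NegationState`, `TaggedVector`, `addTagged`, `summarize`) is hypothetical, and a direction vector is modelled as a plain string. It shows the three-state tag next to the (vector, negated) pair representation, with uniqueness enforced on the pair rather than on the vector alone.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// All names here are hypothetical; none are taken from the patch.
// A direction vector is modelled as a string such as "<=" or "=>".
using DirectionVector = std::string;

// Option A: one entry per unique vector plus a three-state tag.
// The "neither" combination of two booleans cannot be expressed at all.
enum class NegationState { OriginalOnly, NegatedOnly, Both };

// Option B: keep (vector, was-negated) pairs.
using TaggedVector = std::pair<DirectionVector, bool>;

// Add a pair to the matrix unless the identical pair is already present.
// Returns true if a new entry was appended.
bool addTagged(std::vector<TaggedVector> &Matrix, const DirectionVector &DV,
               bool Negated) {
  const TaggedVector Entry(DV, Negated);
  for (const TaggedVector &E : Matrix)
    if (E == Entry)
      return false;
  Matrix.push_back(Entry);
  return true;
}

// Fold all Option B entries for one vector into Option A's tag.
// Precondition: the matrix holds at least one entry for DV.
NegationState summarize(const std::vector<TaggedVector> &Matrix,
                        const DirectionVector &DV) {
  bool SeenOriginal = false, SeenNegated = false;
  for (const TaggedVector &E : Matrix) {
    if (E.first != DV)
      continue;
    if (E.second)
      SeenNegated = true;
    else
      SeenOriginal = true;
  }
  assert((SeenOriginal || SeenNegated) && "no entry for this vector");
  if (SeenOriginal && SeenNegated)
    return NegationState::Both;
  return SeenNegated ? NegationState::NegatedOnly : NegationState::OriginalOnly;
}
```

With pairs, a vector that occurs both negated and non-negated takes two entries, which is the worst-case doubling mentioned above.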

https://github.com/llvm/llvm-project/pull/133672
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)

2025-04-05 Thread Kai Nacke via llvm-branch-commits


@@ -0,0 +1,148 @@
+//===- MCGOFFSymbolMapper.h - Maps MC section/symbol to GOFF symbols 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Maps a section or a symbol to the GOFF symbols it is composed of, and their
+// attributes.
+//
+//===--===//
+
+#ifndef LLVM_MC_MCGOFFSYMBOLMAPPER_H
+#define LLVM_MC_MCGOFFSYMBOLMAPPER_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/GOFF.h"
+#include "llvm/Support/Alignment.h"
+#include 
+#include 
+
+namespace llvm {
+class MCAssembler;
+class MCContext;
+class MCSectionGOFF;
+
+// An "External Symbol Definition" in the GOFF file has a type, and depending 
on
+// the type a different subset of the fields is used.
+//
+// Unlike other formats, a 2 dimensional structure is used to define the
+// location of data. For example, the equivalent of the ELF .text section is
+// made up of a Section Definition (SD) and a class (Element Definition; ED).
+// The name of the SD symbol depends on the application, while the class has 
the
+// predefined name C_CODE64.

redstar wrote:

It's AMODE not ILP :-)

https://github.com/llvm/llvm-project/pull/133799


[llvm-branch-commits] [llvm] release/20.x: [ARM] Speedups for CombineBaseUpdate. (#129725) (PR #130035)

2025-04-05 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@DanielKristofKiss ping

https://github.com/llvm/llvm-project/pull/130035


[llvm-branch-commits] [clang] [RISCV] Integrate RISCV target in baremetal toolchain object and deprecate RISCVToolchain object (PR #121831)

2025-04-05 Thread Garvit Gupta via llvm-branch-commits

https://github.com/quic-garvgupt edited 
https://github.com/llvm/llvm-project/pull/121831


[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts and sext-inreg (PR #132385)

2025-04-05 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic ready_for_review 
https://github.com/llvm/llvm-project/pull/132385


[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for select (PR #132384)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Petar Avramovic (petar-avramovic)


Changes

For a uniform condition, S1 is any-extended to S32 and the high bits are
cleared using AND with 1. A divergent S1 condition uses VCC.
B32/B64 rules cover scalar, vector, and pointer types.
Divergent B64 is split into 32-bit halves.
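
As an illustration only, the two lowering ideas can be modelled on plain integers. This is not the GlobalISel code from the patch, and the helper names `select32`, `select64ViaHalves`, and `cleanUniformCond` are made up for the sketch. It shows a 64-bit select done as two 32-bit selects on the same condition, and a boolean recovered from an any-extended 32-bit value by masking with 1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; they only model the arithmetic, not MachineInstrs.

// A 32-bit select: the unit the split lowering is built from.
uint32_t select32(bool Cond, uint32_t A, uint32_t B) { return Cond ? A : B; }

// A 64-bit select expressed as unmerge, two 32-bit selects, and merge.
uint64_t select64ViaHalves(bool Cond, uint64_t A, uint64_t B) {
  // Unmerge each operand into low and high 32-bit halves.
  const uint32_t ALo = static_cast<uint32_t>(A);
  const uint32_t AHi = static_cast<uint32_t>(A >> 32);
  const uint32_t BLo = static_cast<uint32_t>(B);
  const uint32_t BHi = static_cast<uint32_t>(B >> 32);
  // Select each half with the same condition.
  const uint32_t ResLo = select32(Cond, ALo, BLo);
  const uint32_t ResHi = select32(Cond, AHi, BHi);
  // Merge the halves back into one 64-bit value.
  return (static_cast<uint64_t>(ResHi) << 32) | ResLo;
}

// After any-extension the upper 31 bits are unspecified, so only bit 0
// may be trusted; AND with 1 discards the rest.
bool cleanUniformCond(uint32_t AnyExtended) { return (AnyExtended & 1u) != 0; }
```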

---

Patch is 145.66 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/132384.diff


4 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp (+18-1) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp (+6-2) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir 
(+624-1277) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
index 7301cba9e8ed3..0f5f3545ac8eb 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
@@ -243,6 +243,22 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI,
 MI.eraseFromParent();
 break;
   }
+  case SplitTo32Sel: {
+Register Dst = MI.getOperand(0).getReg();
+LLT Ty = MRI.getType(Dst) == V4S16 ? V2S16 : S32;
+auto Op2 = B.buildUnmerge({VgprRB, Ty}, MI.getOperand(2).getReg());
+auto Op3 = B.buildUnmerge({VgprRB, Ty}, MI.getOperand(3).getReg());
+Register Cond = MI.getOperand(1).getReg();
+auto Flags = MI.getFlags();
+auto ResLo =
+B.buildSelect({VgprRB, Ty}, Cond, Op2.getReg(0), Op3.getReg(0), Flags);
+auto ResHi =
+B.buildSelect({VgprRB, Ty}, Cond, Op2.getReg(1), Op3.getReg(1), Flags);
+
+B.buildMergeLikeInstr(Dst, {ResLo, ResHi});
+MI.eraseFromParent();
+break;
+  }
   case Div_BFE: {
 Register Dst = MI.getOperand(0).getReg();
 assert(MRI.getType(Dst) == LLT::scalar(64));
@@ -453,7 +469,8 @@ LLT 
RegBankLegalizeHelper::getBTyFromID(RegBankLLTMappingApplyID ID, LLT Ty) {
   case UniInVgprB64:
 if (Ty == LLT::scalar(64) || Ty == LLT::fixed_vector(2, 32) ||
 Ty == LLT::fixed_vector(4, 16) || Ty == LLT::pointer(0, 64) ||
-Ty == LLT::pointer(1, 64) || Ty == LLT::pointer(4, 64))
+Ty == LLT::pointer(1, 64) || Ty == LLT::pointer(4, 64) ||
+Ty == LLT::pointer(999, 64))
   return Ty;
 return LLT();
   case SgprB96:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
index b4ef4ecc3fe28..96b0a7d634f7e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
@@ -198,7 +198,7 @@ UniformityLLTOpPredicateID LLTToBId(LLT Ty) {
 return B32;
   if (Ty == LLT::scalar(64) || Ty == LLT::fixed_vector(2, 32) ||
   Ty == LLT::fixed_vector(4, 16) || Ty == LLT::pointer(1, 64) ||
-  Ty == LLT::pointer(4, 64))
+  Ty == LLT::pointer(4, 64) || Ty == LLT::pointer(999, 64))
 return B64;
   if (Ty == LLT::fixed_vector(3, 32))
 return B96;
@@ -485,8 +485,12 @@ RegBankLegalizeRules::RegBankLegalizeRules(const 
GCNSubtarget &_ST,
   addRulesForGOpcs({G_BR}).Any({{_}, {{}, {None}}});
 
   addRulesForGOpcs({G_SELECT}, StandardB)
+  .Any({{DivS16}, {{Vgpr16}, {Vcc, Vgpr16, Vgpr16}}})
+  .Any({{UniS16}, {{Sgpr16}, {Sgpr32AExtBoolInReg, Sgpr16, Sgpr16}}})
   .Div(B32, {{VgprB32}, {Vcc, VgprB32, VgprB32}})
-  .Uni(B32, {{SgprB32}, {Sgpr32AExtBoolInReg, SgprB32, SgprB32}});
+  .Uni(B32, {{SgprB32}, {Sgpr32AExtBoolInReg, SgprB32, SgprB32}})
+  .Div(B64, {{VgprB64}, {Vcc, VgprB64, VgprB64}, SplitTo32Sel})
+  .Uni(B64, {{SgprB64}, {Sgpr32AExtBoolInReg, SgprB64, SgprB64}});
 
   addRulesForGOpcs({G_ANYEXT})
   .Any({{UniS16, S1}, {{None}, {None}}}) // should be combined away
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h
index cdf70d99d4a9e..058e58c1a94ce 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h
@@ -177,6 +177,7 @@ enum LoweringMethodID {
   Div_BFE,
   VgprToVccCopy,
   SplitTo32,
+  SplitTo32Sel,
   Ext32To64,
   UniCstExt,
   SplitLoad,
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir 
b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir
index 810724dab685d..762f7b9500367 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir
@@ -1,6 +1,5 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=regbankselect -global-isel %s 
-verify-machineinstrs -o - -regbankselect-fast | FileCheck -check-prefix=FAST %s
-# RUN: llc -mtriple=amdgcn -mcpu=fiji -run-pass=regbankselect -global-isel %s 
-verify-machineinstrs -o - -regban

[llvm-branch-commits] [clang] release/20.x: [modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214) (PR #134232)

2025-04-05 Thread Dmitry Polukhin via llvm-branch-commits

https://github.com/dmpolukhin created 
https://github.com/llvm/llvm-project/pull/134232

Fix for regression https://github.com/llvm/llvm-project/issues/130917: the 
changes in https://github.com/llvm/llvm-project/pull/111992 were too broad. 
This change narrows the scope of the previous fix. It adds 
`ExternalASTSource::wasThisDeclarationADefinition` to detect cases where a 
FunctionDecl lost its body due to declaration merges.
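
A minimal sketch of the hook pattern described above, using stand-in types rather than Clang's classes. `FunctionDeclStub`, `ExternalSourceStub`, `ReaderStub`, `noteBodyDroppedByMerge`, and `behavesLikeDefinition` are all hypothetical names. How Sema actually consults the hook is an assumption here; the sketch only shows a base class that answers "no" by default and a reader that remembers which declarations used to be definitions.

```cpp
#include <cassert>
#include <unordered_set>

// Stand-in for a function declaration; not clang::FunctionDecl.
struct FunctionDeclStub {
  bool HasBody = false;
};

// Stand-in for the external source interface: by default it knows nothing.
struct ExternalSourceStub {
  virtual ~ExternalSourceStub() = default;
  virtual bool wasThisDeclarationADefinition(const FunctionDeclStub *) {
    return false;
  }
};

// Stand-in for a reader that merges declarations and may drop a body.
struct ReaderStub : ExternalSourceStub {
  std::unordered_set<const FunctionDeclStub *> WasDefinition;

  // Record that FD was a definition in its own module, then drop its body.
  void noteBodyDroppedByMerge(FunctionDeclStub &FD) {
    FD.HasBody = false;
    WasDefinition.insert(&FD);
  }

  bool wasThisDeclarationADefinition(const FunctionDeclStub *FD) override {
    return WasDefinition.count(FD) != 0;
  }
};

// Assumed consumer logic: treat FD as a definition if it still has a body
// or if the external source remembers that it once had one.
bool behavesLikeDefinition(const FunctionDeclStub &FD,
                           ExternalSourceStub *Source) {
  return FD.HasBody ||
         (Source != nullptr && Source->wasThisDeclarationADefinition(&FD));
}
```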

From 73ed00f5ef37fc19495bee13d0366fe093c5ac10 Mon Sep 17 00:00:00 2001
From: Dmitry Polukhin <34227995+dmpoluk...@users.noreply.github.com>
Date: Thu, 3 Apr 2025 08:27:13 +0100
Subject: [PATCH 1/2] [modules] Handle friend function that was a definition
 but became only a declaration during AST deserialization (#132214)

Fix for regression #130917, changes in #111992 were too broad. This change 
reduces scope of previous fix. Added 
`ExternalASTSource::wasThisDeclarationADefinition` to detect cases when 
FunctionDecl lost body due to declaration merges.
---
 clang/include/clang/AST/ExternalASTSource.h   |  4 ++
 .../clang/Sema/MultiplexExternalSemaSource.h  |  2 +
 clang/include/clang/Serialization/ASTReader.h |  6 +++
 clang/lib/AST/ExternalASTSource.cpp   |  4 ++
 .../lib/Sema/MultiplexExternalSemaSource.cpp  |  8 
 .../lib/Sema/SemaTemplateInstantiateDecl.cpp  | 12 +++---
 clang/lib/Serialization/ASTReader.cpp |  4 ++
 clang/lib/Serialization/ASTReaderDecl.cpp |  3 ++
 .../friend-default-parameters-modules.cpp | 39 +++
 .../SemaCXX/friend-default-parameters.cpp | 21 ++
 10 files changed, 98 insertions(+), 5 deletions(-)
 create mode 100644 clang/test/SemaCXX/friend-default-parameters-modules.cpp
 create mode 100644 clang/test/SemaCXX/friend-default-parameters.cpp

diff --git a/clang/include/clang/AST/ExternalASTSource.h 
b/clang/include/clang/AST/ExternalASTSource.h
index 42aed56d42e07..f45e3af7602c1 100644
--- a/clang/include/clang/AST/ExternalASTSource.h
+++ b/clang/include/clang/AST/ExternalASTSource.h
@@ -191,6 +191,10 @@ class ExternalASTSource : public 
RefCountedBase {
 
   virtual ExtKind hasExternalDefinitions(const Decl *D);
 
+  /// True if this function declaration was a definition before in its own
+  /// module.
+  virtual bool wasThisDeclarationADefinition(const FunctionDecl *FD);
+
   /// Finds all declarations lexically contained within the given
   /// DeclContext, after applying an optional filter predicate.
   ///
diff --git a/clang/include/clang/Sema/MultiplexExternalSemaSource.h 
b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
index 921bebe3a44af..391c2177d75ec 100644
--- a/clang/include/clang/Sema/MultiplexExternalSemaSource.h
+++ b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
@@ -92,6 +92,8 @@ class MultiplexExternalSemaSource : public ExternalSemaSource 
{
 
   ExtKind hasExternalDefinitions(const Decl *D) override;
 
+  bool wasThisDeclarationADefinition(const FunctionDecl *FD) override;
+
   /// Find all declarations with the given name in the
   /// given context.
   bool FindExternalVisibleDeclsByName(const DeclContext *DC,
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index 47301419c76c6..23c98282f228f 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -1392,6 +1392,10 @@ class ASTReader
 
   llvm::DenseMap DefinitionSource;
 
+  /// Friend functions that were defined but might have had their bodies
+  /// removed.
+  llvm::DenseSet ThisDeclarationWasADefinitionSet;
+
   bool shouldDisableValidationForFile(const serialization::ModuleFile &M) 
const;
 
   /// Reads a statement from the specified cursor.
@@ -2375,6 +2379,8 @@ class ASTReader
 
   ExtKind hasExternalDefinitions(const Decl *D) override;
 
+  bool wasThisDeclarationADefinition(const FunctionDecl *FD) override;
+
   /// Retrieve a selector from the given module with its local ID
   /// number.
   Selector getLocalSelector(ModuleFile &M, unsigned LocalID);
diff --git a/clang/lib/AST/ExternalASTSource.cpp 
b/clang/lib/AST/ExternalASTSource.cpp
index e2451f294741d..3e865cb7679b5 100644
--- a/clang/lib/AST/ExternalASTSource.cpp
+++ b/clang/lib/AST/ExternalASTSource.cpp
@@ -38,6 +38,10 @@ ExternalASTSource::hasExternalDefinitions(const Decl *D) {
   return EK_ReplyHazy;
 }
 
+bool ExternalASTSource::wasThisDeclarationADefinition(const FunctionDecl *FD) {
+  return false;
+}
+
 void ExternalASTSource::FindFileRegionDecls(FileID File, unsigned Offset,
 unsigned Length,
 SmallVectorImpl &Decls) {}
diff --git a/clang/lib/Sema/MultiplexExternalSemaSource.cpp 
b/clang/lib/Sema/MultiplexExternalSemaSource.cpp
index 6d945300c386c..fbfb242598c24 100644
--- a/clang/lib/Sema/MultiplexExternalSemaSource.cpp
+++ b/clang/lib/Sema/MultiplexExternalSemaSource.cpp
@@ -107,6 +107,14 @@ MultiplexExternalSemaSource::hasExternalDefinitions(const 
Decl

[llvm-branch-commits] [clang] [clang][HeuristicResolver] Default argument heuristic for template parameters (PR #131074)

2025-04-05 Thread Nathan Ridge via llvm-branch-commits

https://github.com/HighCommander4 updated 
https://github.com/llvm/llvm-project/pull/131074

From 556926d2644160405958a5d01963714f97ab522e Mon Sep 17 00:00:00 2001
From: Nathan Ridge 
Date: Thu, 13 Mar 2025 01:23:03 -0400
Subject: [PATCH] [clang][HeuristicResolver] Default argument heuristic for
 template parameters

---
 clang/lib/Sema/HeuristicResolver.cpp  | 17 ++
 .../unittests/Sema/HeuristicResolverTest.cpp  | 34 +++
 2 files changed, 51 insertions(+)

diff --git a/clang/lib/Sema/HeuristicResolver.cpp 
b/clang/lib/Sema/HeuristicResolver.cpp
index d377379c627db..7c88a3097a044 100644
--- a/clang/lib/Sema/HeuristicResolver.cpp
+++ b/clang/lib/Sema/HeuristicResolver.cpp
@@ -11,7 +11,9 @@
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/DeclTemplate.h"
 #include "clang/AST/ExprCXX.h"
+#include "clang/AST/TemplateBase.h"
 #include "clang/AST/Type.h"
+#include "llvm/Support/Casting.h"
 
 namespace clang {
 
@@ -122,6 +124,7 @@ TemplateName getReferencedTemplateName(const Type *T) {
 // resolves it to a CXXRecordDecl in which we can try name lookup.
 TagDecl *HeuristicResolverImpl::resolveTypeToTagDecl(QualType QT) {
   const Type *T = QT.getTypePtrOrNull();
+
   if (!T)
 return nullptr;
 
@@ -245,6 +248,20 @@ QualType HeuristicResolverImpl::simplifyType(QualType 
Type, const Expr *E,
 }
   }
 }
+if (const auto *TTPT = dyn_cast_if_present<TemplateTypeParmType>(T.Type)) {
+  // We can't do much useful with a template parameter (e.g. we cannot look
+  // up member names inside it). However, if the template parameter has a
+  // default argument, as a heuristic we can replace T with the default
+  // argument type.
+  if (const auto *TTPD = TTPT->getDecl()) {
+if (TTPD->hasDefaultArgument()) {
+  const auto &DefaultArg = TTPD->getDefaultArgument().getArgument();
+  if (DefaultArg.getKind() == TemplateArgument::Type) {
+return {DefaultArg.getAsType()};
+  }
+}
+  }
+}
 return T;
   };
   // As an additional protection against infinite loops, bound the number of
diff --git a/clang/unittests/Sema/HeuristicResolverTest.cpp 
b/clang/unittests/Sema/HeuristicResolverTest.cpp
index c7cfe7917c532..f7eb4b23c2ab0 100644
--- a/clang/unittests/Sema/HeuristicResolverTest.cpp
+++ b/clang/unittests/Sema/HeuristicResolverTest.cpp
@@ -410,6 +410,40 @@ TEST(HeuristicResolver, MemberExpr_HangIssue126536) {
   cxxDependentScopeMemberExpr(hasMemberName("foo")).bind("input"));
 }
 
+TEST(HeuristicResolver, MemberExpr_DefaultTemplateArgument) {
+  std::string Code = R"cpp(
+struct Default {
+  void foo();
+};
+template <typename T = Default>
+void bar(T t) {
+  t.foo();
+}
+  )cpp";
+  // Test resolution of "foo" in "t.foo()".
+  expectResolution(
+  Code, &HeuristicResolver::resolveMemberExpr,
+  cxxDependentScopeMemberExpr(hasMemberName("foo")).bind("input"),
+  cxxMethodDecl(hasName("foo")).bind("output"));
+}
+
+TEST(HeuristicResolver, MemberExpr_DefaultTemplateArgument_Recursive) {
+  std::string Code = R"cpp(
+struct Default {
+  void foo();
+};
+template 
+void bar(T t) {
+  t.foo();
+}
+  )cpp";
+  // Test resolution of "foo" in "t.foo()".
+  expectResolution(
+  Code, &HeuristicResolver::resolveMemberExpr,
+  cxxDependentScopeMemberExpr(hasMemberName("foo")).bind("input"),
+  cxxMethodDecl(hasName("foo")).bind("output"));
+}
+
 TEST(HeuristicResolver, DeclRefExpr_StaticMethod) {
   std::string Code = R"cpp(
 template 



[llvm-branch-commits] [clang] release/20.x: [clang-format] Allow `Language: Cpp` for C files (#133033) (PR #133216)

2025-04-05 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/133216

From c1c4d7191d7078216b9c8793e46fff84a8c7a02d Mon Sep 17 00:00:00 2001
From: Owen Pan 
Date: Thu, 27 Mar 2025 01:00:02 -0700
Subject: [PATCH] [clang-format] Allow `Language: Cpp` for C files (#133033)

Fix #132832

(cherry picked from commit 05fb8408de23c3ccb6125b6886742177755bd757)
---
 clang/lib/Format/Format.cpp| 18 ++
 clang/unittests/Format/ConfigParseTest.cpp | 20 
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 0bb8545884442..768e655f65ce7 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -2114,10 +2114,14 @@ std::error_code 
parseConfiguration(llvm::MemoryBufferRef Config,
   FormatStyle::FormatStyleSet StyleSet;
   bool LanguageFound = false;
   for (const FormatStyle &Style : llvm::reverse(Styles)) {
-if (Style.Language != FormatStyle::LK_None)
+const auto Lang = Style.Language;
+if (Lang != FormatStyle::LK_None)
   StyleSet.Add(Style);
-if (Style.Language == Language)
+if (Lang == Language ||
+// For backward compatibility.
+(Lang == FormatStyle::LK_Cpp && Language == FormatStyle::LK_C)) {
   LanguageFound = true;
+}
   }
   if (!LanguageFound) {
 if (Styles.empty() || Styles[0].Language != FormatStyle::LK_None)
@@ -2157,8 +2161,14 @@ 
FormatStyle::FormatStyleSet::Get(FormatStyle::LanguageKind Language) const {
   if (!Styles)
 return std::nullopt;
   auto It = Styles->find(Language);
-  if (It == Styles->end())
-return std::nullopt;
+  if (It == Styles->end()) {
+if (Language != FormatStyle::LK_C)
+  return std::nullopt;
+// For backward compatibility.
+It = Styles->find(FormatStyle::LK_Cpp);
+if (It == Styles->end())
+  return std::nullopt;
+  }
   FormatStyle Style = It->second;
   Style.StyleSet = *this;
   return Style;
diff --git a/clang/unittests/Format/ConfigParseTest.cpp 
b/clang/unittests/Format/ConfigParseTest.cpp
index 10788449a1a1d..fcf07e660ddb6 100644
--- a/clang/unittests/Format/ConfigParseTest.cpp
+++ b/clang/unittests/Format/ConfigParseTest.cpp
@@ -1214,6 +1214,26 @@ TEST(ConfigParseTest, ParsesConfigurationWithLanguages) {
   IndentWidth, 56u);
 }
 
+TEST(ConfigParseTest, AllowCppForC) {
+  FormatStyle Style = {};
+  Style.Language = FormatStyle::LK_C;
+  EXPECT_EQ(parseConfiguration("Language: Cpp", &Style), ParseError::Success);
+
+  CHECK_PARSE("---\n"
+  "IndentWidth: 4\n"
+  "---\n"
+  "Language: Cpp\n"
+  "IndentWidth: 8\n",
+  IndentWidth, 8u);
+
+  EXPECT_EQ(parseConfiguration("---\n"
+   "Language: ObjC\n"
+   "---\n"
+   "Language: Cpp\n",
+   &Style),
+ParseError::Success);
+}
+
 TEST(ConfigParseTest, UsesLanguageForBasedOnStyle) {
   FormatStyle Style = {};
   Style.Language = FormatStyle::LK_JavaScript;
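
The backward-compatibility lookup in the `FormatStyleSet::Get` hunk above can be sketched in isolation as follows. This is a simplified model, not clang-format's code: `LanguageKind`, `Style`, and `findStyle` are hypothetical stand-ins, and a `std::map` is assumed in place of the real style set. The idea is that a request for C falls back to a `Cpp` entry when no `C` entry exists, while other languages get no fallback.

```cpp
#include <cassert>
#include <map>

// Hypothetical, reduced stand-ins for the real clang-format types.
enum class LanguageKind { None, C, Cpp, ObjC };

struct Style {
  unsigned IndentWidth;
};

// Return the style registered for Language, or nullptr if there is none.
// For backward compatibility, C falls back to the Cpp entry.
const Style *findStyle(const std::map<LanguageKind, Style> &Styles,
                       LanguageKind Language) {
  auto It = Styles.find(Language);
  if (It == Styles.end()) {
    if (Language != LanguageKind::C)
      return nullptr;
    It = Styles.find(LanguageKind::Cpp);
    if (It == Styles.end())
      return nullptr;
  }
  return &It->second;
}
```

An exact match still wins: if both `C` and `Cpp` entries are present, a request for C returns the `C` entry.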



[llvm-branch-commits] [clang] [llvm] release/20.x: [Hexagon] Set the default compilation target to V68 (#125239) (PR #128597)

2025-04-05 Thread Ikhlas Ajbar via llvm-branch-commits

iajbar wrote:

> > Given 20.1.1 was just released, is the plan still to get this one into 
> > 20.x? (Just asking to know whether we should make a corresponding change in 
> > Zig.)
> 
> @quic-akaryaki @iajbar - can we get this/these changes in?

Yes, please.

https://github.com/llvm/llvm-project/pull/128597


[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)

2025-04-05 Thread Kai Nacke via llvm-branch-commits

redstar wrote:

I implemented the suggestion from @uweigand. The GOFF attributes are now set 
directly on the `MCSectionGOFF`, and the `GOFFSymbolMapper` is gone. I still 
need to update a couple of tests, since the section names have changed.

https://github.com/llvm/llvm-project/pull/133799


[llvm-branch-commits] [clang] c1c4d71 - [clang-format] Allow `Language: Cpp` for C files (#133033)

2025-04-05 Thread Tom Stellard via llvm-branch-commits

Author: Owen Pan
Date: 2025-03-28T23:14:52-07:00
New Revision: c1c4d7191d7078216b9c8793e46fff84a8c7a02d

URL: 
https://github.com/llvm/llvm-project/commit/c1c4d7191d7078216b9c8793e46fff84a8c7a02d
DIFF: 
https://github.com/llvm/llvm-project/commit/c1c4d7191d7078216b9c8793e46fff84a8c7a02d.diff

LOG: [clang-format] Allow `Language: Cpp` for C files (#133033)

Fix #132832

(cherry picked from commit 05fb8408de23c3ccb6125b6886742177755bd757)

Added: 


Modified: 
clang/lib/Format/Format.cpp
clang/unittests/Format/ConfigParseTest.cpp

Removed: 




diff  --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 0bb8545884442..768e655f65ce7 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -2114,10 +2114,14 @@ std::error_code 
parseConfiguration(llvm::MemoryBufferRef Config,
   FormatStyle::FormatStyleSet StyleSet;
   bool LanguageFound = false;
   for (const FormatStyle &Style : llvm::reverse(Styles)) {
-if (Style.Language != FormatStyle::LK_None)
+const auto Lang = Style.Language;
+if (Lang != FormatStyle::LK_None)
   StyleSet.Add(Style);
-if (Style.Language == Language)
+if (Lang == Language ||
+// For backward compatibility.
+(Lang == FormatStyle::LK_Cpp && Language == FormatStyle::LK_C)) {
   LanguageFound = true;
+}
   }
   if (!LanguageFound) {
 if (Styles.empty() || Styles[0].Language != FormatStyle::LK_None)
@@ -2157,8 +2161,14 @@ 
FormatStyle::FormatStyleSet::Get(FormatStyle::LanguageKind Language) const {
   if (!Styles)
 return std::nullopt;
   auto It = Styles->find(Language);
-  if (It == Styles->end())
-return std::nullopt;
+  if (It == Styles->end()) {
+if (Language != FormatStyle::LK_C)
+  return std::nullopt;
+// For backward compatibility.
+It = Styles->find(FormatStyle::LK_Cpp);
+if (It == Styles->end())
+  return std::nullopt;
+  }
   FormatStyle Style = It->second;
   Style.StyleSet = *this;
   return Style;

diff  --git a/clang/unittests/Format/ConfigParseTest.cpp 
b/clang/unittests/Format/ConfigParseTest.cpp
index 10788449a1a1d..fcf07e660ddb6 100644
--- a/clang/unittests/Format/ConfigParseTest.cpp
+++ b/clang/unittests/Format/ConfigParseTest.cpp
@@ -1214,6 +1214,26 @@ TEST(ConfigParseTest, ParsesConfigurationWithLanguages) {
   IndentWidth, 56u);
 }
 
+TEST(ConfigParseTest, AllowCppForC) {
+  FormatStyle Style = {};
+  Style.Language = FormatStyle::LK_C;
+  EXPECT_EQ(parseConfiguration("Language: Cpp", &Style), ParseError::Success);
+
+  CHECK_PARSE("---\n"
+  "IndentWidth: 4\n"
+  "---\n"
+  "Language: Cpp\n"
+  "IndentWidth: 8\n",
+  IndentWidth, 8u);
+
+  EXPECT_EQ(parseConfiguration("---\n"
+   "Language: ObjC\n"
+   "---\n"
+   "Language: Cpp\n",
+   &Style),
+ParseError::Success);
+}
+
 TEST(ConfigParseTest, UsesLanguageForBasedOnStyle) {
   FormatStyle Style = {};
   Style.Language = FormatStyle::LK_JavaScript;





[llvm-branch-commits] [clang] release/20.x: [modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214) (PR #134232)

2025-04-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-modules

Author: Dmitry Polukhin (dmpolukhin)


Changes

Fix for regression https://github.com/llvm/llvm-project/issues/130917: the 
changes in https://github.com/llvm/llvm-project/pull/111992 were too broad. 
This change narrows the scope of the previous fix. It adds 
`ExternalASTSource::wasThisDeclarationADefinition` to detect cases where a 
FunctionDecl lost its body due to declaration merges.

---
Full diff: https://github.com/llvm/llvm-project/pull/134232.diff


11 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+1) 
- (modified) clang/include/clang/AST/ExternalASTSource.h (+4) 
- (modified) clang/include/clang/Sema/MultiplexExternalSemaSource.h (+2) 
- (modified) clang/include/clang/Serialization/ASTReader.h (+6) 
- (modified) clang/lib/AST/ExternalASTSource.cpp (+4) 
- (modified) clang/lib/Sema/MultiplexExternalSemaSource.cpp (+8) 
- (modified) clang/lib/Sema/SemaTemplateInstantiateDecl.cpp (+7-5) 
- (modified) clang/lib/Serialization/ASTReader.cpp (+4) 
- (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+3) 
- (added) clang/test/SemaCXX/friend-default-parameters-modules.cpp (+39) 
- (added) clang/test/SemaCXX/friend-default-parameters.cpp (+21) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index f4befc242f28b..e57fa9786e6f2 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1065,6 +1065,7 @@ Bug Fixes to C++ Support
 - Fixed an incorrect pointer access when checking access-control on concepts. 
(#GH131530)
 - Fixed various alias CTAD bugs involving variadic template arguments. 
(#GH123591), (#GH127539), (#GH129077),
   (#GH129620), and (#GH129998).
+- Fixed the false compilation error "redefinition of default argument" for 
friend functions with default parameters. (#GH130917)
 
 Bug Fixes to AST Handling
 ^
diff --git a/clang/include/clang/AST/ExternalASTSource.h 
b/clang/include/clang/AST/ExternalASTSource.h
index 42aed56d42e07..f45e3af7602c1 100644
--- a/clang/include/clang/AST/ExternalASTSource.h
+++ b/clang/include/clang/AST/ExternalASTSource.h
@@ -191,6 +191,10 @@ class ExternalASTSource : public 
RefCountedBase {
 
   virtual ExtKind hasExternalDefinitions(const Decl *D);
 
+  /// True if this function declaration was a definition before in its own
+  /// module.
+  virtual bool wasThisDeclarationADefinition(const FunctionDecl *FD);
+
   /// Finds all declarations lexically contained within the given
   /// DeclContext, after applying an optional filter predicate.
   ///
diff --git a/clang/include/clang/Sema/MultiplexExternalSemaSource.h 
b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
index 921bebe3a44af..391c2177d75ec 100644
--- a/clang/include/clang/Sema/MultiplexExternalSemaSource.h
+++ b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
@@ -92,6 +92,8 @@ class MultiplexExternalSemaSource : public ExternalSemaSource 
{
 
   ExtKind hasExternalDefinitions(const Decl *D) override;
 
+  bool wasThisDeclarationADefinition(const FunctionDecl *FD) override;
+
   /// Find all declarations with the given name in the
   /// given context.
   bool FindExternalVisibleDeclsByName(const DeclContext *DC,
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index 47301419c76c6..23c98282f228f 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -1392,6 +1392,10 @@ class ASTReader
 
   llvm::DenseMap DefinitionSource;
 
+  /// Friend functions that were defined but might have had their bodies
+  /// removed.
+  llvm::DenseSet ThisDeclarationWasADefinitionSet;
+
   bool shouldDisableValidationForFile(const serialization::ModuleFile &M) 
const;
 
   /// Reads a statement from the specified cursor.
@@ -2375,6 +2379,8 @@ class ASTReader
 
   ExtKind hasExternalDefinitions(const Decl *D) override;
 
+  bool wasThisDeclarationADefinition(const FunctionDecl *FD) override;
+
   /// Retrieve a selector from the given module with its local ID
   /// number.
   Selector getLocalSelector(ModuleFile &M, unsigned LocalID);
diff --git a/clang/lib/AST/ExternalASTSource.cpp 
b/clang/lib/AST/ExternalASTSource.cpp
index e2451f294741d..3e865cb7679b5 100644
--- a/clang/lib/AST/ExternalASTSource.cpp
+++ b/clang/lib/AST/ExternalASTSource.cpp
@@ -38,6 +38,10 @@ ExternalASTSource::hasExternalDefinitions(const Decl *D) {
   return EK_ReplyHazy;
 }
 
+bool ExternalASTSource::wasThisDeclarationADefinition(const FunctionDecl *FD) {
+  return false;
+}
+
 void ExternalASTSource::FindFileRegionDecls(FileID File, unsigned Offset,
 unsigned Length,
 SmallVectorImpl &Decls) {}
diff --git a/clang/lib/Sema/MultiplexExternalSemaSource.cpp 
b/clang/lib/Sema/MultiplexExternalSemaSource.cpp
index 6d945300c386c..fbfb242598c24 100644
--- a/clang/lib/Sema/MultiplexExternalSemaSource.cpp
+++ b/clang

[llvm-branch-commits] [BOLT][NFC] Pre-disasm metadata rewriters (PR #132113)

2025-04-05 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov ready_for_review 
https://github.com/llvm/llvm-project/pull/132113


[llvm-branch-commits] [llvm] [AMDGPU] Support image_bvh8_intersect_ray instruction and intrinsic. (PR #130041)

2025-04-05 Thread Mirko Brkušanin via llvm-branch-commits


@@ -1509,18 +1509,18 @@ multiclass MIMG_Gather 
 : MIMG_Gather;
 
-class MIMG_IntersectRay_Helper {
-  int num_addrs = !if(Is64, !if(IsA16, 9, 12), !if(IsA16, 8, 11));
+class MIMG_IntersectRay_Helper {
+  int num_addrs = !if(isBVH8, 11, !if(Is64, !if(IsA16, 9, 12), !if(IsA16, 8, 
11)));
   RegisterClass RegClass = MIMGAddrSize.RegClass;
   int VAddrDwords = !srl(RegClass.Size, 5);
 
   int GFX11PlusNSAAddrs = !if(IsA16, 4, 5);
   RegisterClass node_ptr_type = !if(Is64, VReg_64, VGPR_32);
   list GFX11PlusAddrTypes =
-!if(isDual, [VReg_64, VReg_64, VReg_96, VReg_96, VReg_64],
- !if(IsA16,
-  [node_ptr_type, VGPR_32, VReg_96, VReg_96],
-  [node_ptr_type, VGPR_32, VReg_96, VReg_96, VReg_96]));
+ !cond(!eq(isBVH8, 1) : [node_ptr_type, VReg_64, VReg_96, VReg_96, 
VGPR_32],
+   !eq(isDual, 1) : [node_ptr_type, VReg_64, VReg_96, VReg_96, 
VReg_64],
+   !eq(IsA16,  0) : [node_ptr_type, VGPR_32, VReg_96, VReg_96, 
VReg_96],
+   !eq(IsA16,  1) : [node_ptr_type, VGPR_32, VReg_96, VReg_96]);

mbrkusanin wrote:

```suggestion
 !cond(isBVH8 : [node_ptr_type, VReg_64, VReg_96, VReg_96, VGPR_32],
   isDual : [node_ptr_type, VReg_64, VReg_96, VReg_96, VReg_64],
   IsA16  : [node_ptr_type, VGPR_32, VReg_96, VReg_96],
   true   : [node_ptr_type, VGPR_32, VReg_96, VReg_96, VReg_96]);
```

`!eq(X, 1)` is redundant here, and the last two options can be swapped.
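
To check that the reordering keeps the same result, the two condition chains can be modelled as first-match-wins `if` chains in C++. This is an illustrative sketch, not TableGen. `patchOrder`, `suggestedOrder`, and `ordersAgree` are hypothetical names, register classes are plain strings, and the flags are assumed to take only the values 0 and 1.

```cpp
#include <cassert>
#include <string>
#include <vector>

using AddrTypes = std::vector<std::string>;

// Order used in the patch: the non-A16 case is tested before the A16 case.
AddrTypes patchOrder(bool IsBVH8, bool IsDual, bool IsA16,
                     const std::string &NodePtr) {
  if (IsBVH8)
    return {NodePtr, "VReg_64", "VReg_96", "VReg_96", "VGPR_32"};
  if (IsDual)
    return {NodePtr, "VReg_64", "VReg_96", "VReg_96", "VReg_64"};
  if (!IsA16)
    return {NodePtr, "VGPR_32", "VReg_96", "VReg_96", "VReg_96"};
  return {NodePtr, "VGPR_32", "VReg_96", "VReg_96"};
}

// Order from the suggestion: test IsA16 directly, then a catch-all default.
AddrTypes suggestedOrder(bool IsBVH8, bool IsDual, bool IsA16,
                         const std::string &NodePtr) {
  if (IsBVH8)
    return {NodePtr, "VReg_64", "VReg_96", "VReg_96", "VGPR_32"};
  if (IsDual)
    return {NodePtr, "VReg_64", "VReg_96", "VReg_96", "VReg_64"};
  if (IsA16)
    return {NodePtr, "VGPR_32", "VReg_96", "VReg_96"};
  return {NodePtr, "VGPR_32", "VReg_96", "VReg_96", "VReg_96"};
}

// Compare both orderings over all eight flag combinations.
bool ordersAgree(const std::string &NodePtr) {
  for (int Mask = 0; Mask < 8; ++Mask) {
    const bool IsBVH8 = (Mask & 1) != 0;
    const bool IsDual = (Mask & 2) != 0;
    const bool IsA16 = (Mask & 4) != 0;
    if (patchOrder(IsBVH8, IsDual, IsA16, NodePtr) !=
        suggestedOrder(IsBVH8, IsDual, IsA16, NodePtr))
      return false;
  }
  return true;
}
```

Because the last two cases are complementary tests of the same flag, swapping them and making the final one a default does not change which list is chosen.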

https://github.com/llvm/llvm-project/pull/130041


[llvm-branch-commits] [llvm] [GlobalISel] Combine redundant sext_inreg (PR #131624)

2025-04-05 Thread Pierre van Houtryve via llvm-branch-commits

https://github.com/Pierre-vh updated 
https://github.com/llvm/llvm-project/pull/131624

From 3f3c67934d0c9ea34c11cbd24becc24541baf567 Mon Sep 17 00:00:00 2001
From: pvanhout 
Date: Mon, 17 Mar 2025 13:54:59 +0100
Subject: [PATCH 1/2] [GlobalISel] Combine redundant sext_inreg

---
 .../llvm/CodeGen/GlobalISel/CombinerHelper.h  |   3 +
 .../include/llvm/Target/GlobalISel/Combine.td |   9 +-
 .../GlobalISel/CombinerHelperCasts.cpp|  27 +++
 .../combine-redundant-sext-inreg.mir  | 164 ++
 .../combine-sext-trunc-sextinreg.mir  |  87 ++
 .../CodeGen/AMDGPU/GlobalISel/llvm.abs.ll |   5 -
 6 files changed, 289 insertions(+), 6 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir
 create mode 100644 llvm/test/CodeGen/AMDGPU/GlobalISel/combine-sext-trunc-sextinreg.mir

diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
index 9b78342c8fc39..5778377d125a8 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
@@ -994,6 +994,9 @@ class CombinerHelper {
   // overflow sub
   bool matchSuboCarryOut(const MachineInstr &MI, BuildFnTy &MatchInfo) const;
 
+  // (sext_inreg (sext_inreg x, K0), K1)
+  void applyRedundantSextInReg(MachineInstr &Root, MachineInstr &Other) const;
+
 private:
   /// Checks for legality of an indexed variant of \p LdSt.
   bool isIndexedLoadStoreLegal(GLoadStore &LdSt) const;
diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td b/llvm/include/llvm/Target/GlobalISel/Combine.td
index 660b03080f92e..6a0ff683a4647 100644
--- a/llvm/include/llvm/Target/GlobalISel/Combine.td
+++ b/llvm/include/llvm/Target/GlobalISel/Combine.td
@@ -1849,6 +1849,12 @@ def anyext_of_anyext : ext_of_ext_opcodes;
 def anyext_of_zext : ext_of_ext_opcodes;
 def anyext_of_sext : ext_of_ext_opcodes;
 
+def sext_inreg_of_sext_inreg : GICombineRule<
+   (defs root:$dst),
+   (match (G_SEXT_INREG $x, $src, $a):$other,
+  (G_SEXT_INREG $dst, $x, $b):$root),
+   (apply [{ Helper.applyRedundantSextInReg(*${root}, *${other}); }])>;
+
 // Push cast through build vector.
 class buildvector_of_opcode : GICombineRule <
   (defs root:$root, build_fn_matchinfo:$matchinfo),
@@ -1896,7 +1902,8 @@ def cast_of_cast_combines: GICombineGroup<[
   sext_of_anyext,
   anyext_of_anyext,
   anyext_of_zext,
-  anyext_of_sext
+  anyext_of_sext,
+  sext_inreg_of_sext_inreg,
 ]>;
 
 def cast_combines: GICombineGroup<[
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp
index 576fd5fd81703..883a62c308232 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp
@@ -378,3 +378,30 @@ bool CombinerHelper::matchCastOfInteger(const MachineInstr &CastMI,
 return false;
   }
 }
+
+void CombinerHelper::applyRedundantSextInReg(MachineInstr &Root,
+ MachineInstr &Other) const {
+  assert(Root.getOpcode() == TargetOpcode::G_SEXT_INREG &&
+ Other.getOpcode() == TargetOpcode::G_SEXT_INREG);
+
+  unsigned RootWidth = Root.getOperand(2).getImm();
+  unsigned OtherWidth = Other.getOperand(2).getImm();
+
+  Register Dst = Root.getOperand(0).getReg();
+  Register OtherDst = Other.getOperand(0).getReg();
+  Register Src = Other.getOperand(1).getReg();
+
+  if (RootWidth >= OtherWidth) {
+// The root sext_inreg is entirely redundant because the other one
+// is narrower.
+Observer.changingAllUsesOfReg(MRI, Dst);
+MRI.replaceRegWith(Dst, OtherDst);
+Observer.finishedChangingAllUsesOfReg();
+  } else {
+// RootWidth < OtherWidth, rewrite this G_SEXT_INREG with the source of the
+// other G_SEXT_INREG.
+Builder.buildSExtInReg(Dst, Src, RootWidth);
+  }
+
+  Root.eraseFromParent();
+}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir
new file mode 100644
index 0..566ee8e6c338d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-redundant-sext-inreg.mir
@@ -0,0 +1,164 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -run-pass=amdgpu-regbank-combiner -verify-machineinstrs %s -o - | FileCheck %s
+
+---
+name: inreg8_inreg16
+tracksRegLiveness: true
+body: |
+  bb.0:
+liveins: $vgpr0
+; CHECK-LABEL: name: inreg8_inreg16
+; CHECK: liveins: $vgpr0
+; CHECK-NEXT: {{  $}}
+; CHECK-NEXT: %copy:_(s32) = COPY $vgpr0
+; CHECK-NEXT: %inreg:_(s32) = G_SEXT_INREG %copy, 8
+; CHECK-NEXT: $vgpr0 = COPY %inreg(s32)
+%copy:_(s32) = COPY $vgpr0
+%inreg:_(s32) = G_SEXT_INREG %copy, 8
+%inreg1:_(s32) = G_SEXT_INREG %inreg, 16
+$vgpr0 = COPY %inreg1
+...
+
+
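
The identity behind this combine can be checked outside of MIR. The following is a minimal sketch, assuming 32-bit values and modelling `G_SEXT_INREG` with a plain helper; the names `sextInReg`, `nested`, `combined`, and `combineHolds` are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Sign-extend the low `Width` bits of X to 32 bits (1 <= Width <= 32).
// This is a plain-integer model of G_SEXT_INREG on an s32 value.
int32_t sextInReg(int32_t X, unsigned Width) {
  assert(Width >= 1 && Width <= 32);
  if (Width == 32)
    return X;
  uint32_t Low = static_cast<uint32_t>(X) & ((1u << Width) - 1u);
  uint32_t SignBit = 1u << (Width - 1);
  // Flipping the sign bit and subtracting it back propagates it upwards.
  return static_cast<int32_t>((Low ^ SignBit) - SignBit);
}

// The pattern matched by the rule: (sext_inreg (sext_inreg x, Inner), Outer).
int32_t nested(int32_t X, unsigned InnerWidth, unsigned OuterWidth) {
  return sextInReg(sextInReg(X, InnerWidth), OuterWidth);
}

// What the combine leaves behind: one sext_inreg of the narrower width.
int32_t combined(int32_t X, unsigned InnerWidth, unsigned OuterWidth) {
  return sextInReg(X, std::min(InnerWidth, OuterWidth));
}

// Compares both forms over a handful of values and every width pair.
bool combineHolds() {
  const int32_t Values[] = {0,      1,      -1,         0x7F,      0x80,
                            0xFF,   0x7FFF, 0x8000,     0xFFFF,    0x12345678,
                            INT32_MIN, INT32_MAX};
  for (int32_t X : Values)
    for (unsigned Inner = 1; Inner <= 32; ++Inner)
      for (unsigned Outer = 1; Outer <= 32; ++Outer)
        if (nested(X, Inner, Outer) != combined(X, Inner, Outer))
          return false;
  return true;
}
```

When the outer width is at least the inner width, the outer extension changes nothing; otherwise only the outer width matters, so extending the original source directly gives the same value. Both branches of `applyRedundantSextInReg` correspond to one of these two cases.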

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor analysis of RET instructions (PR #131897)

2025-04-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/131897

From 136dc3d8728a3511bd524d416059c289f0118100 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 17 Mar 2025 19:28:25 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: refactor analysis of RET
 instructions

In preparation for implementing detection of more gadget kinds,
refactor checking for non-protected return instructions.
---
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |  23 ++-
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 138 ++
 2 files changed, 95 insertions(+), 66 deletions(-)

diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 2d8109f8ca43b..f102f1080e2e8 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -199,19 +199,34 @@ struct Report {
   virtual void generateReport(raw_ostream &OS,
   const BinaryContext &BC) const = 0;
 
+  virtual const ArrayRef getAffectedRegisters() const { return {}; }
+  virtual void
+  setOverwritingInstrs(const std::vector &Instrs) {}
+
   void printBasicInfo(raw_ostream &OS, const BinaryContext &BC,
   StringRef IssueKind) const;
 };
 
 struct GadgetReport : public Report {
   const GadgetKind &Kind;
+  SmallVector AffectedRegisters;
   std::vector OverwritingInstrs;
 
   GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-   std::vector OverwritingInstrs)
-  : Report(Location), Kind(Kind), OverwritingInstrs(OverwritingInstrs) {}
+   const BitVector &AffectedRegisters)
+  : Report(Location), Kind(Kind),
+AffectedRegisters(AffectedRegisters.set_bits()) {}
 
   void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
+
+  const ArrayRef getAffectedRegisters() const override {
+return AffectedRegisters;
+  }
+
+  void
+  setOverwritingInstrs(const std::vector &Instrs) override {
+OverwritingInstrs = Instrs;
+  }
 };
 
 /// Report with a free-form message attached.
@@ -224,7 +239,6 @@ struct GenericReport : public Report {
 };
 
 struct FunctionAnalysisResult {
-  SmallSet RegistersAffected;
   std::vector> Diagnostics;
 };
 
@@ -232,8 +246,7 @@ class Analysis : public BinaryFunctionPass {
   void runOnFunction(BinaryFunction &Function,
  MCPlusBuilder::AllocatorIdTy AllocatorId);
   FunctionAnalysisResult
-  computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
- MCPlusBuilder::AllocatorIdTy AllocatorId);
+  computeDfState(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId);
 
   std::map AnalysisResults;
   std::mutex AnalysisResultsMutex;
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index f71866cd07548..14236e85e9c7b 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -353,7 +353,7 @@ class PacRetAnalysis
 public:
   std::vector
   getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF,
- const BitVector &UsedDirtyRegs) const {
+ const ArrayRef UsedDirtyRegs) const {
 if (RegsToTrackInstsFor.empty())
   return {};
 auto MaybeState = getStateAt(Ret);
@@ -362,7 +362,7 @@ class PacRetAnalysis
 const State &S = *MaybeState;
 // Due to aliasing registers, multiple registers may have been tracked.
 std::set LastWritingInsts;
-for (MCPhysReg TrackedReg : UsedDirtyRegs.set_bits()) {
+for (MCPhysReg TrackedReg : UsedDirtyRegs) {
   for (const MCInst *Inst : lastWritingInsts(S, TrackedReg))
 LastWritingInsts.insert(Inst);
 }
@@ -376,57 +376,81 @@ class PacRetAnalysis
   }
 };
 
+static std::shared_ptr tryCheckReturn(const BinaryContext &BC,
+  const MCInstReference &Inst,
+  const State &S) {
+  static const GadgetKind RetKind("non-protected ret found");
+  if (!BC.MIB->isReturn(Inst))
+return nullptr;
+
+  ErrorOr MaybeRetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+  if (MaybeRetReg.getError()) {
+return std::make_shared(
+Inst, "Warning: pac-ret analysis could not analyze this return "
+  "instruction");
+  }
+  MCPhysReg RetReg = *MaybeRetReg;
+  LLVM_DEBUG({
+traceInst(BC, "Found RET inst", Inst);
+traceReg(BC, "RetReg", RetReg);
+traceReg(BC, "Authenticated reg", BC.MIB->getAuthenticatedReg(Inst));
+  });
+  if (BC.MIB->isAuthenticationOfReg(Inst, RetReg))
+return nullptr;
+  BitVector UsedDirtyRegs = S.NonAutClobRegs;
+  LLVM_DEBUG({ traceRegMask(BC, "NonAutClobRegs at Ret", UsedDirtyRegs); });
+  UsedDirtyRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true);
+  LLVM_DEBUG({ traceRegMask(BC, "Intersection with RetReg", UsedDirtyRegs); });
+  if (!UsedDirtyRegs.any())
+return nullptr;
+
+  return std::make_shared(RetKind,

[llvm-branch-commits] [clang] [llvm] [AMDGPU][Attributor] Rework update of `AAAMDWavesPerEU` (PR #123995)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits


@@ -1336,6 +1311,59 @@ static void addPreloadKernArgHint(Function &F, TargetMachine &TM) {
   }
 }
 
+static void checkWavesPerEU(Module &M, TargetMachine &TM) {
+  for (Function &F : M) {
+const GCNSubtarget &ST = TM.getSubtarget(F);
+
+auto FlatWgrpSizeAttr =
+AMDGPU::getIntegerPairAttribute(F, "amdgpu-flat-work-group-size");
+auto WavesPerEUAttr = AMDGPU::getIntegerPairAttribute(
+F, "amdgpu-waves-per-eu", /*OnlyFirstRequired=*/true);
+
+unsigned MinWavesPerEU = ST.getMinWavesPerEU();
+unsigned MaxWavesPerEU = ST.getMaxWavesPerEU();
+
+unsigned MinFlatWgrpSize = 1U;
+unsigned MaxFlatWgrpSize = 1024U;
+if (FlatWgrpSizeAttr.has_value()) {
+  MinFlatWgrpSize = FlatWgrpSizeAttr->first;
+  MaxFlatWgrpSize = *(FlatWgrpSizeAttr->second);
+}

arsenm wrote:

```suggestion
if (FlatWgrpSizeAttr)
  std::tie(MinFlatWgrpSize, MaxFlatWgrpSize) = *FlatWgrpSizeAttr;
```

https://github.com/llvm/llvm-project/pull/123995


[llvm-branch-commits] [CI] Exclude docs directories from triggering rebuilds (PR #133185)

2025-04-05 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 updated 
https://github.com/llvm/llvm-project/pull/133185




[llvm-branch-commits] [libcxx] release/20.x: [libcxx] [test] Fix restoring LLVM_DIR and Clang_DIR (#132838) (PR #133153)

2025-04-05 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/133153


[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)

2025-04-05 Thread Kai Nacke via llvm-branch-commits


@@ -169,6 +169,91 @@ enum SubsectionKind : uint8_t {
   SK_PPA1 = 2,
   SK_PPA2 = 4,
 };
+
+// The standard System/390 convention is to name the high-order (leftmost) bit
+// in a byte as bit zero. The Flags type helps to set bits in byte according
+// to this numeration order.
+class Flags {
+  uint8_t Val;
+
+  constexpr static uint8_t bits(uint8_t BitIndex, uint8_t Length, uint8_t Value,
+uint8_t OldValue) {
+uint8_t Pos = 8 - BitIndex - Length;
+uint8_t Mask = ((1 << Length) - 1) << Pos;
+Value = Value << Pos;
+return (OldValue & ~Mask) | Value;
+  }
+
+public:
+  constexpr Flags() : Val(0) {}
+  constexpr Flags(uint8_t BitIndex, uint8_t Length, uint8_t Value)
+  : Val(bits(BitIndex, Length, Value, 0)) {}
+
+  template 
+  constexpr void set(uint8_t BitIndex, uint8_t Length, T NewValue) {
+Val = bits(BitIndex, Length, static_cast(NewValue), Val);
+  }
+
+  template 
+  constexpr T get(uint8_t BitIndex, uint8_t Length) const {
+return static_cast((Val >> (8 - BitIndex - Length)) &
+  ((1 << Length) - 1));
+  }
+
+  constexpr operator uint8_t() const { return Val; }
+};
+
+// Structure for the flag field of a symbol. See
+// https://www.ibm.com/docs/en/zos/3.1.0?topic=formats-external-symbol-definition-record,
+// offset 41, for the definition.
+struct SymbolFlags {
+  Flags SymFlags;
+
+#define GOFF_SYMBOL_FLAG(NAME, TYPE, BITINDEX, LENGTH) \
+  void set##NAME(TYPE Val) { SymFlags.set(BITINDEX, LENGTH, Val); } \
+  TYPE get##NAME() const { return SymFlags.get(BITINDEX, LENGTH); }
+
+  GOFF_SYMBOL_FLAG(FillBytePresence, bool, 0, 1)
+  GOFF_SYMBOL_FLAG(Mangled, bool, 1, 1)
+  GOFF_SYMBOL_FLAG(Renameable, bool, 2, 1)
+  GOFF_SYMBOL_FLAG(RemovableClass, bool, 3, 1)
+  GOFF_SYMBOL_FLAG(ReservedQwords, ESDReserveQwords, 5, 3)
+
+#undef GOFF_SYMBOL_FLAG
+
+constexpr operator uint8_t() const { return static_cast(SymFlags); }

redstar wrote:

Changed.
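
For readers unfamiliar with the System/390 numbering used by `Flags`, the following standalone sketch shows where a field lands when bit zero is the leftmost bit of the byte. The free functions `setBits` and `getBits` are hypothetical stand-ins that mirror the arithmetic in the quoted hunk; like the original, the sketch assumes the value fits in `Length` bits.

```cpp
#include <cassert>
#include <cstdint>

// Place `Value` into the `Length`-bit field that starts at `BitIndex`,
// where bit 0 is the high-order (leftmost) bit of the byte.
constexpr uint8_t setBits(uint8_t OldValue, uint8_t BitIndex, uint8_t Length,
                          uint8_t Value) {
  // Distance of the field's lowest bit from the low-order end of the byte.
  const unsigned Pos = 8u - BitIndex - Length;
  const unsigned Mask = ((1u << Length) - 1u) << Pos;
  return static_cast<uint8_t>((OldValue & ~Mask) |
                              (static_cast<unsigned>(Value) << Pos));
}

// Read back the `Length`-bit field that starts at `BitIndex`.
constexpr uint8_t getBits(uint8_t Val, uint8_t BitIndex, uint8_t Length) {
  return static_cast<uint8_t>((Val >> (8u - BitIndex - Length)) &
                              ((1u << Length) - 1u));
}

// Bit 0 is the most significant bit, so setting it yields 0x80.
static_assert(setBits(0, 0, 1, 1) == 0x80, "bit 0 is the leftmost bit");
// A 3-bit field starting at bit 5 occupies the three low-order bits.
static_assert(setBits(0, 5, 3, 5) == 0x05, "bits 5..7 are the low bits");
```

With this convention, a field declared as `(5, 3)`, such as `ReservedQwords` in the hunk above, occupies the three low-order bits of the flag byte, while `(0, 1)` is the top bit.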

https://github.com/llvm/llvm-project/pull/133799


[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Pre-commit test for fixing tls-le symbol type (PR #132361)

2025-04-05 Thread Lu Weining via llvm-branch-commits

https://github.com/SixWeining approved this pull request.


https://github.com/llvm/llvm-project/pull/132361


[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for select (PR #132384)

2025-04-05 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic ready_for_review 
https://github.com/llvm/llvm-project/pull/132384


[llvm-branch-commits] [llvm] release/20.x: [X86] When expanding LCMPXCHG16B_SAVE_RBX, substitute RBX in base (#134109) (PR #134331)

2025-04-05 Thread Aaron Puchert via llvm-branch-commits

aaronpuchert wrote:

You might have to (formally) approve the changes.

https://github.com/llvm/llvm-project/pull/134331


[llvm-branch-commits] [llvm] llvm-reduce: Try to preserve instruction metadata as argument attributes (PR #133557)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/133557


[llvm-branch-commits] [llvm] [ctxprof] Support for "move" semantics for the contextual root (PR #134192)

2025-04-05 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin updated 
https://github.com/llvm/llvm-project/pull/134192

From f9b3bfa82d671dc4f67001762e28bb57ea154ebf Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Wed, 2 Apr 2025 18:39:14 -0700
Subject: [PATCH] [ctxprof] Support for "move" semantics for the contextual
 root

---
 .../Transforms/Utils/FunctionImportUtils.h| 25 
 llvm/lib/Transforms/IPO/FunctionImport.cpp| 18 
 .../Transforms/Utils/FunctionImportUtils.cpp  | 29 ++-
 .../ThinLTO/X86/ctxprof-separate-module.ll| 22 --
 4 files changed, 70 insertions(+), 24 deletions(-)

diff --git a/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h b/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h
index 6d83b615d5f13..28ba20bc18cf9 100644
--- a/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/FunctionImportUtils.h
@@ -97,29 +97,14 @@ class FunctionImportGlobalProcessing {
   /// linkage for a required promotion of a local to global scope.
   GlobalValue::LinkageTypes getLinkage(const GlobalValue *SGV, bool DoPromote);
 
+  /// The symbols with these names are moved to a different module and should be
+  /// promoted to external linkage where they are defined.
+  DenseSet SymbolsToMove;
+
 public:
   FunctionImportGlobalProcessing(Module &M, const ModuleSummaryIndex &Index,
  SetVector *GlobalsToImport,
- bool ClearDSOLocalOnDeclarations)
-  : M(M), ImportIndex(Index), GlobalsToImport(GlobalsToImport),
-ClearDSOLocalOnDeclarations(ClearDSOLocalOnDeclarations) {
-// If we have a ModuleSummaryIndex but no function to import,
-// then this is the primary module being compiled in a ThinLTO
-// backend compilation, and we need to see if it has functions that
-// may be exported to another backend compilation.
-if (!GlobalsToImport)
-  HasExportedFunctions = ImportIndex.hasExportedFunctions(M);
-
-#ifndef NDEBUG
-SmallVector Vec;
-// First collect those in the llvm.used set.
-collectUsedGlobalVariables(M, Vec, /*CompilerUsed=*/false);
-// Next collect those in the llvm.compiler.used set.
-collectUsedGlobalVariables(M, Vec, /*CompilerUsed=*/true);
-Used = {llvm::from_range, Vec};
-#endif
-  }
-
+ bool ClearDSOLocalOnDeclarations);
   void run();
 };
 
diff --git a/llvm/lib/Transforms/IPO/FunctionImport.cpp b/llvm/lib/Transforms/IPO/FunctionImport.cpp
index 3d9fb7b12b5d5..50100a63cf407 100644
--- a/llvm/lib/Transforms/IPO/FunctionImport.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionImport.cpp
@@ -182,6 +182,15 @@ static cl::opt CtxprofMoveRootsToOwnModule(
  "their own module."),
 cl::Hidden, cl::init(false));
 
+cl::list MoveSymbolGUID(
+"thinlto-move-symbols",
+cl::desc(
+"Move the symbols with the given name. This will delete these symbols "
+"wherever they are originally defined, and make sure their "
+"linkage is External where they are imported. It is meant to be "
+"used with the name of contextual profiling roots."),
+cl::Hidden);
+
 namespace llvm {
 extern cl::opt EnableMemProfContextDisambiguation;
 }
@@ -1858,6 +1867,15 @@ Expected FunctionImporter::importFunctions(
   LLVM_DEBUG(dbgs() << "Starting import for Module "
 << DestModule.getModuleIdentifier() << "\n");
   unsigned ImportedCount = 0, ImportedGVCount = 0;
+  // Before carrying out any imports, see if this module defines functions in
+  // MoveSymbolGUID. If it does, delete them here (but leave the declaration).
+  // The function will be imported elsewhere, as external linkage, and the
+  // destination doesn't yet have its definition.
+  DenseSet MoveSymbolGUIDSet;
+  MoveSymbolGUIDSet.insert_range(MoveSymbolGUID);
+  for (auto &F : DestModule)
+if (!F.isDeclaration() && MoveSymbolGUIDSet.contains(F.getGUID()))
+  F.deleteBody();
 
   IRMover Mover(DestModule);
 
diff --git a/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp b/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp
index ae1af943bc11c..81e461e28df17 100644
--- a/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp
+++ b/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp
@@ -24,6 +24,31 @@ static cl::opt UseSourceFilenameForPromotedLocals(
  "This requires that the source filename has a unique name / "
  "path to avoid name collisions."));
 
+extern cl::list MoveSymbolGUID;
+
+FunctionImportGlobalProcessing::FunctionImportGlobalProcessing(
+Module &M, const ModuleSummaryIndex &Index,
+SetVector *GlobalsToImport, bool ClearDSOLocalOnDeclarations)
+: M(M), ImportIndex(Index), GlobalsToImport(GlobalsToImport),
+  ClearDSOLocalOnDeclarations(ClearDSOLocalOnDeclarations) {
+  // If we have a ModuleSummaryIndex but no function to import,
+  // then this is the primary module being compiled in a ThinLTO
+  // backen

[llvm-branch-commits] [compiler-rt] [llvm] [ctxprof] Track unhandled call targets (PR #131417)

2025-04-05 Thread Mircea Trofin via llvm-branch-commits


@@ -265,7 +275,16 @@ Error llvm::createCtxProfFromYAML(StringRef Profile, raw_ostream &Out) {
   if (!TopList)
 return createStringError(
 "Unexpected error converting internal structure to ctx profile");
-  Writer.writeContextual(*TopList, DC.TotalRootEntryCount);
+
+  ctx_profile::ContextNode *FirstUnhandled = nullptr;
+  for (const auto &U : DC.Unhandled) {
+SerializableCtxRepresentation Unhandled;
+Unhandled.Guid = U.first;
+Unhandled.Counters.insert(Unhandled.Counters.begin(), U.second.begin(),

mtrofin wrote:

What do you mean? This copies the counter values from `U.second` to `Unhandled`.
`insert` with a first/last iterator pair will, as far as I know, allocate the
difference (last - first) up front (with caveats about the difference being
computable, but the source is also a vector). Also, why would I reverse?

There *is* a more efficient alternative - having a `createNode` that takes the 
guid and the counters separately - but not quite worth it for the yaml 
converter, it's a test utility.
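
A minimal sketch of the range `insert` being discussed, using `std::vector<uint64_t>` as a stand-in for the counter containers (the element type and the helper name `copyCounters` are assumptions for this example only):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Copies every element of Src into an initially empty vector using the
// iterator-range overload of insert. Element order is preserved.
std::vector<uint64_t> copyCounters(const std::vector<uint64_t> &Src) {
  std::vector<uint64_t> Dst;
  // With forward (here: random-access) iterators the library can compute
  // the distance between first and last before copying, so it has the
  // information needed to size the storage in one step.
  Dst.insert(Dst.begin(), Src.begin(), Src.end());
  return Dst;
}
```

Inserting at `begin()` of an empty vector is equivalent to appending, so nothing ends up reversed.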

https://github.com/llvm/llvm-project/pull/131417


[llvm-branch-commits] [lldb] release/20.x: [lldb] Use correct path for lldb-server executable (#131519) (PR #134072)

2025-04-05 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/134072

Backport 945c494e2c3c078e26ff521ef3e9455e0ff764ac

Requested by: @DavidSpickett

From c8c12d84c18a6ebd10151c2e354002a8b6642af3 Mon Sep 17 00:00:00 2001
From: Yuval Deutscher 
Date: Mon, 31 Mar 2025 18:20:40 +0300
Subject: [PATCH] [lldb] Use correct path for lldb-server executable (#131519)

Hey,

This solves an issue where running lldb-server-20 with a non-absolute
path (for example, when it's installed into `/usr/bin` and the user runs
it as `lldb-server-20 ...` and not `/usr/bin/lldb-server-20 ...`) fails
with `error: spawn_process failed: execve failed: No such file or
directory`. The underlying issue is that when run that way, it attempts
to execute a binary named `lldb-server-20` from its current directory.
This is also a mild security hazard because lldb-server is often being
run as root in the directory /tmp, meaning that an unprivileged user can
create the file /tmp/lldb-server-20 and lldb-server will execute it as
root. (although, well, it's a debugging server we're talking about, so
that may not be a real concern)

I haven't previously contributed to this project; if you want me to
change anything in the code please don't hesitate to let me know.

(cherry picked from commit 945c494e2c3c078e26ff521ef3e9455e0ff764ac)
---
 lldb/tools/lldb-server/lldb-platform.cpp | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lldb/tools/lldb-server/lldb-platform.cpp b/lldb/tools/lldb-server/lldb-platform.cpp
index 880b45b989b9c..51174a0f443c3 100644
--- a/lldb/tools/lldb-server/lldb-platform.cpp
+++ b/lldb/tools/lldb-server/lldb-platform.cpp
@@ -31,6 +31,7 @@
 #include "Plugins/Process/gdb-remote/ProcessGDBRemoteLog.h"
 #include "lldb/Host/ConnectionFileDescriptor.h"
 #include "lldb/Host/HostGetOpt.h"
+#include "lldb/Host/HostInfo.h"
 #include "lldb/Host/MainLoop.h"
 #include "lldb/Host/OptionParser.h"
 #include "lldb/Host/Socket.h"
@@ -256,8 +257,9 @@ static void client_handle(GDBRemoteCommunicationServerPlatform &platform,
   printf("Disconnected.\n");
 }
 
-static Status spawn_process(const char *progname, const Socket *conn_socket,
-uint16_t gdb_port, const lldb_private::Args &args,
+static Status spawn_process(const char *progname, const FileSpec &prog,
+const Socket *conn_socket, uint16_t gdb_port,
+const lldb_private::Args &args,
 const std::string &log_file,
 const StringRef log_channels, MainLoop &main_loop) {
   Status error;
@@ -267,9 +269,10 @@ static Status spawn_process(const char *progname, const Socket *conn_socket,
 
   ProcessLaunchInfo launch_info;
 
-  FileSpec self_spec(progname, FileSpec::Style::native);
-  launch_info.SetExecutableFile(self_spec, true);
+  launch_info.SetExecutableFile(prog, false);
+  launch_info.SetArg0(progname);
   Args &self_args = launch_info.GetArguments();
+  self_args.AppendArgument(progname);
   self_args.AppendArgument(llvm::StringRef("platform"));
   self_args.AppendArgument(llvm::StringRef("--child-platform-fd"));
   self_args.AppendArgument(llvm::to_string(shared_socket.GetSendableFD()));
@@ -551,9 +554,10 @@ int main_platform(int argc, char *argv[]) {
 log_channels, &main_loop,
 &platform_handles](std::unique_ptr sock_up) {
   printf("Connection established.\n");
-  Status error = spawn_process(progname, sock_up.get(),
-   gdbserver_port, inferior_arguments,
-   log_file, log_channels, main_loop);
+  Status error = spawn_process(
+  progname, HostInfo::GetProgramFileSpec(), sock_up.get(),
+  gdbserver_port, inferior_arguments, log_file, log_channels,
+  main_loop);
   if (error.Fail()) {
 Log *log = GetLog(LLDBLog::Platform);
 LLDB_LOGF(log, "spawn_process failed: %s", error.AsCString());



[llvm-branch-commits] [llvm] llvm-reduce: Fix losing fast math flags in operands-to-args (PR #133421)

2025-04-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/133421

From a59ef1fe4845b29caf23bf27a2ad1343bc94d188 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 28 Mar 2025 18:00:05 +0700
Subject: [PATCH] llvm-reduce: Fix losing fast math flags in operands-to-args

---
 .../operands-to-args-preserve-fmf.ll  | 20 +++
 .../deltas/ReduceOperandsToArgs.cpp   |  4 
 2 files changed, 24 insertions(+)
 create mode 100644 llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll

diff --git a/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
new file mode 100644
index 0..b4b19ca28dbb5
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/operands-to-args-preserve-fmf.ll
@@ -0,0 +1,20 @@
+; RUN: llvm-reduce %s -o %t --abort-on-invalid-reduction --delta-passes=operands-to-args --test FileCheck --test-arg %s --test-arg --check-prefix=INTERESTING --test-arg --input-file
+; RUN: FileCheck %s --input-file %t --check-prefix=REDUCED
+
+; INTERESTING-LABEL: define float @callee(
+; INTERESTING: fadd float
+define float @callee(float %a) {
+  %x = fadd float %a, 1.0
+  ret float %x
+}
+
+; INTERESTING-LABEL: define float @caller(
+; INTERESTING: load float
+
+; REDUCED-LABEL: define float @caller(ptr %ptr, float %val, float %callee.ret1) {
+; REDUCED: %callee.ret12 = call nnan nsz float @callee(float %val, float 0.00e+00)
+define float @caller(ptr %ptr) {
+  %val = load float, ptr %ptr
+  %callee.ret = call nnan nsz float @callee(float %val)
+  ret float %callee.ret
+}
diff --git a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
index 037ff15fae0f6..e7ad52eb65a5d 100644
--- a/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
+++ b/llvm/tools/llvm-reduce/deltas/ReduceOperandsToArgs.cpp
@@ -14,6 +14,7 @@
 #include "llvm/IR/InstIterator.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/Operator.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Cloning.h"
 
@@ -107,6 +108,9 @@ static void replaceFunctionCalls(Function *OldF, Function *NewF) {
 NewCI->setCallingConv(NewF->getCallingConv());
 NewCI->setAttributes(CI->getAttributes());
 
+if (auto *FPOp = dyn_cast(NewCI))
+  NewCI->setFastMathFlags(CI->getFastMathFlags());
+
 // Do the replacement for this use.
 if (!CI->use_empty())
   CI->replaceAllUsesWith(NewCI);



[llvm-branch-commits] [compiler-rt] [ctxprof][nfc] Move 2 implementation functions up in `CtxInstrProfiling.cpp` (PR #133146)

2025-04-05 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin updated 
https://github.com/llvm/llvm-project/pull/133146

From 5579f73a4ad3d8205608eecde962257077578685 Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Wed, 26 Mar 2025 10:10:43 -0700
Subject: [PATCH] [ctxprof][nfc] Move 2 implementation functions up in
 `CtxInstrProfiling.cpp`

---
 .../lib/ctx_profile/CtxInstrProfiling.cpp | 66 +--
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
index b0e63a8861d86..da291e0bbabdd 100644
--- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
+++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp
@@ -244,6 +244,39 @@ ContextNode *getFlatProfile(FunctionData &Data, GUID Guid,
   return Data.FlatCtx;
 }
 
+// This should be called once for a Root. Allocate the first arena, set up the
+// first context.
+void setupContext(ContextRoot *Root, GUID Guid, uint32_t NumCounters,
+  uint32_t NumCallsites) {
+  __sanitizer::GenericScopedLock<__sanitizer::SpinMutex> Lock(
+  &AllContextsMutex);
+  // Re-check - we got here without having had taken a lock.
+  if (Root->FirstMemBlock)
+return;
+  const auto Needed = ContextNode::getAllocSize(NumCounters, NumCallsites);
+  auto *M = Arena::allocateNewArena(getArenaAllocSize(Needed));
+  Root->FirstMemBlock = M;
+  Root->CurrentMem = M;
+  Root->FirstNode = allocContextNode(M->tryBumpAllocate(Needed), Guid,
+ NumCounters, NumCallsites);
+  AllContextRoots.PushBack(Root);
+}
+
+ContextRoot *FunctionData::getOrAllocateContextRoot() {
+  auto *Root = CtxRoot;
+  if (Root)
+return Root;
+  __sanitizer::GenericScopedLock<__sanitizer::StaticSpinMutex> L(&Mutex);
+  Root = CtxRoot;
+  if (!Root) {
+Root = new (__sanitizer::InternalAlloc(sizeof(ContextRoot))) ContextRoot();
+CtxRoot = Root;
+  }
+
+  assert(Root);
+  return Root;
+}
+
 ContextNode *getUnhandledContext(FunctionData &Data, GUID Guid,
  uint32_t NumCounters) {
 
@@ -333,39 +366,6 @@ ContextNode *__llvm_ctx_profile_get_context(FunctionData *Data, void *Callee,
   return Ret;
 }
 
-// This should be called once for a Root. Allocate the first arena, set up the
-// first context.
-void setupContext(ContextRoot *Root, GUID Guid, uint32_t NumCounters,
-  uint32_t NumCallsites) {
-  __sanitizer::GenericScopedLock<__sanitizer::SpinMutex> Lock(
-  &AllContextsMutex);
-  // Re-check - we got here without having had taken a lock.
-  if (Root->FirstMemBlock)
-return;
-  const auto Needed = ContextNode::getAllocSize(NumCounters, NumCallsites);
-  auto *M = Arena::allocateNewArena(getArenaAllocSize(Needed));
-  Root->FirstMemBlock = M;
-  Root->CurrentMem = M;
-  Root->FirstNode = allocContextNode(M->tryBumpAllocate(Needed), Guid,
- NumCounters, NumCallsites);
-  AllContextRoots.PushBack(Root);
-}
-
-ContextRoot *FunctionData::getOrAllocateContextRoot() {
-  auto *Root = CtxRoot;
-  if (Root)
-return Root;
-  __sanitizer::GenericScopedLock<__sanitizer::StaticSpinMutex> L(&Mutex);
-  Root = CtxRoot;
-  if (!Root) {
-Root = new (__sanitizer::InternalAlloc(sizeof(ContextRoot))) ContextRoot();
-CtxRoot = Root;
-  }
-
-  assert(Root);
-  return Root;
-}
-
 ContextNode *__llvm_ctx_profile_start_context(
 FunctionData *FData, GUID Guid, uint32_t Counters,
 uint32_t Callsites) SANITIZER_NO_THREAD_SAFETY_ANALYSIS {



[llvm-branch-commits] compiler-rt: Introduce runtime functions for emulated PAC. (PR #133530)

2025-04-05 Thread Kristof Beyls via llvm-branch-commits


@@ -0,0 +1,7343 @@
+/*
+ * xxHash - Extremely Fast Hash algorithm
+ * Header File
+ * Copyright (C) 2012-2023 Yann Collet
+ *
+ * BSD 2-Clause License (https://www.opensource.org/licenses/bsd-license.php)

kbeyls wrote:

This is a license different from Apache-2.0 WITH LLVM-exception.
Therefore, the process described at https://llvm.org/docs/DeveloperPolicy.html#copyright-license-and-patents should be followed to check whether this is acceptable in this specific case.

That being said, xxhash is already present under [llvm/lib/Support/xxhash.cpp](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/lib/Support/xxhash.cpp#L163), as you pointed out in the [RFC](https://discourse.llvm.org/t/rfc-emulated-pac/85557).

Making sure we don't have multiple copies of non-Apache-2.0 WITH LLVM-exception code would be preferable. I'll tag @beanz, as he had ideas about how to better structure vendored third party code in LLVM.

I'm not sure if moving non-Apache-2.0 WITH LLVM-exception licensed code to a run-time library (for the first time?) triggers new concerns.

I'll note that other hashing algorithms, such as [Blake3](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/include/llvm/Support/BLAKE3.h#L1) and [SipHash](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/lib/Support/SipHash.cpp#L1), which are available under the Apache-2.0 WITH LLVM-exception, are also already present in the LLVM Support library.
Would one of these preferably-licensed hashing algorithms be a good fit for the hashing functionality needed for this use case?



https://github.com/llvm/llvm-project/pull/133530

